5.2. Comparing to sinusoids#

In this section, we’ll begin to build up the foundations of the Fourier transform and the frequency domain. As stated in the introduction to this chapter, the entire idea of the Fourier transform is to represent arbitrary signals \(x[n]\) as combinations of sinusoids:

\[\begin{align*} x[n] &= A_0 \cdot \cos\left(2\pi \cdot f_0 \cdot \frac{n}{f_s} + \phi_0\right) \\ &+ A_1 \cdot \cos\left(2\pi \cdot f_1 \cdot \frac{n}{f_s} + \phi_1\right) \\ &+ \dots \end{align*}\]

Before we go any further with this idea, it is helpful to think carefully about what it would mean to represent an arbitrary discrete signal \(x[n]\) in this way.

Warning

The representation that we develop in this section is not the Fourier transform.

This section will set up the foundations that we’ll need to define the Fourier transform later in this chapter.

5.2.1. Band-limiting#

Since \(x[n]\) is a discrete signal taken at some sampling rate \(f_s\), we will assume that the conditions of the Nyquist-Shannon theorem hold and that aliasing is not going to be a problem for us. More specifically, this means that we are assuming that the underlying continuous signal \(x(t)\) which produced \(x[n]\) is band-limited to the frequency range \(\pm \frac{f_s}{2}\). This implies that when representing \(x[n]\), we will not need to use frequencies outside of this range.

That said, there are still infinitely many frequencies in the range \([-f_s/2, f_s/2]\) (e.g., 1 Hz, 1/2 Hz, 1/4 Hz, …), and we can't compare to all of them and still have a finite representation of the signal. We'll need to be careful in choosing which frequencies we compare to.

5.2.2. Periodicity#

Recall from chapter 1 that periodic signals must repeat forever. This includes the sinusoids that will form the basis of any frequency-domain representation, and it carries through to (finite) combinations of sinusoids as well. Any signal that we try to represent as a combination of sinusoids, therefore, must also be assumed to be periodic.

This presents a bit of a problem for us, as up to now, we’ve been assuming that signals are silent for all negative time: \(x[n] = 0\) for \(n < 0\). A (non-empty) signal with an infinitely long stretch of silence cannot be periodic, so what can we do?

To make progress here, we’ll need to change our assumptions. Rather than assume that signals are silent for all time before recording, we’ll assume that a finite signal \(x[0], x[1], \dots, x[N-1]\) represents one cycle of a loop that repeats forever. Under this assumption, once we’ve observed these \(N\) samples, there is nothing new to learn from further observation, since the repetition assumption gives us for any sample index \(n\):

(5.2)#\[x[n] = x[n + N] = x[n + 2\cdot N] = \dots = x[n + k \cdot N] \quad\quad \text{for any } k \in \mathbb{Z}.\]

This also gives a new view to negative sample indices as offsets from the end of the signal:

(5.3)#\[x[-n] = x[-n+N] = x[N-n].\]
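If it helps to see this assumption in code, here is a minimal sketch (in Python, with made-up sample values) of how modular indexing implements equations (5.2) and (5.3):

```python
import numpy as np

# A hypothetical observed cycle of N samples; the values are made up.
x = np.array([0.0, 1.0, 0.5, -1.0, -0.5, 0.25])
N = len(x)

def x_repeating(n):
    """Sample the implied infinite signal at any integer index n."""
    return x[n % N]  # Python's % maps any integer (even negative) into [0, N)

# Equation (5.2): shifting by any multiple of N changes nothing.
assert x_repeating(2) == x_repeating(2 + N) == x_repeating(2 - 3 * N)

# Equation (5.3): a negative index is an offset from the end of the signal.
assert x_repeating(-1) == x[N - 1]
```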

This assumption is demonstrated visually below, where a signal \(x[n]\) of duration \(1.5\) seconds (sampled at \(f_s=16\), so \(N=24\)) is shown along with its implied repetitions. These repetitions are implied to go on forever in each direction of time. Note that the repeating signal need not be continuous at the repetition boundaries: all that matters is that the observed signal repeats exactly.

illustration of the repeating signal assumption

Fig. 5.3 A signal \(x[n]\) of some finite number of samples \(N\) (highlighted region) is assumed to repeat forever. Vertical lines denote the boundaries of each repetition of \(x\); additional repetitions are depicted in the shaded region.#

5.2.3. Choosing reference frequencies#

As mentioned above, even though we’re assuming \(\blue{x[n]}\) represents a band-limited signal, there are still infinitely many frequencies within the band. Our periodicity assumption will help us narrow this down to a finite set that we can use to form our reference signals.

In the plot above, we see several identical repetitions of the observed signal \(\blue{x[n]}\). Imagine comparing the full repeating signal to a sinusoid of some frequency \(\red{f}\), and computing the similarity score independently for each individual repetition. Ideally, it should not matter which repetition of \(\blue{x}\) we look at: they should all produce the same similarity score. Unfortunately, this is not the case.

For example, if we take \(\red{f}=1~[\text{Hz}]\) to get a reference signal

\[\red{y[n] = \cos\left(2 \pi \cdot \frac{n}{f_s}\right)}\]

we would observe the following:

A repeating signal, a reference sinusoid, and the resulting similarity scores for each repetition

Fig. 5.4 Top: a repeating signal \(\blue{x}\), with one observed repetition highlighted. Middle: a reference sinusoid \(\red{y}\). Bottom: each repetition of \(\blue{x}\) produces a different similarity score when compared to the corresponding segment of the reference signal \(\red{y}\).#

Each repetition of the signal \(\blue{x[n]}\) produces a different similarity score when compared to the 1 Hz reference signal \(\red{y[n]}\), alternating between \(\purple{S=+3.72}\) and \(\purple{S=-3.72}\). As a result, it's unclear how we should define the similarity between \(\blue{x}\) and \(\red{y}\): if we use the score from a single repetition, which one do we choose? Or do we combine scores from multiple repetitions, and if so, how would that work when they go on forever? In this example, we only had two distinct score values to deal with, but it's possible to construct examples with many more.

This problem arises because the period of the reference signal (1 second) does not line up with the period of \(x\) (1.5 seconds), so the reference signal appears to start at a different point within each repetition. In the figure, we observe the reference signal starting either high (+1) or low (-1) in alternating repetitions of \(x\).
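We can verify this misalignment numerically. A quick sketch using the parameters from Fig. 5.3 (\(f_s=16\), \(N=24\)) shows that the 1 Hz reference flips sign from one repetition to the next, which is exactly why the scores alternate between \(+3.72\) and \(-3.72\):

```python
import numpy as np

# Setup from Fig. 5.3: fs = 16 Hz, N = 24 samples (1.5 seconds per repetition).
fs, N = 16, 24
n = np.arange(N)

y_this = np.cos(2 * np.pi * 1.0 * (n / fs))        # 1 Hz reference, this repetition
y_next = np.cos(2 * np.pi * 1.0 * ((n + N) / fs))  # same reference, one repetition later

print(np.allclose(y_this, y_next))   # False: the reference does not line up
print(np.allclose(y_this, -y_next))  # True: it starts half a cycle out of phase
```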

If we can ensure that the reference signal always starts at the same position in each repetition of \(x\), then the problem vanishes: each repetition will produce the same similarity score, so we only need to examine one of them. This observation leads us to the concept of an analysis frequency.

Definition 5.1 (Analysis frequency)

For a sampling rate \(f_s\) and a signal length \(\blue{N}\), an analysis frequency \(\red{f_m}\) is any frequency which exactly completes a whole (integer) number of cycles \(\red{m}\) in \(\blue{N}\) samples at \(f_s\).

Equivalently:

(5.4)#\[\begin{align*} \red{f_m} &= \frac{\red{m}}{\blue{N}} \cdot f_s \left[\frac{\text{cycles}}{\text{sec.}} \right] & \quad\quad m = 0, 1, 2, \dots \end{align*}\]

Since we will be sampling waves at these frequencies at rate \(f_s\) over the full signal duration \(\blue{N}\), it is sometimes more useful to think of analysis frequencies in units of [cycles / signal-duration], or in the notation above, the ratio \(\red{m}/\blue{N}\).
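As a concrete sketch, here's how you might enumerate the analysis frequencies in Python (re-using the assumed values \(f_s=16\) and \(N=24\) from the example above):

```python
import numpy as np

# Analysis frequencies f_m = (m / N) * fs.
fs, N = 16, 24
m = np.arange(N)
f_m = m / N * fs  # [Hz]

print(f_m[:4])  # [0.         0.66666667 1.33333333 2.        ]
```

For reference, NumPy's `np.fft.fftfreq(N, d=1/fs)` generates the same set of frequencies, with indices above \(N/2\) reported as their negative aliases (a point we'll return to shortly).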

Analysis frequencies have exactly the characteristic that we want: a shift by \(\blue{N}\) samples will leave the reference signal unchanged, so each repetition of \(\blue{x}\) is guaranteed to produce the same similarity score. For example, if we had chosen \(\red{m=1}\) (one cycle per signal duration) so that \(\red{f} = \red{1}/\blue{N} \cdot f_s\) (\(2/3\) Hz for this case), we'd observe the following result:

a repeating signal, a reference sinusoid at an analysis frequency, and the resulting similarity scores for each repetition

Fig. 5.5 If the reference sinusoid corresponds to an analysis frequency, each repetition of \(x\) will produce the same similarity score.#

Since we are now guaranteed that each repetition produces the same similarity score, we only need to look at one of them.

One is much less than infinity, so we’re making progress.

5.2.3.1. How many distinct analysis frequencies are there?#

Now that we’ve settled on which frequencies we’ll be comparing our signal to, the next natural question is how many analysis frequencies are there? Since each analysis frequency will give us a different similarity score, the number of frequencies determines the size of the representation of our signal. Our goal will be to find a minimal set of analysis frequencies which can accurately represent the signal.

As defined above, each analysis frequency is determined by a non-negative integer \(\red{m}\) which counts how many full cycles are completed over the duration of \(\blue{N}\) samples. There are, of course, infinitely many values of \(\red{m}\), but as we'll now see, only finitely many of them produce distinct reference signals.

For an index \(\red{m}\), we have analysis frequency \(\red{f_m} = \red{m}/\blue{N} \cdot f_s\), and the samples of the corresponding reference signal \(\red{y_m}\) (using cosine waves) will be determined by:

(5.5)#\[\red{y_m[n]} = \cos\left(2\pi \cdot \frac{\red{m}}{\blue{N}}\cdot f_s \cdot \frac{n}{f_s} \right) = \cos\left(2\pi \cdot \frac{\red{m}}{\blue{N}} \cdot n \right).\]
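Sampling these reference signals takes only a couple of lines; a minimal sketch:

```python
import numpy as np

def reference(m, N):
    """Sample the m-th cosine reference signal: y_m[n] = cos(2*pi*m*n/N)."""
    n = np.arange(N)
    return np.cos(2 * np.pi * m / N * n)

# Note that fs cancels out of equation (5.5): only the ratio m/N matters.
y_2 = reference(2, 30)  # two full cycles over a 30-sample signal
```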

The plots below illustrate the sampled reference signals for the first few analysis frequencies with \(f_s = 20\) and \(\blue{N} = 30\) (1.5 seconds):

visualization of four analysis frequencies

Fig. 5.6 The first four analysis frequencies for a signal of duration 1.5 seconds.#

If we continue this process a little further (\(\red{m}= \dots, N-2, N-1, N, N+1, \dots\)), what happens?

Let’s take \(\red{m}=N+1\) as an example. In this case, we have

\[\begin{align*} \red{f_{N+1}} &= \frac{\red{N+1}}{\blue{N}} \cdot f_s\\ &= \frac{\red{N}}{\blue{N}}\cdot f_s + \frac{\red{1}}{\blue{N}} \cdot f_s\\ &= f_s + \frac{\red{1}}{\blue{N}} \cdot f_s\\ &= f_s + \red{f_1}. \end{align*}\]

This is our old friend, aliasing. Analysis frequency \(\red{m}=\blue{N}+1\) gives us exactly the same samples as \(\red{m=1}\) because the two frequencies differ by exactly the sampling rate \(f_s\). Comparing our signal \(\blue{x}\) to the reference signal for \(\red{m}=\blue{N}+1\) would therefore give us exactly the same similarity score as comparing to the \(\red{m=1}\) signal. So let's not bother with it.
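We can confirm this numerically; a quick check (using \(N=24\) again):

```python
import numpy as np

# Reference signals for m = 1 and m = N + 1 are sample-for-sample identical:
# cos(2*pi*(N+1)*n/N) = cos(2*pi*n + 2*pi*n/N) = cos(2*pi*n/N) for integer n.
N = 24
n = np.arange(N)
y_1  = np.cos(2 * np.pi * 1 / N * n)
y_N1 = np.cos(2 * np.pi * (N + 1) / N * n)

print(np.allclose(y_1, y_N1))  # True
```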

More generally, for any \(\red{m} \geq \blue{N}\), the corresponding frequency will always have an alias given by \(\red{m} \mod \blue{N}\). This answers our question: for a signal of \(\blue{N}\) samples, there are only \(\blue{N}\) such frequencies (\(\red{m}=0,1,2, \dots, N-2, N-1\)) which produce distinct reference signals. How fortunate that this is exactly the same number as the length of our signal.

5.2.3.2. A closer look at analysis frequencies#

Now that we have our analysis frequencies chosen, it’s worth taking a closer look to make sure we really understand what’s going on. The plot above shows the first few (\(\red{m=0,1,2,3}\)) but what happens for larger values?

Recall that we’ve assumed the signal \(\blue{x[n]}\) to be band-limited to satisfy the Nyquist-Shannon theorem, so it cannot contain energy from frequencies above \(f_s/2\) (the Nyquist frequency). The Nyquist frequency appears in our analysis frequencies (when \(\blue{N}\) is even, anyway) as \(\red{m}=\blue{N}/2\):

\[\red{f_{N/2}} = \frac{N}{2N} \cdot f_s = \frac{f_s}{2},\]

but \(\red{m}\) itself can range all the way up to \(N-1 > N/2\).

Again, aliasing can help us make sense of this. When \(N-1\geq \red{m} > N/2\), the corresponding frequency will lie outside the band limits of the signal. However, it will have an alias within the band, which we can find by subtracting \(f_s\):

\[\begin{align*} \red{f_m} - f_s &= \frac{\red{m}}{\blue{N}} \cdot f_s - f_s\\ &= \frac{\red{m}}{\blue{N}} \cdot f_s - \frac{\blue{N}}{\blue{N}} \cdot f_s\\ &= \frac{\red{m-N}}{\blue{N}} \cdot f_s. \end{align*}\]

Since \(\red{m} < \blue{N}\), this will be a negative frequency, but that’s okay. The important thing is that it will be within the range \([-f_s/2, +f_s/2]\).
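Here is a small sketch of this bookkeeping, with assumed values \(f_s=20\) and \(N=30\) matching Fig. 5.6:

```python
# Map each analysis index m to its in-band (possibly negative) frequency.
fs, N = 20, 30
for m in [1, 14, 15, 16, 29]:  # a few illustrative indices
    f = m / N * fs
    f_in_band = f if m <= N // 2 else f - fs  # subtract fs to find the alias
    print(m, f, f_in_band)
# m = 16 maps to -9.33 Hz; m = 29 maps to -0.67 Hz.
```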

Tip

Analysis frequencies for \(\red{m} > \blue{N}/2\) correspond to negative frequencies.

So far in this section, we’ve been assuming that the reference signals are produced by cosine waves, for which negative frequencies are indistinguishable from positive frequencies. (Remember, \(\cos(\theta) = \cos(-\theta)\).) This means that the distinction isn’t so important right now. However, it will become important in the full context of the Fourier transform, which we will see in the following sections.

5.2.4. Bringing it all together#

To finish off this section, let’s combine everything we’ve seen so far, and finally compare a signal \(\blue{x[n]}\) to a full collection of reference signals. For each analysis frequency \(\red{f_m}\), we’ll compute a similarity score \(\purple{S_m}\), and the collection of scores \(\purple{[S_0, S_1, \dots, S_{N-1}]}\) will form the representation of our signal. We’ll use the same example signal as above, but we’ll now focus on a single repetition of \(\blue{x[n]}\) rather than the infinitely repeating sequence.
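Here's a sketch of the whole computation in Python, assuming (as in the figures) that the similarity score is the per-sample product summed over one repetition; the input signal here is made up, and shorter than the one used in the figures:

```python
import numpy as np

# Compare x to each analysis-frequency cosine and collect one score per m.
fs, N = 20, 30
n = np.arange(N)
x = np.cos(2 * np.pi * 2 / N * n) + 0.5 * np.cos(2 * np.pi * 5 / N * n)

# S_m = sum over n of x[n] * y_m[n]
S = np.array([np.sum(x * np.cos(2 * np.pi * m / N * n)) for m in range(N)])
```

For this test signal, the only nonzero scores land at \(m=2\) and \(m=5\), and at their negative-frequency partners \(m=N-2\) and \(m=N-5\), since cosine references cannot distinguish positive from negative frequencies.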

The plot below shows, in the left column, the signal \(\blue{x}\) overlaid with the reference signals \(\red{y_m}\) for \(\red{m = 0, 1, 2, \dots, 9}\). The middle column shows the per-sample product \(\blue{x[n]} \cdot \red{y_m[n]}\) for each reference signal \(\red{y_m}\), and the right column shows the resulting similarity scores.

a signal compared to ten reference sinusoids and the resulting similarity scores

Fig. 5.7 Left: a signal \(\blue{x}\) overlaid with the first 10 reference sinusoids \(\red{y_m}\). Middle: the per-sample product. Right: the resulting similarity scores for each reference sinusoid.#

Showing all \(N=150\) comparisons in the form above would take quite a bit of space, so we only show the first ten to illustrate the idea.

More often, we omit the middle column and go directly to the scores. For our signal, if we show all \(\blue{N}\) similarity scores in one plot, it looks like this:

visualization of similarity scores collected into a spectrum plot

Fig. 5.8 A bar-plot illustrating the similarity scores for the input signal compared to the full set of analysis frequencies. Indices \(m = 0, 1, \dots, N/2\) correspond to non-negative frequencies, and \(m > N/2\) (shaded region) correspond to negative frequencies.#

This kind of plot is called a spectrum. By analogy to colors, you can think of the length of each bar as measuring “how much” of each analysis frequency is present in the signal \(\blue{x[n]}\). You may have encountered something like this previously, for example, in a music player application, digital audio workstation, or on a stereo system.
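If you'd like to reproduce a plot in this style, here is a minimal sketch (continuing from the made-up signal in the earlier example):

```python
import numpy as np
import matplotlib.pyplot as plt

# Re-compute the scores for the made-up signal from the sketch above.
fs, N = 20, 30
n = np.arange(N)
x = np.cos(2 * np.pi * 2 / N * n) + 0.5 * np.cos(2 * np.pi * 5 / N * n)
S = np.array([np.sum(x * np.cos(2 * np.pi * m / N * n)) for m in range(N)])

plt.bar(np.arange(N), S)
plt.axvspan(N / 2, N - 1, color="gray", alpha=0.25)  # negative-frequency indices
plt.xlabel("Analysis index $m$")
plt.ylabel("Similarity score $S_m$")
plt.show()
```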

We’ll have much more to say about frequency spectra in the following sections, but for now we’ll leave it here.

5.2.5. Summary#

As mentioned at the top of this section, the process that we’ve just seen is not quite the Fourier transform, but we’re almost there.

The key things to remember from this section:

  1. We now assume that a signal \(\blue{x[n]}\) of \(\blue{N}\) samples represents one observation of an infinitely repeating loop.

  2. Analysis frequencies complete a whole number of cycles in \(\blue{N}\) samples.

  3. We can represent a signal \(\blue{x}\) in terms of its similarity to sinusoids at our chosen analysis frequencies.