2.3. The Nyquist-Shannon sampling theorem

In the previous section, we saw that aliasing occurs between frequencies related by an integer multiple of the sampling rate \(f_s\):

\[ f' = f + \red{k \cdot f_s}. \]

The bad news is that we can never avoid this: it’s a byproduct of representing continuous signals by discrete samples.

The good news is that if we’re careful, we can ensure that aliasing effects do not corrupt our signals (or analysis). This is what the Nyquist-Shannon theorem is all about: establishing the conditions under which sampling is okay.
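The aliasing relation above can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the sampling rate, base frequency, and the helper name `sample_tone` are arbitrary choices made for illustration and are not taken from the text.

```python
import numpy as np

fs = 100.0          # sampling rate in Hz (arbitrary choice for illustration)
f = 7.0             # base frequency in Hz (arbitrary choice)
n = np.arange(50)   # sample indices


def sample_tone(freq, fs, n):
    """Sample cos(2*pi*freq*t) at times t = n / fs (hypothetical helper)."""
    return np.cos(2 * np.pi * freq * n / fs)


x = sample_tone(f, fs, n)

# Every alias f + k*fs yields the same samples, up to floating-point error.
for k in (-2, -1, 1, 2):
    x_alias = sample_tone(f + k * fs, fs, n)
    assert np.allclose(x, x_alias)
```

Shifting the frequency by anything other than an integer multiple of `fs` (say, from 7 Hz to 8 Hz) would make the comparison fail, which is consistent with aliases being spaced exactly \(f_s\) apart.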

2.3.1. Sampling pure tones and combinations

The sampling theorem is most easily understood in terms of pure tones (sinusoids). While most any signal you encounter out in the world is unlikely to be a pure sinusoid, it turns out that under mild conditions, every continuous signal can be expressed as a combination of sinusoids. By combination, we specifically mean a weighted sum, possibly with different phase offsets for each frequency:

\[\begin{align*} x(t) =& A_1 \cdot \cos(2\pi \cdot f_1 \cdot t + \phi_1) \;+\\ &A_2\cdot \cos(2\pi \cdot f_2\cdot t + \phi_2) \;+\\ &A_3\cdot \cos(2\pi \cdot f_3\cdot t + \phi_3) + \cdots \end{align*}\]

When we sample a signal to produce \(x[n]\), we can equivalently think of sampling each sinusoid first and then summing the results:

\[\begin{align*} x[n] =& A_1\cdot \cos\left(2\pi \cdot f_1 \cdot \frac{n}{f_s} + \phi_1\right) \;+\\ & A_2\cdot \cos\left(2\pi \cdot f_2\cdot \frac{n}{f_s} + \phi_2\right) \;+\\ & A_3\cdot \cos\left(2\pi \cdot f_3\cdot \frac{n}{f_s} + \phi_3\right) + \cdots \end{align*}\]

Reasoning about sampling in this way will simplify things quite a bit. If we can understand what sampling does for pure sinusoids, then we can extend that knowledge to general signals. Note that we don’t need to know the specific values for \(A_i\) or \(\phi_i\); these quantities will generally be unknown. It suffices to know that they exist.
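As a quick sanity check of this equivalence, the sketch below samples a three-component signal in two ways: by evaluating the full sum at the sample times, and by sampling each sinusoid separately before adding. The amplitudes, frequencies, phases, and sampling rate are hypothetical values chosen only for illustration.

```python
import numpy as np

fs = 1000.0          # sampling rate in Hz (illustrative)
n = np.arange(200)   # sample indices
t = n / fs           # sample times in seconds

# Hypothetical amplitudes, frequencies (Hz), and phases (radians).
amps = [1.0, 0.5, 0.25]
freqs = [50.0, 120.0, 310.0]
phases = [0.0, np.pi / 4, -np.pi / 3]


def x_continuous(t):
    """Evaluate the weighted sum of cosines at arbitrary times t."""
    return sum(A * np.cos(2 * np.pi * f * t + p)
               for A, f, p in zip(amps, freqs, phases))


# Route 1: sample the full signal at t = n / fs.
x_sampled = x_continuous(t)

# Route 2: sample each sinusoid separately, then add the results.
components = [A * np.cos(2 * np.pi * f * n / fs + p)
              for A, f, p in zip(amps, freqs, phases)]
x_summed = np.sum(components, axis=0)

assert np.allclose(x_sampled, x_summed)
```

Here the values of `amps` and `phases` are known only because we made them up; the argument in the text needs only that such values exist.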

2.3.2. Band-limited sampling

A signal \(x(t)\) is band-limited if it can be expressed as a combination (weighted sum) of pure sinusoids whose frequencies lie between some minimum frequency \(f_-\) and some maximum frequency \(f_+ \geq f_-\). The size of this band of frequencies,

\[ f_+ - f_- \]

is known as the bandwidth of the signal.

Another way to think of band-limiting is that any sinusoid with frequency \(f < f_-\) or \(f > f_+\) has no weight in the combination that produces \(x(t)\).

The basic idea of the Nyquist-Shannon theorem is that if the sampling rate \(f_s\) is sufficiently large (compared to the bandwidth of the signal), then aliasing can’t hurt us: aliases must have zero amplitude.

2.3.3. The Nyquist-Shannon sampling theorem

We can now formally state the sampling theorem, commonly attributed to Harry Nyquist and Claude Shannon [Nyq28, Sha49].

We’ll actually state a simpler form of their theorem that’s sufficient for our needs.

Theorem 2.2 (Nyquist-Shannon)

If \(x(t)\) is band-limited to the range \(f_- \dots f_+\), then any sampling rate \(f_s \geq f_+ - f_-\) is sufficient to prevent aliasing.

Proof. Pick any frequency \(f\), which will have aliasing frequencies of the form \(f' = f + k \cdot f_s\) for integer values \(k\). Because the space between aliases is at least \(f_s\), and the bandwidth of the signal is at most \(f_s\), any aliasing frequency \(f'\) must reside outside the frequency range of \(x(t)\) as depicted in Fig. 2.4.

An illustration of a frequency within the band limits, and its aliasing frequencies lying outside.

Fig. 2.4 The shaded region indicates frequencies within the band limits of the signal. If the sampling rate \(f_s\) is sufficiently high, then the aliases of any frequency \(f\) inside the band limits must land outside them.

As a result, the discrete sampled signal \(x[n]\) will depend only on those frequencies within the band limits \(f_- < f < f_+\), which cannot be aliases of each other.
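The argument in the proof can be spot-checked with a few lines of code. The sketch below is an illustration rather than a proof: the function name `aliases_outside_band`, its `k_max` cutoff, and the example band of -20 to 20 Hz are all assumptions introduced here. It treats the band edges as excluded, mirroring the strict inequalities \(f_- < f < f_+\) in the text.

```python
import numpy as np


def aliases_outside_band(f, fs, f_lo, f_hi, k_max=5):
    """Return True if no alias f + k*fs (k != 0, |k| <= k_max)
    lies strictly inside the band (f_lo, f_hi)."""
    ks = np.array([k for k in range(-k_max, k_max + 1) if k != 0])
    alias_freqs = f + ks * fs
    inside = (alias_freqs > f_lo) & (alias_freqs < f_hi)
    return not inside.any()


# Hypothetical band from -20 Hz to 20 Hz, so the bandwidth is 40 Hz.
f_lo, f_hi = -20.0, 20.0

# With fs equal to the bandwidth, the aliases of 12 Hz (52, -28, ...) miss the band.
ok_at_40 = aliases_outside_band(12.0, 40.0, f_lo, f_hi)

# With fs = 30 Hz (below the bandwidth), 12 - 30 = -18 Hz lands inside the band.
ok_at_30 = aliases_outside_band(12.0, 30.0, f_lo, f_hi)
```

Only a finite range of \(k\) is examined, which is enough here because aliases with large \(|k|\) move ever farther from the band.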

2.3.4. Band-limiting in practice

The Nyquist-Shannon theorem tells us how to choose a sampling rate, provided we know the band limits of the signal(s) we’d like to sample. But how do we ensure that \(x(t)\) is actually band-limited?

In hardware analog-to-digital converters (ADCs), this is done by using an analog circuit to filter the continuous signal and remove any frequencies above \(f_+\) prior to sampling. If you’ve ever seen a tone knob on an electric guitar, the principle is much the same.

2.3.4.1. Setting the bandwidth

At first glance, the Nyquist-Shannon theorem might suggest setting \(f_- = 0\) and \(f_+\) to some reasonable maximum frequency, e.g., for audio, the upper limit of human hearing (about 20000 Hz). Unfortunately, this approach won’t work.

To see why, consider the two signals plotted below: one has a frequency of 5 Hz, and the other has a frequency of -5 Hz:

\[\begin{align*} x_1(t) &= \cos\left(2\pi \cdot 5 \cdot t\right)\\ x_2(t) &= \cos\left(2\pi \cdot (-5) \cdot t\right). \end{align*}\]

Two cosine waves of opposite frequencies have identical plots

Fig. 2.5 Two cosine waves, one with positive frequency and one with negative frequency.

The two signals are identical. For cosine waves, this is just the symmetry property from before:

\[ \cos(-\theta) = \cos(\theta), \]

which implies

\[\cos\left(2\pi \cdot (-f)\cdot t + \phi\right) = \cos\left(2\pi \cdot f \cdot t - \phi\right).\]

For sine waves, there is an anti-symmetry property, which we can combine with the phase inversion property (eq. (1.4)):

\[\sin(-\theta) = -\sin(\theta) = \sin(\theta - \pi),\]

which implies

\[\sin\left(2\pi \cdot (-f) \cdot t + \phi\right) = \sin\left(2\pi \cdot f \cdot t + \pi - \phi\right).\]

If we wanted to filter out negative frequencies (i.e., set \(f_- = 0\)), then we would necessarily filter out positive frequencies as well, because \(f\) and \(-f\) are indistinguishable from each other. Put another way, for any frequency \(f\) that we want to keep, we must also keep \(-f\), so our band limits must be symmetric around 0.
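Both identities can be confirmed numerically. This is a small sketch, assuming NumPy; the phase offset and the time grid are arbitrary choices, and the 5 Hz frequency matches the example signals earlier in this section.

```python
import numpy as np

t = np.linspace(0.0, 1.0, 1001)  # one second of time values
f = 5.0                          # frequency in Hz
phi = 0.7                        # arbitrary phase offset in radians

# Cosine: a negative frequency matches the positive frequency with phase -phi.
cos_neg = np.cos(2 * np.pi * (-f) * t + phi)
cos_pos = np.cos(2 * np.pi * f * t - phi)
assert np.allclose(cos_neg, cos_pos)

# Sine: a negative frequency matches the positive frequency with phase pi - phi.
sin_neg = np.sin(2 * np.pi * (-f) * t + phi)
sin_pos = np.sin(2 * np.pi * f * t + np.pi - phi)
assert np.allclose(sin_neg, sin_pos)
```

In both cases the negative-frequency sinusoid is reproduced exactly by a positive-frequency sinusoid with a different phase, so a filter cannot remove one without removing the other.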

2.3.4.2. Putting it all together

The symmetry argument above tells us that we must have \(f_- = -f_+\). This leads to the more common formula for the sampling rate in the Nyquist-Shannon theorem:

\[f_s \geq 2\cdot f_+, \tag{2.5}\]

because \(f_+ - f_- = f_+ - (-f_+) = 2\cdot f_+\).

Alternatively, for a fixed sampling rate \(f_s\), the highest frequency that can be measured without aliasing artifacts is \(f_s / 2\), also known as the Nyquist frequency (for sampling rate \(f_s\)).

For audio applications, we typically want \(f_+\) to be large enough to capture the audible range, which for humans generally spans \(30\) to \(20000\) Hz. This suggests a sampling rate \(f_s \geq 2\cdot f_+ \approx 40000\) Hz. Combining this with various technological constraints resulted in the standard rate \(f_s = 44100\) Hz for compact disc quality audio, which is still commonly used today.
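To make the Nyquist frequency concrete, the sketch below computes it for the compact disc rate and shows what happens to a tone above it. The helper names `nyquist_frequency` and `apparent_frequency` and the 25000 Hz test tone are assumptions made for this illustration.

```python
import numpy as np


def nyquist_frequency(fs):
    """Highest frequency measurable without aliasing at sampling rate fs."""
    return fs / 2.0


def apparent_frequency(f, fs):
    """Return the alias of f that lies in the range [-fs/2, fs/2)."""
    return (f + fs / 2.0) % fs - fs / 2.0


fs = 44100.0                          # compact disc sampling rate in Hz
f_nyq = nyquist_frequency(fs)         # 22050.0 Hz

# A 25000 Hz tone exceeds the Nyquist frequency, so it aliases into the band.
f_alias = apparent_frequency(25000.0, fs)   # -19100.0 Hz

# By the cosine symmetry, -19100 Hz and 19100 Hz produce the same samples.
n = np.arange(100)
x_high = np.cos(2 * np.pi * 25000.0 * n / fs)
x_folded = np.cos(2 * np.pi * 19100.0 * n / fs)
assert np.allclose(x_high, x_folded)
```

This is why the analog filter in an ADC has to remove content above \(f_s / 2\) before sampling: once sampled, the 25000 Hz tone cannot be told apart from a 19100 Hz tone.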