6.4. Spectral leakage and windowing

In this section, we’ll take a more in-depth look at how the DFT responds to non-analysis frequencies, and how we might counteract these effects.

In an earlier section, we saw that waves at non-analysis frequencies can produce sharp discontinuities at the boundaries of the signal (Fig. 6.2). If we were to listen to a signal like this on a repeating loop, these discontinuities would become audible as clicks or pops, not unlike the sound of an impulse.

To illustrate the point, here we’ll construct a pure sinusoid at \(f = 111\), with \(f_s = 1000\), and sample for 1/4 second (\(N= f_s/4 = 250\)). This produces the following wave:


Fig. 6.7 Top: A wave \(x[n]\) at frequency \(f=111\), with sampling rate \(f_s=1000\) and duration of 1/4 second (\(N=f_s/4\)). \(f=111\) is not an integer multiple of \(f_s/N = 4\), so it is not an analysis frequency. Note the large gap between samples when the signal repeats at \(t=0.25\). Bottom: The DFT magnitude spectrum \(|X[m]|\), displayed on a logarithmic amplitude scale.

Looping this clip 8 times produces a signal of \(8 \cdot N = 2\cdot f_s = 2000\) samples (2 seconds), with audible glitches every 1/4 second.
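As a rough sketch of how such a looped signal could be built and inspected, the snippet below constructs the sinusoid, tiles it 8 times, and compares the jump at a loop boundary to the largest step between neighboring samples inside one clip. The use of a cosine (rather than a sine) and all variable names are assumptions made for illustration.

```python
import numpy as np

fs = 1000          # sampling rate (Hz)
f = 111            # sinusoid frequency (Hz); not a multiple of fs / N = 4
N = fs // 4        # 250 samples = 1/4 second

n = np.arange(N)
x = np.cos(2 * np.pi * f * n / fs)   # assumption: a cosine with zero phase

# Repeat the clip 8 times: 8 * 250 = 2000 samples = 2 seconds
x_loop = np.tile(x, 8)

# Absolute difference between each pair of consecutive samples
steps = np.abs(np.diff(x_loop))

inside = steps[:N - 1].max()   # largest step within the first clip
boundary = steps[N - 1]        # step from the last sample of the clip back to the first

print(f"largest step inside a clip: {inside:.3f}")
print(f"step across the loop boundary: {boundary:.3f}")
```

The step across the boundary is much larger than any step inside the clip, and it is this abrupt jump that we hear as a click.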

To understand this behavior in the frequency domain, it can be helpful to think about the DFT of a unit impulse or delay signal (see Section 5.6.1). Impulses—and more generally, sharp discontinuities—are not easy to express with smoothly varying sinusoids, and this is why it takes the entire set of analysis frequencies to represent a delay signal (all \(\darkblue{X[m]}\) are non-zero in Fig. 5.15).

A similar thing happens to non-analysis frequencies: to explain the discontinuity at the boundary of the signal, the DFT uses the entire frequency spectrum, again producing non-zero DFT magnitudes \(|\darkblue{X[m]}|\) across the frequency spectrum. This phenomenon is called spectral leakage: the energy associated with our non-analysis frequency \(f\) has “leaked” over the entire spectrum.
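To make leakage concrete, here is a small sketch that counts how many DFT components are non-negligible for a sinusoid at an analysis frequency (112 Hz) and at a nearby non-analysis frequency (111 Hz). The helper name `significant_bins` and the threshold of \(10^{-6}\) are arbitrary choices for this illustration.

```python
import numpy as np

fs = 1000
N = 250                      # analysis frequencies are multiples of fs / N = 4 Hz
n = np.arange(N)

def significant_bins(f, threshold=1e-6):
    """Count DFT bins whose magnitude exceeds `threshold` for a cosine at f Hz."""
    x = np.cos(2 * np.pi * f * n / fs)
    return int(np.sum(np.abs(np.fft.rfft(x)) > threshold))

bins_analysis = significant_bins(112)   # 112 = 28 * 4, an analysis frequency
bins_leaky = significant_bins(111)      # not a multiple of 4

print(bins_analysis, "of", N // 2 + 1, "bins for f = 112")
print(bins_leaky, "of", N // 2 + 1, "bins for f = 111")
```

For the analysis frequency, only a single component stands out above numerical round-off, while the non-analysis frequency produces non-negligible magnitude in every component.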

The bad news is that spectral leakage cannot be avoided in general. The energy in a signal associated with each frequency has to go somewhere in the DFT, and if the frequency does not correspond to one of our analysis frequencies, then it will spread out.

The good news is that we can, to some extent, control leakage to direct the leaked energy in various ways. This is accomplished by a technique called windowing.

6.4.1. Windowing

The idea of windowing a signal is to force continuity at the boundaries, prior to performing the DFT. The simplest way to achieve this is by multiplying the signal \(\blue{x[n]}\) by another signal \(\red{w[n]}\) of the same duration, such that \(\red{w[0] \approx w[N-1]} \approx 0\), resulting in the windowed DFT \(\darkblue{\hat{X}}\):

\[\darkblue{\hat{X}} \leftarrow \text{DFT}(\blue{x[n]} \cdot \red{w[n]})\]

For example, Fig. 6.9 illustrates this using what’s known as a Hann window.

illustration of spectral leakage with and without windowing

Fig. 6.9 A sinusoid \(\blue{x[n]}\) at a non-analysis frequency (upper-left, \(f=2.25\), \(N=128\), \(f_s=64\)) produces spectral energy at all frequencies in the DFT (original spectrum, bottom-right). Multiplying the sinusoid by a window function \(\red{w[n]}\) (bottom-left) tapers the signal values to 0 at the beginning and end, and reduces spectral leakage in the DFT magnitude spectrum \(\darkblue{|\hat{X}|}\) (windowed spectrum, bottom-right).

Taking the DFT after windowing the signal significantly reduces the magnitude of components \(\darkblue{\hat{X}[m]}\) when \(m\) is far from the true frequency of the input signal, but retains energy for \(m\) close to the true frequency.
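We can check this behavior numerically with a short sketch using the parameters from Fig. 6.9. Here the Hann window is written out by formula in its periodic form, which is an assumption about how the figure was produced. The cutoff at \(m = 20\) for "far" components is likewise an arbitrary choice.

```python
import numpy as np

fs, N, f = 64, 128, 2.25          # f corresponds to m = f * N / fs = 4.5
n = np.arange(N)
x = np.cos(2 * np.pi * f * n / fs)

# Periodic Hann window: starts at 0 and ends very close to 0
w = 0.5 - 0.5 * np.cos(2 * np.pi * n / N)

X = np.fft.rfft(x)          # original spectrum
X_hat = np.fft.rfft(x * w)  # windowed spectrum

# Where does the windowed spectrum peak?
peak_bin = int(np.argmax(np.abs(X_hat)))

# Largest magnitude among components far from the true frequency
far = slice(20, None)
far_plain = np.abs(X)[far].max()
far_windowed = np.abs(X_hat)[far].max()

print("peak bin of windowed spectrum:", peak_bin)
print(f"far leakage without window: {far_plain:.2e}")
print(f"far leakage with window:    {far_windowed:.2e}")
```

The windowed spectrum still peaks next to \(m = 4.5\), but its far components are smaller than those of the original spectrum by more than two orders of magnitude.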

If we loop the windowed signal, rather than the original signal, we’ll see that the boundary discontinuities have vanished.

a signal repeating with and without windowing

Fig. 6.10 Top: repeating a sinusoid \(\blue{x}\) at a non-analysis frequency produces discontinuities when the signal repeats (\(t=2, 4\)). Bottom: windowing the sinusoid prior to repetition eliminates discontinuities, but introduces low-frequency modulation.

As shown in Fig. 6.10, the elimination of discontinuities does change the signal: we have also introduced a low-frequency amplitude modulation to the looped signal.
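Both effects can be measured directly. The following sketch, which assumes the same parameters as Fig. 6.10 and a periodic Hann window written out by formula, compares the jump at the loop boundary with and without windowing, and then compares the amplitude of the windowed clip near its edge and near its middle. The helper `boundary_jump` is a hypothetical name introduced only for this example.

```python
import numpy as np

fs, N, f = 64, 128, 2.25
n = np.arange(N)
x = np.cos(2 * np.pi * f * n / fs)
w = 0.5 - 0.5 * np.cos(2 * np.pi * n / N)   # periodic Hann window (assumed)
xw = x * w

def boundary_jump(clip, repeats=3):
    """Absolute step from the last sample of one repetition to the first of the next."""
    looped = np.tile(clip, repeats)
    return float(np.abs(looped[len(clip)] - looped[len(clip) - 1]))

jump_plain = boundary_jump(x)
jump_windowed = boundary_jump(xw)

# Amplitude modulation: the clip is quiet at its edges and loud in the middle
edge_level = np.abs(xw[:8]).max()
middle_level = np.abs(xw[N // 2 - 8 : N // 2 + 8]).max()

print(f"boundary jump, original: {jump_plain:.4f}")
print(f"boundary jump, windowed: {jump_windowed:.4f}")
print(f"windowed level at edge / middle: {edge_level:.3f} / {middle_level:.3f}")
```

The boundary jump all but disappears after windowing, while the level of the windowed clip swells from near 0 at the edges to near 1 in the middle. Repeated on a loop, that swell is the low-frequency modulation.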

This effect is certainly audible. If we apply windowing to the previous example (\(f=111\) Hz sampled at \(f_s=1000\)), and listen to the repeating signal as before, each loop of the signal now has a pulse due to the window.

In practice, one would not typically use windowing in this fashion, but it is helpful to listen to the looped signal to get a better intuition for what the signal looks like to the DFT. In this example, we have traded off transient discontinuities for a smoothly varying pulse, which can be more easily modeled by sinusoids.

6.4.2. Choosing a window

In our first example, we used a Hann window, which is essentially one period of a cosine wave, flipped and raised so that it starts and ends at 0, but there are many, many other options. Most window functions are smooth curves that rise to a single peak in the middle, and often end up resembling a “bell curve”.

We won’t go into the details of how each of these window functions is defined, but Fig. 6.11 demonstrates a handful of commonly used windows.

visual comparison of several common window functions

Fig. 6.11 Different choices of window function \(\red{w}\) and the corresponding DFT magnitude spectrum \(|\hat{X}|\) after applying each window to a sinusoid at non-analysis frequency \(f=2.25\) Hz.

Fig. 6.11 demonstrates two key properties of windowing functions. First, different window functions will attenuate distant frequencies differently. The height of the spectrum in the last plot (Blackman-Harris) is around \(10^{-4}\), while the Hamming window is approximately 100x higher at \(10^{-2}\). From this, we might conclude that the Blackman-Harris window is “better” than the Hamming window, but we shouldn’t be too hasty.

The second property has to do with how much the energy spreads around its peak in the spectrum, the so-called “main lobe” width. From this perspective, the Blackman-Harris window has a broader main lobe (around the peak frequency 2.25) than the other windows, so it might not be the best choice if our goal is to distinguish between nearby frequencies.
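Both properties can be estimated with a short experiment. The sketch below is only illustrative. It builds three windows from their commonly published cosine-sum coefficients (these values are not given in this section), applies each to a sinusoid at \(f=2.25\) with \(N=128\) and \(f_s=64\), and reports the far leakage relative to the peak along with a crude main lobe width. The cutoff at \(m=20\), the 5% threshold, and the helper names are arbitrary choices.

```python
import numpy as np

fs, N, f = 64, 128, 2.25
n = np.arange(N)
x = np.cos(2 * np.pi * f * n / fs)

# Cosine-sum coefficients a_0, a_1, ... (commonly published values, assumed here)
COEFFS = {
    "hann": [0.5, 0.5],
    "hamming": [0.54, 0.46],
    "blackmanharris": [0.35875, 0.48829, 0.14128, 0.01168],
}

def cosine_sum_window(coeffs, N):
    """Periodic window w[n] = sum_k (-1)^k * a_k * cos(2*pi*k*n/N)."""
    n = np.arange(N)
    return sum((-1) ** k * a * np.cos(2 * np.pi * k * n / N)
               for k, a in enumerate(coeffs))

def describe(name):
    """Return (far leakage relative to peak, number of bins above 5% of peak)."""
    w = cosine_sum_window(COEFFS[name], N)
    mag = np.abs(np.fft.rfft(x * w))
    mag = mag / mag.max()                # normalize so the peak is 1
    far = float(mag[20:].max())          # leakage far from m = 4.5
    width = int(np.sum(mag > 0.05))      # crude main lobe width, in bins
    return far, width

results = {name: describe(name) for name in COEFFS}
for name, (far, width) in results.items():
    print(f"{name:>15}: far leakage {far:.1e}, bins above 5% of peak: {width}")
```

Under these assumptions, the Blackman-Harris window shows far lower leakage than the Hamming window, but also spreads its peak over more components, which is the trade-off described above.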

Tip

As a general rule, the Hann window is a good default choice for most audio applications.

6.4.2.1. What about analysis frequencies?

All of this was motivated by the problem of transients being induced by looping signals with non-analysis frequencies. In general, we won’t know if a signal contains non-analysis frequencies, so it’s natural to ask what would happen if we applied windowing to every signal. What happens to signals that actually do contain analysis frequencies?

Fig. 6.12 illustrates exactly the same comparisons as above, but now using an input signal \(x\) generated by a sinusoid at an analysis frequency.

comparison of different window functions when applied to a sinusoid at an analysis frequency

Fig. 6.12 Different choices of window function \(w\) and the corresponding DFT magnitude spectrum after applying each window to a sinusoid at analysis frequency \(f=3\) Hz.

The spectral plots in Fig. 6.12 illustrate another price that we must pay to counteract leakage: if we do have analysis frequencies in the signal, applying windowing will spread some of their energy across the spectrum. Although in all cases, the energy far from the signal’s frequency (\(f=3\)) is small (numerically close to 0), the energy in the windowed signals is dispersed around the peak, rather than being concentrated in a single component as in the un-windowed case.
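For the Hann window, this spreading is easy to verify by hand or by computer. The sketch below assumes \(N=128\) and \(f_s=64\) (so that \(f=3\) corresponds to \(m=6\)) and a periodic Hann window written out by formula. It lists the components that are not numerically zero, with and without windowing.

```python
import numpy as np

fs, N, f = 64, 128, 3.0      # f = 3 Hz is the analysis frequency m = f * N / fs = 6
n = np.arange(N)
x = np.cos(2 * np.pi * f * n / fs)
w = 0.5 - 0.5 * np.cos(2 * np.pi * n / N)   # periodic Hann window (assumed)

mag_plain = np.abs(np.fft.rfft(x))
mag_hann = np.abs(np.fft.rfft(x * w))

# Indices of components that are not numerically zero
nonzero_plain = np.flatnonzero(mag_plain > 1e-9)
nonzero_hann = np.flatnonzero(mag_hann > 1e-9)

print(nonzero_plain, mag_plain[nonzero_plain])   # only m = 6, magnitude N/2 = 64
print(nonzero_hann, mag_hann[nonzero_hann])      # m = 5, 6, 7 with magnitudes 16, 32, 16
```

Without the window, all of the energy sits at \(m=6\). With the window, it is shared among \(m=5, 6, 7\), because multiplying by the Hann window's cosine term shifts part of the sinusoid up and down by one analysis frequency.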

One way to view this trade-off is that windowing reduces the distinction between analysis and non-analysis frequencies: both end up leaking across the spectrum, but the choice of window function allows us to control this behavior. In reality, almost no naturally occurring signals will line up precisely to the parameters of your signal analysis, so it’s safer to assume that all energy is coming from non-analysis frequencies anyway.

6.4.3. Windowing in practice

Most signal processing frameworks provide a library of pre-defined window functions. In Python, these are provided by the function scipy.signal.get_window. To use a windowing function as we did in the example above, one first constructs the window of a given length, multiplies it with the signal, and then applies the DFT to the result:

import numpy as np
import scipy.signal

# We'll assume the input signal x already exists, and get its length
N = len(x)

# Build the window
w = scipy.signal.get_window('hann', N)

# Multiply by w and take the DFT
X = np.fft.rfft(x * w)
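To connect this back to the earlier example, here is a self-contained sketch that applies the same steps to a stand-in signal (a cosine at 111 Hz sampled at \(f_s=1000\), an assumption made only so that the code runs on its own) and uses np.fft.rfftfreq to convert the index of the largest component into a frequency in Hz.

```python
import numpy as np
import scipy.signal

fs = 1000
N = fs // 4
n = np.arange(N)
x = np.cos(2 * np.pi * 111 * n / fs)   # stand-in for a real input signal

# Build the window; get_window returns a periodic window by default (fftbins=True)
w = scipy.signal.get_window('hann', N)

# Multiply by w and take the DFT
X = np.fft.rfft(x * w)

# Frequency in Hz for each DFT component, then locate the largest one
freqs = np.fft.rfftfreq(N, d=1 / fs)
peak_hz = float(freqs[np.argmax(np.abs(X))])

print(f"largest component at {peak_hz:.1f} Hz")
```

The peak lands on 112 Hz, the analysis frequency nearest to 111 Hz. Windowing reduces leakage, but it does not change the spacing of the analysis frequencies.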

6.4.4. Summary

In this section, we’ve seen that windowing can help reduce the undesirable effects of spectral leakage. This doesn’t necessarily mean that windowing should always be used, however, as it does alter the content of the signal.

As we will see later on, the main application of windowing has to do with the short-time Fourier transform, where a long signal is carved into small pieces for analysis purposes.