9.3. Exercises

Exercise 9.1

Imagine you have a signal of length \(N=44100\) samples, sampled at \(f_s = 44100\) Hz. How many frames would you get if you take an STFT with frame length \(N_F=4096\) and hop length \(N_H=512\)?
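One way to check a hand-computed answer is to count frames by brute force. The sketch below is not taken from the text: `count_frames` is a hypothetical helper, and it assumes that frames start at multiples of the hop length, that no padding is applied, and that an incomplete trailing frame is discarded. The example call deliberately uses small numbers rather than the values from the exercise.

```python
def count_frames(n_samples, n_frame, n_hop):
    """Count complete frames by stepping through the signal.

    Hypothetical helper. Assumes frames start at 0, n_hop, 2 * n_hop, ...
    and that a frame is kept only if it fits entirely inside the signal.
    """
    count = 0
    start = 0
    while start + n_frame <= n_samples:
        count += 1
        start += n_hop
    return count


# Small example: frames start at samples 0, 2, 4, and 6
print(count_frames(10, 4, 2))  # prints 4
```

Comparing the output of a loop like this against a closed-form expression for a few parameter settings is a quick sanity check on the formula.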

Exercise 9.2

Sometimes, one does not want to discard any samples when performing an STFT. This can be done by padding the signal with trailing zeros. In the configuration from Exercise 9.1, what is the smallest number of samples that you would need to add to capture the entire signal?

Can you give a more general form for calculating the required padding, in terms of (unknown) parameters \(N, N_F, N_H\)?

Exercise 9.3

The SciPy package provides an STFT implementation scipy.signal.stft which uses a slightly different parametrization than the one presented in this chapter.

Using a (non-trivial) test signal \(x\) of your choice, can you find parameter settings of scipy.signal.stft that produce identical output to the wstft function for \(N_F=2048\) and \(N_H=512\)?
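If you do not have a test signal at hand, one possibility is a sinusoid with a small amount of added noise, so that energy is present in every frequency bin. This is only a sketch: the sampling rate, frequency, noise level, and random seed below are arbitrary choices for illustration, not values taken from the text.

```python
import numpy as np

fs = 8000          # sampling rate in Hz (arbitrary choice)
duration = 1.0     # signal duration in seconds (arbitrary choice)
f0 = 440.0         # sinusoid frequency in Hz (arbitrary choice)

# Sample times in seconds
t = np.arange(int(fs * duration)) / fs

# A fixed seed keeps the signal reproducible between runs
rng = np.random.default_rng(0)

# Sinusoid plus low-level Gaussian noise
x = np.sin(2 * np.pi * f0 * t) + 0.1 * rng.standard_normal(len(t))
```

With these settings the signal has 8000 samples, which is long enough to contain several frames of length 2048.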

Hint

It’s easiest to check the shape of the outputs first:

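import numpy as np
import scipy.signal

# [COPY in wstft definition from the text]

# x is your test signal; frame and hop lengths come from the exercise
n_frame = 2048
n_hop = 512

s1 = wstft(x, n_frame, n_hop, 'hann')
# scipy.signal.stft returns (frequencies, times, STFT array); keep the array
_, _, s2 = scipy.signal.stft(...)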

# Check shapes
assert s1.shape == s2.shape

Note: you may need to transpose s2 by setting s2 = s2.T so that the time and frequency dimensions are in the same order as ours.

After you get the shapes to line up, test for numerical equivalence using np.allclose:

assert np.allclose(s1, s2)
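If the np.allclose check fails, it can help to look at how the two arrays differ before changing any parameters. The following is a minimal sketch, assuming both arrays already have the same shape and that the second one is not all zeros; `describe_mismatch` is a hypothetical helper, not part of NumPy or SciPy.

```python
import numpy as np


def describe_mismatch(a, b):
    """Summarize the difference between two equal-shape arrays.

    Hypothetical helper. Returns the largest absolute difference and the
    ratio of overall magnitudes; a ratio far from 1 suggests that the two
    arrays differ by a scaling factor. Assumes b is not all zeros.
    """
    a = np.asarray(a)
    b = np.asarray(b)
    max_abs_diff = float(np.max(np.abs(a - b)))
    scale = float(np.linalg.norm(a) / np.linalg.norm(b))
    return max_abs_diff, scale


# Toy example: b is exactly twice a, so the magnitude ratio is 0.5
a = np.array([1 + 2j, 3 - 1j])
b = 2 * a
max_diff, scale = describe_mismatch(a, b)
print(max_diff, scale)
```

In the exercise, you would call describe_mismatch(s1, s2) once the shapes agree.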