12.4. Stability, poles, and zeros#

Most of the filters we’ve encountered so far have been stable, in the sense that if an input signal \(\blue{x[n]}\) has bounded values, so too will the output \(\purple{y[n]}\).

As an example of an unstable system, consider

(12.9)#\[\purple{y[n]} = \blue{x[n]} + 2\cdot \purple{y[n-1]}.\]

Plugging \(\blue{x[n] = [1, 0, 0, 0, \dots]}\) into this definition produces the impulse response

\[\begin{align*} \purple{y[0]} &= \blue{x[0]} + 2\cdot \purple{0} &= 1\\ \purple{y[1]} &= \blue{x[1]} + 2\cdot \purple{y[0]} &= 2\\ \purple{y[2]} &= \blue{x[2]} + 2\cdot \purple{y[1]} &= 4\\ &\dots\\ \purple{y[n]} &= 2^n \end{align*}\]

which grows exponentially and without bound. This system is unstable.

While it is relatively straightforward to determine that (12.9) is unstable, it is not always so easy. This leads us to the question: how can we determine in general whether a given system is stable?

12.4.1. Transfer functions and polynomials#

Recall that for a linear IIR filter with feed-forward coefficients \(\red{b[k]}\) and feed-back coefficients \(\cyan{a[k]}\), the transfer function is given by the ratio of the z-transform of sequences \(\red{b[k]}\) and \(\cyan{a[k]}\):

\[\red{H(z)} = \frac{\red{B(z)}}{\cyan{A(z)}} = \frac{\displaystyle\sum_{k=0}^{K-1} \red{b[k]} \cdot z^{-k}}{\displaystyle\sum_{k=0}^{K-1} \cyan{a[k]} \cdot z^{-k}}.\]

More specifically, \(H(z)\) is the result of dividing two polynomials \(\red{B(z)}\) and \(\cyan{A(z)}\):

\[\begin{align*} \red{B(z)} &= \red{b[0]} \cdot z^0 + \red{b[1]} \cdot z^{-1} + \red{b[2]} \cdot z^{-2} + \dots + \red{b[K-1]} \cdot z^{-(K-1)}\\ \cyan{A(z)} &= \cyan{a[0]} \cdot z^0 + \cyan{a[1]} \cdot z^{-1} + \cyan{a[2]} \cdot z^{-2} + \dots + \cyan{a[K-1]} \cdot z^{-(K-1)}. \end{align*}\]

Strictly speaking, these are polynomials in \(z^{-1}\) (since the powers have negative exponents instead of positive exponents), but the distinction is unimportant for our purposes.

It turns out that we can learn quite a bit about a filter’s behavior by examining the properties of these polynomials, and specifically, finding their roots by solving \(\red{B(z) = 0}\) and \(\cyan{A(z) = 0}\).

12.4.1.1. Roots and algebra#

From the fundamental theorem of algebra, we have that any polynomial

\[Q(z) = \sum_{d=0}^{D} q[d] \cdot z^d\]

has \(D\) (complex) roots, that is, solutions to the equation \(Q(z) = 0\).

For example, the polynomial \(2x^2 - x - 1\) has roots \(x=1\) and \(x=-1/2\). These roots can be found by factoring the polynomial:

\[2x^2 - x - 1 = (2x + 1) \cdot (x - 1)\]

and then setting each factor to 0 independently (since either being 0 is sufficient to ensure the product is 0):

\[\begin{align*} 2x+1 = 0 &\Rightarrow x = -\frac{1}{2}\\ x-1 = 0 &\Rightarrow x = 1\\ \end{align*}\]

As a second example, consider

\[x^2 - 6x + 9 = 0.\]

This also is a degree-2 polynomial, so it should have two solutions. However, if we factor the polynomial, it turns out that the solutions are not unique:

\[x^2 - 6x + 9 = (x-3)\cdot(x-3).\]

In this case, we say that there is one solution \(x=3\) of multiplicity 2 (because it occurs twice).

The fundamental theorem of algebra does not require that all solutions be unique.

As a final example, take

\[x^3 + 1 = 0\]

or equivalently, \(x^3 = -1\). In this case, there is only one real-valued solution (\(x=-1\)), but there are two more complex-valued solutions: \(x = e^{\mathrm{j}\cdot \pi / 3}\) and \(x = e^{-\mathrm{j}\cdot \pi / 3}\).

The fundamental theorem of algebra allows for roots to be complex.

12.4.2. Zeros#

For the moment, let’s focus on the numerator of the transfer function, i.e., the polynomial \(\red{B(z)}\) derived from the feed-forward coefficients. By definition, if \(z_0\) is a root of \(\red{B(z)}\), then \(\red{B(z_0) = 0}\). As long as \(z_0\) is not also a root of \(\cyan{A(z)}\), we will have

\[\red{H(z_0)} = \frac{\red{B(z_0)}}{\cyan{A(z_0)}} = \frac{\red{0}}{\cyan{A(z_0)}} = 0.\]

The location in the complex plane of the zeros (roots of \(\red{B(z)}\)) tells us about which frequencies are attenuated by the filter. This is because polynomials are continuous: if \(\red{B(z_0) = 0}\), then other values of \(z\) near \(z_0\) will be small (\(\red{B(z) \approx 0}\)).

Before going further, it may be helpful to visualize what we’re talking about. To make things concrete, we’ll analyze a Type-2 Chebyshev filter constructed by the following code block:

fs = 8000  # Sampling rate
fc = 1000  # Cutoff frequency
fstop = 1500 # Stop-band 1KHz above cutoff
attenuation = 80  # we'll require 80 dB attenuation in the stop band
ripple = 6  # we'll allow 6 dB ripple in the passband

# Compute the order for this filter, which in this case is 9
order, wn = scipy.signal.cheb2ord(fc, fstop, ripple, attenuation, fs=fs)
b, a = scipy.signal.cheby2(order, attenuation, wn, fs=fs)

Once we have the filter coefficients b and a, we can obtain the zeros of the filter by the following code:

zeros, _, _ = scipy.signal.tf2zpk(b, a)

The function tf2zpk takes in a transfer function (represented by b and a) and returns the zeros (z), poles (p), and gain (k) of the filter. We’ll come back to poles and gain later on, so for now we’ll retain just the zeros.

If we print the zeros, we’ll see a sequence of 9 complex numbers:

array([ 0.38141995-0.92440187j,  0.26658886-0.96381034j,
       -0.02490664-0.99968978j, -0.57559933-0.81773187j,
       -0.57559933+0.81773187j, -0.02490664+0.99968978j,
        0.26658886+0.96381034j,  0.38141995+0.92440187j,
       -1.        +0.j        ])

We can then plot these values in the complex plane, and compare them to the frequency response of the filter (generated by freqz as in the previous section).

Zeros of an order-9 Chebyshev (II) filter

Fig. 12.5 Left: The zeros of an order-9 Type-2 Chebyshev filter with cutoff frequency \(\purple{f_c = 1000}\), sampling rate \(f_s=8000\), transition bandwidth of 500 Hz, passband ripple of 6dB, and stop-band attenuation of 80dB. The angle \(\purple{\theta = 2\pi / 8}\) corresponds to the cutoff frequency divided by the sampling rate \(\purple{f_c} / f_s = 1000 / 8000\). All zeros of this filter have unit magnitude, and come in conjugate pairs: 4 with positive imaginary component, 4 with negative, and one with zero (at Nyquist). Right: The frequency response curve of this filter \(|H(z)|\). Each zero in the left plot corresponds to a particular frequency, which coincide with the minima of the frequency response curve as indicated by the dashed lines.#

From Fig. 12.5, we can observe that the minima in the frequency response curve line up exactly with the zeros of the transfer function. More generally, frequencies near the zeros are also attenuated. You can think of the zeros as physical weights that drag down the frequency response.

This is only part of the story though: we’ll also need to look at the feedback coefficients to get the full picture.

12.4.3. Poles#

So far, we’ve found that the zeros of \(\red{B(z)}\) can tell us about the frequency response of a filter. But what about the zeros of \(\cyan{A(z)}\)?

The roots of \(\cyan{A(z)}\) are known as poles of the filter, and are denoted by \(p_0, p_1, \dots, p_{K-1} \in \mathbb{C}\). For any pole \(p\) (that is not also a root of \(\red{B(z)}\)), \(\red{H(p)} = \red{B(p)} / \cyan{A(p)}\) would divide by zero. Without getting too far into the technical details, it is possible to make sense of this division by looking at the behavior of \(\red{H(z)}\) in a neighborhood around a pole.

The intuition behind the name “pole” derives from the idea of the function \(|\red{H(z)}|\) (the magnitude of \(\red{H(z)}\)) can be thought of as a sheet draped over the complex plane, which is held up by (tent) poles. A simplified illustration of this is provided below in Fig. 12.6.

illustration of a function with an asymptote toward positive infinity at x=2

Fig. 12.6 The function \(y = \frac{1}{(x-2)^2}\) has a pole at \(x=2\). As \(x\) approaches 2 from either direction, \(y\) increases.#

If the zeros of \(\red{H(z)}\) tell us where frequencies are attenuated, the poles tell us where frequencies are amplified. (This is a bit of a simplification, as both poles and zeros can affect both gain and attenuation, but let’s go with it for now.)

Continuing our previous example, we’ll use the tf2zpk function to compute the zeros and poles for our filter:

zeros, poles, gain = scipy.signal.tf2zpk(b, a)

Again, this will produce an array of 9 complex numbers for poles:

array([0.6640386 -0.61973512j, 0.54573487-0.49475044j,
       0.44310112-0.35506889j, 0.36783278-0.18847207j,
       0.3394446 +0.j        , 0.36783278+0.18847207j,
       0.44310112+0.35506889j, 0.54573487+0.49475044j,
       0.6640386 +0.61973512j])

corresponding to the 9 roots of \(\cyan{A(z)}\). Like we did above with the zeros plot, we can visualize the position of the poles in the complex plane. Typically, both the poles and zeros are illustrated in the same figure, which is helpfully known as a pole-zero plot.

Poles and zeros of a type-II Chebyshev plot and corresponding frequency response curve

Fig. 12.7 Left: The poles (\(\cyan{\times}\)) and zeros (\(\red{\circ}\)) of the filter previously analyzed in Fig. 12.5. Right: the frequency response curve, now with both pole and zero angles (frequencies) marked. Note that the poles of this system are not on the unit circle, which is why the frequency response curve does not spike upward to infinity at each pole.#

Once you learn how to read pole-zero plots, they can be a great way to quickly understand the behavior of a system. For example, in Fig. 12.7, we have zeros at high frequencies (angles approaching \(\pi\)) and poles near low frequencies, so we can infer that this is a low-pass filter. A high-pass filter would exhibit the reverse behavior, with poles at larger angles and zeros at smaller angles.

The fact that zeros land exactly on the unit circle tells us that some frequencies will be attenuated severely (practically to zero).

12.4.4. Factored transfer functions#

Once we have the zeros \(z_k\) and poles \(p_k\) of a system, these can be used to write down the transfer function \(\red{H(z)}\) equivalently in a factored form:

\[\red{H(z)} = \frac{\left(z - \red{z_0}\right)\cdot \left(z - \red{z_1}\right) \dots \left(z - \red{z_{K-1}}\right)}{\left(z - \cyan{p_0}\right)\cdot \left(z - \cyan{p_1}\right) \dots \left(z - \cyan{p_{K-1}}\right)}\]

While this doesn’t change the definition of the filter — it is just another way of writing the same transfer function — this form does have some benefits.

First, it allows us to reason about high-order filters as a cascade of low-order filters:

\[\begin{align*} \red{H(z)} &= \left(\frac{z-\red{z_0}}{z-\cyan{p_0}} \right)\cdot \left( \frac{z-\red{z_1}}{z-\cyan{p_1}}\right) \dots \left( \frac{z-\red{z_{K-1}}}{z-\cyan{p_{K-1}}}\right)\\ &= H_0(z) \cdot H_1(z) \dots H_{K-1}(z). \end{align*}\]

This observation is often used to simplify the implementation of IIR filters and improve numerical stability (without changing the filter’s behavior). Most commonly, this is done by the second-order sections (SOS) method, e.g.:

# Construct a filter using second-order sections instead of
# feed-forward / feed-back coefficients b and a
sos = scipy.signal.cheby2(order, attenuation, wn, fs=fs, output='sos')

# Instead of lfilter with b and a, use sosfilt
y = scipy.signal.sosfilt(sos, x)

# Or for bidirectional filtering:
y = scipy.signal.sosfiltfilt(sos, x)

Second, it provides a way to reason about the interactions between poles and zeros. Notably, if we have a pole which is also a zero, that is, \(p_k = z_\ell\), then both of their corresponding factors can be removed from the filter without changing its behavior. Concretely, this means that if a pole and a zero coincide, then their effects cancel each other out.

12.4.5. Stability#

Let’s return to our first example of an unstable system, \(\purple{y[n]} = \blue{x[n]} + 2\cdot \purple{y[n-1]}\). This is unstable because the output can diverge when given an input signal with finite values, e.g., a unit impulse produces a sequence \(\purple{y[n]} = 2^n\) that grows without bound and never settles down.

In standard form, this system has coefficients

\[\begin{align*} \red{b} &= [1]\\ \cyan{a} &= [1, -2]. \end{align*}\]

The transfer function of this system is

\[\red{H(z)} = \frac{\red{b[0]} \cdot z^0}{\cyan{a[0]} \cdot z^0 + \cyan{a[1]} \cdot z^{-1}} = \frac{\red{1}}{\cyan{1} - \cyan{2}\cdot z^{-1}}\]

Because the numerator has degree 0, this system has no zeros. It has one pole, which we can find by solving \(1 - 2z^{-1} = 0\):

\[1 - 2z^{-1} = 0 \quad \Rightarrow \frac{2}{z} = 1 \quad \Rightarrow \cyan{z} = 2.\]

Contrast this filter with the first IIR filter we encountered in the previous chapter, the exponential moving average (11.1):

\[\purple{y[n]} = \red{\frac{1}{2}}\cdot \blue{x[n]} + \cyan{\frac{1}{2}} \cdot \blue{y[n-1]},\]

which has coefficients \(\red{b = [1/2]}\) and \(\cyan{a= [1, -1/2]}\). This filter also has no zeros, and one pole, but the pole is located at \(z = 1/2\). It turns out that the exponential moving average filter is stable: its output does not diverge for any input signal \(\blue{x[n]}\) with finite values.

So what’s different about these two examples?

It turns out that the stability of a filter depends on the location of the poles, and specifically, on their magnitude.

IIR stability

Let \(\red{H(z)}\) denote the transfer function of a linear, IIR filter with zeros \(\red{z_0, z_1, \dots, z_{K-1}}\) and poles \(\cyan{p_0, p_1, \dots, p_{K-1}}\), and assume that no pole is also a zero.

The filter is stable if all poles \(\cyan{p}\) have magnitude \(|\cyan{p}| < 1\). Or, equivalently, if all poles lie within the unit circle in the complex plane.

The filter is unstable if any pole \(\cyan{p}\) has magnitude \(|\cyan{p}| > 1\).

This fact gives us a simple test for stability of a filter:

  1. Compute the poles of the filter, e.g., by scipy.signal.tf2zpk or scipy.signal.sos2zpk.

  2. Discard any poles which are also zeros.

  3. Compute the magnitude of all remaining poles.

  4. If any pole has magnitude larger than 1, the filter is unstable.

As a corollary to this observation, note that FIR filters have no poles (since \(\cyan{A(z)= a[0] = 1 \neq 0}\)). Since they have no poles, they also have no poles with magnitude larger than 1. This is how we can say that any FIR filter is stable.

illustration of stable and unstable regions of a filter

Fig. 12.8 If all poles are contained within the unit circle, that is \(|p| < 1\) whenever \(A(p) = 0\), then the system is stable.#

12.4.6. Summary#

In this section, we’ve seen how the z-transform allows us to reason about the behavior of a filter through its transfer function. Specifically, we examined the polynomials \(\red{B(z)}\) and \(\cyan{A(z)}\) which make up the transfer function \(\red{H(z)} = \red{B(z)} / \cyan{A(z)}\). The roots of these polynomials determine when the transfer function goes to 0 (when \(\red{B(z)} = 0\), the zeros) and when it diverges to infinity (\(\cyan{A(z)} = 0\), the poles). Finally, the location of the poles of a filter (inside or outside the unit circle) determines whether the filter is stable or unstable.