3.6. Linearity and Shift-invariance#

We’ll end this chapter with a discussion of two of the more advanced properties of convolutional filters: linearity and shift-invariance. These are general concepts which can be applied to more than just convolutional filters, but for now, we’ll focus on the specific case of convolution.

Taken together, these two conditions (linearity and shift-invariance) are often referred to as LSI, and a system satisfying these properties is called an LSI system.

These properties might seem abstract for now, but they will become important later on when we need to analyze the behavior of filters by using the Fourier transform.

3.6.1. Shift-invariance#

The idea of shift-invariance is that a delay can be applied either before or after a system \(g\), and the result will be the same.

Formally, let \(\Delta\) denote a \(d\)-step delay filter for some fixed but arbitrary delay \(d \in \mathbb{N}\):

\[ \Delta = [\underbrace{0, 0, \dots, 0}_{d \text{ times}}, 1] \]

so that convolving a signal \(x\) with \(\Delta\) yields

(3.4)#\[(\Delta * x)[n] = x[n-d].\]
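
To make this concrete, here is a minimal sketch of the delay filter in code, using np.convolve; the helper name make_delay is a hypothetical choice of our own, not a standard function:

import numpy as np

def make_delay(d):
    # Impulse response of a d-step delay:
    # d zeros followed by a single 1.
    delta = np.zeros(d + 1)
    delta[d] = 1
    return delta

x = np.array([1.0, 2.0, 3.0, 4.0])

# Convolving with a 2-step delay filter shifts x two samples later.
print(np.convolve(make_delay(2), x))   # [0. 0. 1. 2. 3. 4.]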

Definition 3.2 (Shift invariance)

A filter \(g\) is shift-invariant if for all delays \(d\) (implemented by delay filter \(\Delta\)) and all input signals \(x\), the following is true:

\[\red{g}(\purple{\Delta} * \blue{x}) = \purple{\Delta} * \red{g}(\blue{x}).\]

In plain language, Definition 3.2 says that if we delay \(x\) by \(d\) samples and then process the delayed signal by the system \(g\), we will get the same result as if we had first applied \(g\) to \(x\) (without delay) and then delayed the result.

Theorem 3.1 (Convolution is shift-invariant)

Let \(g(x) = h * x\) be a convolutional system for some impulse response \(h\) and any input signal \(x\).

Any such \(g(x)\) is shift-invariant, and satisfies Definition 3.2.

Proof. We need to verify that

\[\red{g}(\purple{\Delta} * \blue{x}) = \purple{\Delta} * \red{g}(\blue{x}).\]

Using the assumption that \(g\) is convolutional, we can appeal to the associative and commutative properties stated in the previous section:

\[\begin{align*} \red{g}(\purple{\Delta} * \blue{x}) &= \red{h} * (\purple{\Delta} * \blue{x}) & \text{ apply }g\text{ to delayed input}\\ &= (\red{h} * \purple{\Delta}) * \blue{x} & \text { associative rule}\\ &= (\purple{\Delta} * \red{h}) * \blue{x} & \text { commutative rule}\\ &= \purple{\Delta} * (\red{h} * \blue{x}) & \text { associative rule}\\ &= \purple{\Delta} * \red{g}(\blue{x}) & \text { definition of }g. \end{align*}\]

This gives us a shortcut to showing shift-invariance: if you can implement a system \(g\) in terms of convolution, it automatically satisfies shift-invariance.
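
We can also sanity-check Theorem 3.1 numerically; the specific impulse response, signal, and delay below are arbitrary choices:

import numpy as np

h = np.array([0.5, 0.25, 0.25])            # an arbitrary impulse response
x = np.array([1.0, -1.0, 2.0, 0.0, 3.0])   # an arbitrary signal
delta = np.array([0.0, 0.0, 1.0])          # a 2-step delay filter

# Delay first, then filter...
y1 = np.convolve(h, np.convolve(delta, x))
# ...or filter first, then delay.
y2 = np.convolve(delta, np.convolve(h, x))

print(np.allclose(y1, y2))   # True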

3.6.2. Linearity#

Linearity is another important characteristic of many systems, including convolution. Broadly speaking, it encapsulates our notions of gain and mixing of signals.

Definition 3.3 (Linearity)

A system \(\red{g}\) is linear if for any two \(\blue{\text{signals}}\) \(\blue{x_1}\) and \(\blue{x_2}\), and for any pair of \(\purple{\text{numbers}}\) \(\purple{c_1}\) and \(\purple{c_2}\), the following holds:

\[\red{g}(\purple{c_1} \cdot \blue{x_1} + \purple{c_2} \cdot \blue{x_2}) = \purple{c_1} \cdot \red{g}(\blue{x_1}) + \purple{c_2} \cdot \red{g}(\blue{x_2}).\]

This might look like a tangle of symbols, so it helps to think about a few special cases. For instance, if we let \(\blue{x_2}\) be a silent (all zeros) signal and set \(\purple{c_2} = 0\), then we get the simpler case

\[\red{g}(\purple{c_1} \cdot \blue{x_1}) = \purple{c_1} \cdot \red{g}(\blue{x_1}),\]

which is also known as homogeneity. This says that gain can be applied before or after the system \(\red{g}\) with no difference in the output.

Similarly, if we take \(\purple{c_1 = c_2 = 1}\), then we get

\[\red{g}(\blue{x_1 + x_2}) = \red{g}(\blue{x_1}) + \red{g}(\blue{x_2}),\]

which is also known as additivity. This says that signals can be mixed before applying \(\red{g}\), or after, and the result will be the same. We’ve already seen this particular case before, when discussing how convolution distributes over addition.

Definition 3.3 combines these two ideas into one compact form.

Theorem 3.2 (Convolution is linear)

Let \(g(x) = h * x\) be a convolutional system for some impulse response \(h\).

Then \(g(x)\) satisfies Definition 3.3 and is a linear system.

To prove Theorem 3.2, let \(h\) be an impulse response, let \(x_1\) and \(x_2\) denote arbitrary signals, and let \(c_1, c_2\) denote arbitrary numbers.

We could prove that \(g\) is linear by brute force, using the definition of convolution and working through the algebra directly. Many people will find this exercise somewhat tedious, but it is possible.

We’ll take a slightly different route here though, relying on facts we already know about convolution.

Proof. Recall from our earlier discussion of gain that we can implement it as a convolutional system by creating an impulse response with only one element. Let \(C_1 = [c_1]\) and \(C_2 = [c_2]\), so that

\[\begin{align*} \purple{c_1} \cdot \blue{x_1[n]} &= (\purple{C_1} * \blue{x_1})[n]\\ \purple{c_2} \cdot \blue{x_2[n]} &= (\purple{C_2} * \blue{x_2})[n]. \end{align*}\]

Then, we can show linearity in terms of associative, commutative, and distributive properties:

\[\begin{align*} \red{g}(\purple{c_1} \cdot \blue{x_1} + \purple{c_2} \cdot \blue{x_2}) &= \red{h} * (\purple{c_1} \cdot \blue{x_1} + \purple{c_2} \cdot \blue{x_2}) & \text{ apply }g(x) = h*x\\ &= \red{h} * (\purple{C_1} * \blue{x_1} + \purple{C_2} * \blue{x_2}) & \text { use definition of } C_1, C_2\\ &= (\red{h} * \purple{C_1} * \blue{x_1}) + (\red{h} * \purple{C_2} * \blue{x_2}) & \text { distributive rule}\\ &= (\purple{C_1} * \red{h} * \blue{x_1}) + (\purple{C_2} * \red{h} * \blue{x_2}) & \text { commutative rule (twice)}\\ &= \purple{C_1} * (\red{h} * \blue{x_1}) + \purple{C_2} * (\red{h} * \blue{x_2}) & \text { associative rule (twice)}\\ &= \purple{C_1} * \red{g}(\blue{x_1}) + \purple{C_2} * \red{g}(\blue{x_2}) & \text { definition of }g\\ &= \purple{c_1} \cdot \red{g}(\blue{x_1}) + \purple{c_2} \cdot \red{g}(\blue{x_2}) & \text { definition of }C_1, C_2\\ \end{align*}\]

Note that the use of convolution to implement gain here is mainly for convenience: having all operations expressed as either sums or convolutions buys us a bit of notational simplicity. That said, it is still completely valid to prove this from first principles, but the derivation would be longer.
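
As with shift-invariance, we can sanity-check Theorem 3.2 numerically; all of the values below are arbitrary choices:

import numpy as np

h = np.array([1.0, -0.5, 0.25])       # an arbitrary impulse response
x1 = np.array([1.0, 2.0, 0.0, -1.0])  # two arbitrary signals
x2 = np.array([0.5, 0.0, 3.0, 1.0])
c1, c2 = 2.0, -3.0                    # two arbitrary gains

# Mix first, then filter...
y1 = np.convolve(h, c1 * x1 + c2 * x2)
# ...or filter first, then mix.
y2 = c1 * np.convolve(h, x1) + c2 * np.convolve(h, x2)

print(np.allclose(y1, y2))   # True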

3.6.3. Systems which are not LSI#

Not all filtering operations you might want to do satisfy the LSI conditions. Some might be only linear, some might be only shift-invariant, and some might be neither. Identifying these properties can be helpful for understanding both how a particular system behaves, and how it can be combined with others. In particular, since all convolutional systems are LSI, we can infer that any system which is not LSI cannot be implemented by a convolution.

There are many systems which are shift-invariant, but not linear. Generally speaking, any system which operates on each sample value independently (like gain) will be shift-invariant. Linearity will then depend on how the sample values are processed.

Example 3.1 (Clipping)

A clipping system limits sample values so that the output cannot be less than some minimum value \(v_-\) or greater than a maximum value \(v_+\). In equations, this looks like:

\[\begin{split} y[n] = \begin{cases} v_+ & x[n] \geq v_+\\ v_- & x[n] \leq v_-\\ x[n] & \text{otherwise} \end{cases}. \end{split}\]

Equivalently, in code this is expressed as:

import numpy as np

def clip(x, vmin, vmax):
    # This behavior is implemented by np.clip,
    # but we provide a full implementation
    # here for reference.

    N = len(x)
    y = np.zeros(N)

    for n in range(N):
        if x[n] >= vmax:
            y[n] = vmax
        elif x[n] <= vmin:
            y[n] = vmin
        else:
            y[n] = x[n]
    return y

Fig. 3.8 Top: an input signal \(x\). Bottom: the result of clipping \(x\) to lie between \(v_-\) and \(v_+\) (dashed lines).#

Remember that if a system is linear, it must be linear for all signals \(x_1,x_2\) and scalars \(c_1,c_2\). To show that clipping is not linear, we only need to find one counter-example case: a setting of \(x_1, x_2, c_1, c_2\) for which it fails to hold.

Imagine taking a signal \(\blue{x_1[n]} = v_+ > 0\) (for all \(n\)), letting \(\blue{x_2 = 0}\), and setting \(\purple{c_1=2}\). Then

\[\red{g}(\purple{c_1} \cdot \blue{x_1}) = \red{g}(\purple{2} \cdot \blue{v_+}) = v_+\]

but this is not equal to \(\purple{c_1} \cdot \red{g}(\blue{x_1}) = \purple{2} \cdot \blue{v_+}\). We can then conclude that the system cannot be linear because it does not preserve gain.
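
We can also see this failure concretely with the clip function defined above; the choices \(v_+ = 1\), \(v_- = -1\), and \(c_1 = 2\) are arbitrary:

import numpy as np

vmin, vmax = -1.0, 1.0
x1 = np.full(5, vmax)   # x1[n] = v_+ for all n
c1 = 2.0

print(clip(c1 * x1, vmin, vmax))   # [1. 1. 1. 1. 1.]
print(c1 * clip(x1, vmin, vmax))   # [2. 2. 2. 2. 2.]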

In general, showing that a system is not linear requires some insight about how the system operates. In this case, we exploited the fact that the behavior of \(g\) changes when the input is above \(v_+\), and used that to construct a counter-example. This type of argument often seems obvious in hindsight, but creating counter-examples is a skill that takes practice to develop.

Just as in the previous case, there are many systems which are linear, but not shift-invariant. Here, the key thing to look for is dependence on the sample indices.

Example 3.2 (Time-reversal)

A time-reversal system does exactly what it sounds like: it plays the signal backward. Mathematically, this is done by swapping the sample at index \(n\) with the one at index \(N-1-n\), where \(N\) is the length of the signal:

\[y[n] = x[N - 1 - n]\]

or in code,

import numpy as np

def reverse(x):
    # This could be equivalently done with the one-liner:
    #   return x[::-1]

    N = len(x)
    y = np.zeros(N)

    for n in range(N):
        y[n] = x[N-1-n]
    return y

Note that the previous example (clipping) operates independently on each sample: it does not need access to the entire signal at once to work correctly. The time-reversal system, however, does need to see the entire signal in advance.

To show that time-reversal is not shift-invariant, we again must construct a counter-example. Here the key property that we’re looking at is symmetry (in time), so we should probably consider signals which do not look the same forwards as backwards. Sinusoids and square waves probably won’t work, but a sawtooth will do nicely.

After that, we need only check what happens when we delay the input by some number of samples. Here, we’ll take a 5-step delay, but any positive number would do.


Fig. 3.9 Top: a signal \(x[n]\) and a 5-sample delayed copy. Bottom: applying time-reversal before or after delaying the input produces different output signals.#

Fig. 3.9 shows that changing the order of operations (delay first or reverse first) matters, since the two signals in the bottom plot are not identical. From this example, we can conclude that time-reversal is not shift-invariant.
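
We can also verify this numerically with the reverse function defined above; the ramp signal and the 5-step delay are arbitrary choices:

import numpy as np

x = np.arange(8, dtype=float)   # a short ramp, not symmetric in time
delta = np.zeros(6)
delta[5] = 1                    # a 5-step delay filter

# Delay first, then reverse...
y1 = reverse(np.convolve(delta, x))
# ...or reverse first, then delay.
y2 = np.convolve(delta, reverse(x))

print(np.array_equal(y1, y2))   # False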

3.6.4. Summary#

What we’ve seen here is that all convolutional systems are both linear and shift-invariant, and if a system fails to satisfy either of these conditions, then it cannot be implemented by convolution.

The connection here is even stronger than that: all LSI systems are convolutions as well. Proving that this is true is beyond the scope of this text, and requires some slightly more sophisticated mathematics. However, the equivalence between convolutions and LSI systems is one of the most powerful concepts in all of signal processing.