The objective of this post is to verify the convolution theorem on 2D images. I will follow a practical, experiment-based verification.

In mathematics, the convolution theorem states that under suitable conditions the Fourier transform of a convolution is the pointwise product of Fourier transforms. In other words, convolution in one domain (e.g., time domain) equals point-wise multiplication in the other domain (e.g., frequency domain).

**Figure 1. Convolution Theorem**

The Fourier transform (FT) calculates the frequency domain representation of a spatial domain signal, while the inverse Fourier transform (IFT) does the opposite: given the frequency domain representation of a signal, it calculates its spatial domain representation.

**Figure 2. Fourier Transform and Inverse Fourier Transform**

The convolution theorem is not only valid for 1D signals, but also for 2D signals. This makes it very useful for many image processing and computer vision applications. Figure 3 shows two example images in the spatial domain (we are used to seeing images in that domain :) ) and their corresponding representations in the frequency domain.

**Figure 3. Examples of two images in the spatial and frequency domains**

In our verification experiment, we will apply it to three different images with different levels of detail: high, medium, and low. The reason for this will become clear at the end of this post.

Firstly, for an input image I, let's create a black image W of the same size as I, with a small white rectangle in the middle.
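As a concrete sketch of such a mask (this is illustrative NumPy code, not the author's MATLAB code; the function name and the 30% default are my own choices), W could be built like this:

```python
import numpy as np

def make_mask(shape, frac=0.3):
    """Black image with a centred white rectangle whose sides are `frac` of the image's."""
    H, Wd = shape
    mask = np.zeros(shape)
    h, w = int(H * frac), int(Wd * frac)       # rectangle dimensions
    r0, c0 = (H - h) // 2, (Wd - w) // 2       # top-left corner, so it sits in the middle
    mask[r0:r0 + h, c0:c0 + w] = 1.0
    return mask
```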

For the image I, firstly, we will get the result of **multiplying the frequency domain of I by W**:

- Apply the Fourier transform to I; let the result be F
- Calculate the point-wise multiplication of F and W; let the result be F×W
- Get the inverse Fourier transform of F×W; let this result be R1
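The three steps above can be sketched in NumPy (an illustrative sketch, not the author's MATLAB code). One practical detail the list glosses over: since the white rectangle of W sits in the middle of the image, the frequency representation must be shifted with `fftshift` so the zero-frequency term is also at the centre before the point-wise multiplication:

```python
import numpy as np

def process_one(I, W):
    """R1 = IFT( FT(I) * W ), with the DC term shifted to the centre to match W."""
    F = np.fft.fftshift(np.fft.fft2(I))        # frequency domain of I, DC at centre
    R1 = np.fft.ifft2(np.fft.ifftshift(F * W)) # point-wise multiply, undo shift, invert
    return np.real(R1)                         # I is real, so R1 is real up to round-off
```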

Now, let's do the inverse process. For the same image I, we will get the result of the **convolution of the spatial domain of I and the inverse Fourier transform of W**:

- Apply the inverse Fourier transform to W; let the result be M
- Calculate the convolution of I and M; let the result be R2
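A minimal sketch of this second process (again NumPy, not the author's MATLAB code). Note that for the discrete Fourier transform the theorem holds exactly for *circular* (wrap-around) convolution, which is what this direct sum implements:

```python
import numpy as np

def process_two(I, W):
    """R2 = circular convolution of I with M = IFT(W), where W is a centred mask."""
    M = np.fft.ifft2(np.fft.ifftshift(W))      # spatial-domain filter kernel
    H, Wd = I.shape
    R2 = np.zeros((H, Wd), dtype=complex)
    for u in range(H):                         # direct circular convolution sum
        for v in range(Wd):
            R2 += I[u, v] * np.roll(np.roll(M, u, axis=0), v, axis=1)
    return np.real(R2)
```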

As described earlier, the convolution theorem establishes that the two processes described above for obtaining R1 and R2 are equivalent, so R1 and R2 should be the same image at the end. If we observe that, we have verified the convolution theorem on 2D images.
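Putting the two processes together, here is a small self-contained check (on a random 16×16 test image, not the post's photos) that R1 and R2 indeed come out the same:

```python
import numpy as np

rng = np.random.default_rng(0)
I = rng.random((16, 16))                       # small random test image

# W: black image with a centred white rectangle (roughly 30% of each dimension)
W = np.zeros((16, 16))
W[6:11, 6:11] = 1.0

# Process 1: point-wise multiplication in the frequency domain
F = np.fft.fftshift(np.fft.fft2(I))
R1 = np.real(np.fft.ifft2(np.fft.ifftshift(F * W)))

# Process 2: circular convolution with M = IFT(W) in the spatial domain
M = np.fft.ifft2(np.fft.ifftshift(W))
R2 = np.zeros((16, 16), dtype=complex)
for u in range(16):
    for v in range(16):
        R2 += I[u, v] * np.roll(np.roll(M, u, axis=0), v, axis=1)
R2 = np.real(R2)

print(np.allclose(R1, R2))                     # expect True: the two results match
```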

In the first process, we point-wise multiply the frequency domain representation of the input image by a black image with a small white rectangle in the middle. This eliminates high frequencies and keeps low frequencies, so the spatial domain of the resulting image should be a blurred version of the input image, because the high frequencies of that input image are eliminated.

The smaller the white rectangle in the middle of W, the more frequencies we remove, and hence, the more blurred the resulting image. The following figures 4 and 5 show the R1 and R2 results for an input image of high detail. (It's from the Microsoft PhD Summer School 2015 that I was invited to at Cambridge University.) As is clear from the figures, R1 and R2 are equivalent, which verifies the convolution theorem in 2D. In both results, the dimensions of the white rectangle in W are 30% of the image dimensions.

**Figure 4. R1 for a high detail image; the dimensions of the white rectangle in W are 30% of the image dimensions**

**Figure 5. R2 for a high detail image; the dimensions of the white rectangle in W are 30% of the image dimensions**

For the same high detail image, let's visualize R1 and R2 when the white rectangle of W is smaller, with dimensions of 10% of the image dimensions instead of 30%. The resulting image becomes more blurred, as expected, as shown in figures 6 and 7. Conversely, when the white rectangle of W is larger, the level of detail preserved in R1 and R2 increases, i.e., more high frequency components of I are preserved.
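This effect is easy to check numerically as well. The sketch below (on a random test image, not the post's photos; the helper name is my own) low-pass filters the same image with a 30% mask and a 10% mask; the 10% result retains less energy around its mean, i.e., it is smoother:

```python
import numpy as np

def lowpass(I, frac):
    """Keep only the central `frac` fraction of frequencies (centred rectangle mask)."""
    H, Wd = I.shape
    mask = np.zeros((H, Wd))
    h, w = max(1, int(H * frac)), max(1, int(Wd * frac))
    r0, c0 = (H - h) // 2, (Wd - w) // 2
    mask[r0:r0 + h, c0:c0 + w] = 1.0
    F = np.fft.fftshift(np.fft.fft2(I))
    return np.real(np.fft.ifft2(np.fft.ifftshift(F * mask)))

I = np.random.default_rng(1).random((64, 64))
v30 = np.var(lowpass(I, 0.3))                  # variance with the 30% mask
v10 = np.var(lowpass(I, 0.1))                  # variance with the 10% mask
print(v10 < v30)                               # smaller mask -> less variation (more blur)
```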

**Figure 6. R1 for a high detail image; the dimensions of the white rectangle in W are 10% of the image dimensions**

**Figure 7. R2 for a high detail image; the dimensions of the white rectangle in W are 10% of the image dimensions**

Keeping the dimensions of the white rectangle in W at 30% of the image dimensions, let's now use an input image I of low detail instead of the high detail image we used above. The result is shown in Figure 8.

**Figure 8. R1 for a low detail image; the dimensions of the white rectangle in W are 30% of the image dimensions**

As we can see, using the same W, the result in figure 8 looks less blurred than the result in figure 4, because the input image in the figure 8 experiment contains less detail to begin with. For the same white rectangle size in W, an image of high detail loses more information than a low detail image (i.e., looks more blurred): the high frequency components of a highly detailed image carry more information than those of a less detailed image, so the highly detailed image loses more information when the same low pass filter is applied to both images.

Please note that the obtained spatial domain of W represents the well-known low pass filter widely used in image processing. Also, my MATLAB code on the MathWorks FileExchange website conducts more experiments than I've presented here. Figure 9 shows a last example: the result for an input image with a moderate level of detail.

**Figure 9. R2 for a medium detail image; the dimensions of the white rectangle in W are 10% of the image dimensions**