A Visual Way to Understand the Fourier Transform

Introduction

The Fourier transform is a powerful tool that engineers and scientists use to decompose time-series data into its frequency components, taking it from the time domain into the frequency (or Fourier) domain. It is essentially a mathematical technique for transforming functions of time into functions of frequency. Let's delve deeper into how it works.

Dot Product as Similarity Measure/Matching Algorithm

Before discussing the formula for the Fourier transform, it's crucial to understand the concept of the dot product. A dot product is obtained by performing element-wise multiplication on two time-series datasets represented as vectors (or arrays of numerical values), and then summing up the results. In a geometric sense, a vector represents a data point in a geometric space such as a 2D plane and carries both magnitude and direction information. Think of it as an arrow pointing in a specific direction with a certain length. The 'coordinates' of this arrow (or vector) in a 2-dimensional space can be described as an array of numerical values, say (x, y).

Consider the dot product as a measure of similarity between two vectors. Two vectors that are close or overlapping (i.e., they point in the same or very similar directions) will have a large dot product, indicating high similarity. On the other hand, vectors that are at an angle to each other (i.e., they point in different directions) will have a smaller dot product, indicating lower similarity. Vectors that are orthogonal (i.e., at a 90-degree angle to each other) will have a dot product of zero, indicating no similarity. In short, the dot product operation returns a matching score between two signals represented as vectors.

Let's illustrate these concepts with some visual examples:
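Below is a minimal sketch, using NumPy and Matplotlib, of how such plots could be generated; the specific vectors are illustrative choices rather than the exact ones from the original figures:

    import numpy as np
    import matplotlib.pyplot as plt

    # Two illustrative pairs of 2D vectors: one nearly aligned, one orthogonal
    a1, b1 = np.array([2.0, 1.0]), np.array([1.8, 1.2])   # similar directions
    a2, b2 = np.array([2.0, 0.0]), np.array([0.0, 2.0])   # orthogonal directions

    fig, axes = plt.subplots(1, 2, figsize=(10, 4))
    for ax, (a, b), title in zip(axes, [(a1, b1), (a2, b2)],
                                 ["Similar vectors", "Orthogonal vectors"]):
        ax.quiver(0, 0, a[0], a[1], angles='xy', scale_units='xy', scale=1, color='b')
        ax.quiver(0, 0, b[0], b[1], angles='xy', scale_units='xy', scale=1, color='r')
        ax.set_xlim(-1, 3)
        ax.set_ylim(-1, 3)
        ax.set_title(f"{title}: dot product = {np.dot(a, b):.2f}")
    plt.show()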

The two plots above visually demonstrate the concept of the dot product as a measure of similarity or "closeness" between two vectors in a 2D space. 

This illustrates how the dot product can be used to measure the degree of similarity between two vectors: the larger the dot product, the more similar the vectors are in terms of their direction and magnitude. Conversely, a smaller dot product indicates less similarity, and a dot product of zero indicates no similarity at all (i.e., the vectors are orthogonal). 

Using Basis Vectors for Any Vector Representation

To express any vector more concisely, we use the concept of basis vectors. These are simply vectors that point in the X, Y, and Z directions, are perpendicular to each other, and have a length of 1. The requirement that these basis vectors be perpendicular ensures their independence, meaning a change in the magnitude along one basis vector doesn't affect the others. This independence decouples the mathematical operations for each orthogonal component of the resultant vector. The following illustrates how to construct a given 3D vector as a sum of scaled basis vectors (X, Y, Z). Each basis vector represents one direction (X, Y, or Z).

The length of each basis vector is 1 unit, and the scaling factors for each scaled vector are labeled in the plot. This demonstrates how the original vector can be expressed as a linear combination (a vector sum of scaled basis vectors) of its components along the basis vectors; each scaling factor is obtained by taking the dot product of the original vector with the corresponding basis vector.
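As a small numerical sketch of this decomposition (the vector (2, -1, 3) is an illustrative choice):

    import numpy as np

    # Standard basis vectors for 3D space
    e_x = np.array([1.0, 0.0, 0.0])
    e_y = np.array([0.0, 1.0, 0.0])
    e_z = np.array([0.0, 0.0, 1.0])

    v = np.array([2.0, -1.0, 3.0])            # an illustrative 3D vector

    # Each scaling factor is the dot product of v with the corresponding basis vector
    c_x, c_y, c_z = np.dot(v, e_x), np.dot(v, e_y), np.dot(v, e_z)

    # Reconstruct v as a linear combination of the scaled basis vectors
    v_reconstructed = c_x * e_x + c_y * e_y + c_z * e_z
    print(c_x, c_y, c_z)                       # 2.0 -1.0 3.0
    print(np.allclose(v, v_reconstructed))     # True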

Matrix-Vector Dot Product Operation and Its Relation to Eigenvalue Problem 

Any vector can be represented as a product of a matrix and a vector, denoted as AX = B. In this representation, 'A' is a 3x3 matrix containing the basis vectors for a 3D space, 'X' is a 3x1 vector that describes the scaling factors for each basis vector, and 'B' is the final vector we aim to represent.

This operation is closely related to the eigenvalue problem, expressed as AV = λV. In this equation, 'V' denotes an eigenvector, and 'λ' represents the eigenvalue or scaling factor. The equation states that transforming an eigenvector 'V' by the matrix 'A' simply scales 'V' by λ, leaving its direction unchanged.

In simpler terms, this suggests that we can form any vector in a given space by scaling each basis vector and then summing them up. This concept is fundamental to understanding many mathematical operations, including the Fourier transform.

Below is an illustration of this concept.

The figure above illustrates, side by side, the concepts of scaling a vector by a factor λ and transforming a vector by a basis matrix. On the right, the basis matrix A is an identity matrix, which means that it does not change the vector. On the left, the blue arrow represents the 3D vector (1,1,1), and the red arrow represents the original vector v scaled by a factor λ = 2. This shows that multiplying a vector by a scalar only changes its magnitude; the direction is preserved.
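As a quick numerical check of both ideas, here is a brief sketch; the identity basis matrix and the scaling factor λ = 2 mirror the figure, while the vector used for AX = B is an illustrative choice:

    import numpy as np

    # Basis matrix A whose columns are the 3D basis vectors (here the identity)
    A = np.eye(3)
    X = np.array([2.0, -1.0, 3.0])   # scaling factors for each basis vector
    B = A @ X                        # AX = B: B is built from the scaled basis vectors
    print(B)                         # [ 2. -1.  3.]

    # Eigenvalue relation AV = λV: the matrix 2*I scales every vector by λ = 2
    S = 2 * np.eye(3)
    v = np.array([1.0, 1.0, 1.0])
    print(S @ v)                     # [2. 2. 2.]  -- same direction, doubled magnitude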

Understanding the Discrete Fourier Transform: Treating it as a Matrix-Vector Dot Product Operation with Fourier Basis Vectors 

Before delving into the Fourier transform, let's revisit the concept of the dot product. Imagine having two vectors of similar lengths but pointing in different directions. How would we quantify their similarity? The dot product serves this purpose, providing a scalar quantity that indicates the degree of similarity. A high value suggests the vectors are highly similar, or even identical, while a value of 0 suggests they are orthogonal (perpendicular).

Now, armed with the understanding of the dot product, eigenvectors and eigenvalues, and the concept of basis vectors, we can comprehend the Fourier transform. Fundamentally, a Fourier transform is a matrix-vector operation that measures the similarity between a time-series dataset and Fourier basis vectors of various frequencies. The Fourier basis vectors are sinusoidal functions represented in a high dimensional space (i.e., with N >> 3). Any time series data can be expressed as a linear combination of these Fourier basis vectors.

The coefficients of the Fourier transform (the dot products of the Fourier basis vectors with the entire signal) describe how similar the time-series data is to each particular Fourier basis. In essence, the Fourier transform decomposes a time-series dataset into its frequency components, using the dot product operation with respect to sinusoidal bases of different frequencies.

Practical Implementation of Discrete Fourier Transform

In practice, a signal is represented by a discretized array of numerical values (a vector), often high dimensional with N >> 3. The matrix (i.e., the Discrete Fourier Transform, or DFT, matrix) contains the various Fourier bases that we wish to examine in the Fourier domain. Element-wise multiplication of each row (each Fourier basis vector) with the input signal vector, followed by summing up the results, is precisely what a dot product operation entails. When this operation is carried out for every row of the DFT matrix with respect to the input vector, we obtain an output vector with the same dimension as the input vector. This output vector provides the Fourier coefficients, or measures of similarity, for each Fourier basis vector.

Let's illustrate this with a visual example:

In this example, we first create a simple square wave. We then apply the Fourier transform to it using a matrix-vector product, obtaining a complex-valued output. We take the absolute value of each element in the Fourier-transformed data to get the magnitude spectrum. Finally, we plot both the original signal and its magnitude spectrum and compare the result with the built-in FFT function.
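A minimal reconstruction of this example is sketched below, assuming NumPy and Matplotlib; the square wave's length (64 samples) and period are illustrative choices rather than the exact values used originally:

    import numpy as np
    import matplotlib.pyplot as plt

    N = 64
    n = np.arange(N)
    x = np.where((n % 16) < 8, 1.0, -1.0)          # a simple square wave

    # DFT matrix: M[k, n] = exp(-2j*pi*k*n/N); each row is one Fourier basis vector
    k = n.reshape((N, 1))
    M = np.exp(-2j * np.pi * k * n / N)

    X = M @ x                                       # matrix-vector product = DFT
    print(np.allclose(X, np.fft.fft(x)))            # True: matches the built-in FFT

    fig, axes = plt.subplots(1, 2, figsize=(10, 3))
    axes[0].plot(n, x)
    axes[0].set_title("Square wave (time domain)")
    axes[1].stem(n, np.abs(X))
    axes[1].set_title("Magnitude spectrum |X[k]|")
    plt.show()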

Here's a visual of how this matrix-vector product works in practice:

In this Python example, we first create a square wave. We then compute the Discrete Fourier Transform (DFT) matrix. The DFT matrix is multiplied by the square wave (input vector) to obtain the Fourier coefficients.

The square wave, the DFT matrix (visualized as an image), and the Fourier coefficients are then plotted.
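The sketch below shows one way to build and visualize the DFT matrix, here using scipy.linalg.dft for brevity (the original code may have constructed the matrix explicitly); the square wave is the same illustrative signal as above:

    import numpy as np
    import matplotlib.pyplot as plt
    from scipy.linalg import dft

    N = 64
    n = np.arange(N)
    x = np.where((n % 16) < 8, 1.0, -1.0)   # the same illustrative square wave

    M = dft(N)                               # 64x64 complex-valued DFT matrix
    X = M @ x                                # Fourier coefficients

    fig, axes = plt.subplots(1, 3, figsize=(12, 3))
    axes[0].plot(n, x)
    axes[0].set_title("Square wave")
    axes[1].imshow(M.real, cmap="gray")      # visualize the real part of the DFT matrix
    axes[1].set_title("DFT matrix (real part)")
    axes[2].stem(n, np.abs(X))
    axes[2].set_title("Fourier coefficients |X[k]|")
    plt.show()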

This example demonstrates how the Fourier Transform decomposes a time-domain signal into its frequency components. It's a visualization of the mathematical operations involved in the Fourier Transform and the results of applying it to a square wave signal.

A Close Examination of DFT Matrix


The Discrete Fourier Transform (DFT) matrix, as visualized in the enlarged image above, is essentially a collection of Fourier basis vectors. Each row in this matrix represents a different frequency component, and the entire matrix forms a basis set for representing functions in the frequency domain.

The process of applying the DFT to a signal corresponds to taking the dot product of the signal with each of the Fourier basis vectors in the DFT matrix. This dot product operation computes the projection of the signal onto each basis vector, effectively measuring the "similarity" or "contribution" of each frequency component to the overall signal. As we go down the DFT matrix, each row, or basis, in this image would, if plotted on its own, look like a cosine wave of increasing frequency. Let's zoom in and look at the first several rows of the DFT matrix, plotting just the real part for illustration purposes.
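One possible way to produce such a plot, assuming a 64-point DFT matrix:

    import numpy as np
    import matplotlib.pyplot as plt

    N = 64
    n = np.arange(N)
    k = n.reshape((N, 1))
    M = np.exp(-2j * np.pi * k * n / N)       # 64x64 DFT matrix

    # Plot the real part (a cosine) of the first 6 rows; frequency rises with the row index
    fig, axes = plt.subplots(6, 1, figsize=(8, 9), sharex=True)
    for row, ax in enumerate(axes):
        ax.plot(n, M[row].real)
        ax.set_ylabel(f"Row {row + 1}")
    plt.xlabel("Sample index n")
    plt.show()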

As we move from Row 1 to Row 6, we can see that the frequency of the sinusoids increases. This reflects the fact that the DFT matrix includes basis functions that correspond to a range of frequencies, allowing us to decompose a signal into its frequency components. Even though we only plotted the first 6 rows, the DFT matrix actually contains 64 rows, each corresponding to a different frequency. The higher frequency components are not shown in this plot. 

To fully understand this, let's consider the DFT operation as a matrix-vector multiplication:

Y = MX

Here, X is our time-domain signal, M is the DFT matrix, and Y is the output of the DFT (the Fourier coefficients). Each element in the output vector Y, say Yk, is computed as the dot product of the k-th row in the DFT matrix M and the input vector X. This can be represented as: 

Yk = Mk · X

This dot product operation effectively measures how much the signal X "aligns" with the k-th Fourier basis vector Mk. The result, Yk, represents the contribution of the frequency corresponding to Mk in the overall signal. A high value of Yk suggests that the frequency corresponding to Mk is a significant component of the signal X, while a low value suggests that the frequency is not present or is only a minor component of the signal. 
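This relationship is easy to verify numerically; the sketch below uses the illustrative 64-sample square wave from earlier and an arbitrary frequency index k:

    import numpy as np

    N = 64
    n = np.arange(N)
    x = np.where((n % 16) < 8, 1.0, -1.0)       # square wave signal X (time domain)

    k = 4                                        # one frequency index (here the square wave's fundamental)
    M_k = np.exp(-2j * np.pi * k * n / N)        # k-th Fourier basis vector (k-th DFT row)

    Y_k = np.dot(M_k, x)                         # single Fourier coefficient as a dot product
    print(np.isclose(Y_k, np.fft.fft(x)[k]))     # True: same as the k-th FFT output
    print(abs(Y_k))                              # magnitude = contribution of this frequency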


Hence, the Fourier Transform operation, as performed by the Discrete Fourier Transform (DFT), can be conceptualized as a sequence of dot product operations. These operations are carried out between the input signal and a set of Fourier basis vectors. The results of these operations yield a measure of how much each frequency component contributes to the overall signal. To put it succinctly, the DFT operation can be thought of as a template matching algorithm: the Fourier basis vectors serve as the templates, and the dot product operation quantifies the similarity or 'match' between the input signal vector and each template vector.


Let's illustrate visually how this works.

Note: It's important to know that the sinusoidal basis is expressed as complex exponentials, so the dot product is taken between a real signal and a complex exponential. We will illustrate each complex exponential by its real and imaginary parts in the example below.
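The sketch below outlines how such plots could be produced; the square wave parameters are illustrative assumptions carried over from the earlier examples:

    import numpy as np
    import matplotlib.pyplot as plt

    N = 64
    n = np.arange(N)
    x = np.where((n % 16) < 8, 1.0, -1.0)            # illustrative square wave

    fig, axes = plt.subplots(8, 1, figsize=(8, 12), sharex=True)
    for k, ax in enumerate(axes):
        basis = np.exp(-2j * np.pi * k * n / N)      # k-th complex exponential basis
        coeff = np.dot(basis, x)                     # dot product with the real signal
        ax.plot(n, basis.real, label="real part")
        ax.plot(n, basis.imag, label="imaginary part")
        ax.set_title(f"Row {k + 1}: |dot product| = {abs(coeff):.1f}")
    axes[0].legend(loc="upper right")
    plt.tight_layout()
    plt.show()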

The plots above illustrate the process of obtaining the Fourier coefficients of a square wave signal by performing a dot product operation between the input signal and the complex sinusoidal basis functions corresponding to the first 8 rows of the DFT matrix.

As can be seen, the basis functions that match the frequency components present in the square wave (i.e., the fundamental frequency and its odd harmonics) result in larger dot product magnitudes, reflecting the fact that these frequencies are the ones that contribute the most to the square wave signal. Conversely, the basis functions that do not match the frequency components in the square wave result in a dot product magnitude close to zero, indicating a small contribution to the signal. 

Summary

In conclusion, the DFT operation measures the similarity between a signal and a set of sinusoidal templates (the Fourier basis vectors) of different frequencies. The Fourier coefficients (the results of the DFT) represent the contribution of each frequency component to the overall signal.