Convolution

Convolution

Task: 1-D Convolution of two functions: an input signal of the type $$x[i] = \sin(2\pi \cdot \frac{i}{10}) + \sin(2\pi \cdot \frac{i}{50}) + \sin(2\pi \cdot \frac{i}{100})$$ and a Gaussian mask (un-normalised) of the type $m[i] = e^{- \frac{(i-15)^2}{8}}$

This code is feature complete but I am yet to make some linting changes.


Unlike last time, I'm going to calculate the input and mask vectors at compile time - so I don't fry my laptop at runtime. Found out I can't do this yet. std::sin and std::exp - which are used by the signal generator functions actually can only be used at runtime. Maybe in cpp26 - but we'll see.

Results

Here is a view of the input signal $x[i] = \sin(2\pi \cdot \frac{i}{10}) + \sin(2\pi \cdot \frac{i}{50}) + \sin(2\pi \cdot \frac{i}{100})$

"Input Signal"

Gaussian Mask array where $S$ is the sum to normalise the function: $m[i] = \frac{1}{S} e^{- \frac{(i-15)^2}{8}}$

"Gaussian Mask"

Output array $y[n] = \sum_{k=0}$

"Output Signal"

What is the convolution?

I think the Wikipedia page has the best visual explanation, but I'll explain it here as well.

  1. Convolution shows how one signal can be used to shape another signal.
  2. Convolution in the time-domain (as done above) is equivalent to multiplying in the frequency domain.

I find it is more intuitive to understand this in the frequency domain. Every signal is made up of several component frequencies. Now if we want to better hear a specific frequenc(ies) we need to filter out what we don't want: the noise. The simplest way to do this mathematically is to nullify the noise and, if necessary, amplify our target signal. If we represent the components of the signal as elements of a matrix, this is the Hadamard product.

This process of 'nullifying' and 'amplifying' a signal is done by transforming the signal into its frequency components and multiplying those components by a frequency-domain mask. It is intuitive therefore that the mask can be a 'box-shaped' signal where the peak portion (called the support) is multiplied with the target frequency component(s) and the zero portion with the noise.

This box-shaped signal (called box function) becomes $sinc(t) = \frac{\sin(\pi t)}{\pi t}$ in the time domain. More practically, since the $sinc(t)$ function theoretically spans upto infinity, we use a Gaussian mask (pictured above).


As a sidenote, in the above explanation notice that convolution (from a compute POV) is similar to matrix multiplication. The difference being that the Hadamard product is a simpler, element-by-element multiplication - not full matrix multiplication.


Addendum - I messed up

When I first ran this code, I got the following output signal:

"WRONG Output Signal"

The value here goes to zero at index $8$ and then stays there. This didn't seem right and after a lot of head scratching, I found my error. In convolution.metal, I had initially done this: First, we do a bounds check:

if (thread_id >= input_width) {
  return;
}
const uint center = static_cast<int>(mask_width/2);

All okay so far. Then, in the logic, I had:

for (uint i = 0u; i < mask_width; i++) {
  const int offset = static_cast<int>(i) - center; // ! Type mismatch
  // ...
}

Subtracting a uint from an int makes both vars uints before the operation. The fix? Just change the type of the center to int.

Okay, but that doesn't explain why values become zero after index $8$. Here's the issue, in my original main.cc file, I had:

pCommandEncoder->dispatchThreads(numGroups, threadsPerThreadgroup);

So dispatchThreads sets the number of threads per grid. But the logic I used was for threadsPerThreadgroup. To fix this, I've changed the numGroups to:

pCommandEncoder->dispatchThreads(MTL::Size(INPUT_SIZE, 1, 1), threadsPerThreadgroup);