The Fourier Transform

by Eric Riddoch

Nov 30, 2018

Contents

This notebook showcases the advanced mathematics used to build up the Discrete Fourier Transform algorithm. Below, I compare the Fast Fourier Transform to the Naive Approach and then use it to decompose and clean audio signals. At the end, I use the two dimensional version to demonstrate an image cleaning technique.

Part 1: The Discrete Fourier Transform

  1. SoundWave Class
  2. Note Generator
  3. Special Operators
  4. Naive Fourier Transform
  5. Fast Fourier Transform
  6. Plots
Part 2: Convolution and Filtering
  1. Circular Convolution
  2. Linear Convolution
  3. Cleaning Audio
  4. Cleaning Images

Part 1: The Discrete Fourier Transform

In [1]:
from matplotlib import pyplot as plt
import IPython
from scipy.io import wavfile
import numpy as np
from colorama import Fore
from scipy.fftpack import fft, ifft, fft2, ifft2
from scipy.signal import fftconvolve
import imageio
import math
from timeit import default_timer as time
In [2]:
plt.rcParams["figure.dpi"] = 300             # Fix plot quality.
plt.rcParams["figure.figsize"] = (12,3)      # Change plot size / aspect (you may adjust this).

1.1 SoundWave Class

Throughout this demonstration, we interact with .wav files using the following class.

Definitions,

  • SoundWave.samples - For any soundwave $f(t)$, we record the sound by sampling from $f$ at equally spaced times, $t_0 < t_1 < ... < t_{n-1}$ to obtain the vector $\textbf{f} = (f(t_0), f(t_1), ..., f(t_{n-1}))$. We store this vector in SoundWave.samples.
  • SoundWave.rate - refers to the sample rate of an audio clip, which is the constant distance $t_k - t_{k-1}$ for any $1 \le k \le n - 1$.
  • In [3]:
    class SoundWave(object):
        """A class for working with digital audio signals."""
    
        SAMPLE_MAX = 32767
        SAMPLE_MIN = -32768
        
        # constant notes 
        notes = {
            'A' : 440,
            'B' : 493.88,
            'C' : 523.25,
            'D' : 587.33,
            'E' : 659.25, 
            'F' : 698.46, 
            'G' : 783.99, 
            'A_high' : 880
        }
        
        def __init__(self, rate, samples):
            """Set the SoundWave class attributes.
    
            Parameters:
                rate (int): The sample rate of the sound.
                samples ((n,) ndarray): NumPy array of samples.
            """
            self.rate = rate
            self.samples = samples
            self.filename = None
            
        def calculate_hertz(self):
            # return essential information from fft
            n = len(self.samples)
            c_k = abs(fft(self.samples))
            k_indices = np.arange(n)
            frequencies = k_indices * self.rate / n
            return c_k, k_indices, frequencies
    
        def plot(self, title="", x_min=None, x_max=None, 
                 y_min=None, y_max=None, 
                 dft=False):
            """Plot the graph of the sound wave (time versus amplitude)."""
            domain = np.linspace(0, len(self.samples) / self.rate, 
                                 len(self.samples))
            
            if y_min is None and y_max is None:
                y_min = self.SAMPLE_MIN
                y_max = self.SAMPLE_MAX
            
            # case : two subplots
            if dft:
                fig, axes = plt.subplots(1, 2)
                n = len(self.samples)
                
                # plot soundwave against time
                axes[0].plot(domain, self.samples)
                axes[0].set_ylim(y_min, y_max)
                axes[0].set_ylabel("samples")
                axes[0].set_xlabel("seconds")
                axes[0].set_title("Sound Wave")
                
                # calculate hertz
                c_k, k_indices, frequencies = self.calculate_hertz()
                
                # plot magnitude against frequency
                axes[1].plot(frequencies[:n//2], c_k[:n//2])
                axes[1].set_ylabel("amplitude")
                axes[1].set_xlabel("frequency (Hz)")
                axes[1].set_title("Component Frequencies")
                
                if x_min is not None and x_max is not None:
                    axes[1].set_xlim(x_min, x_max)
                
                fig.suptitle(f"Fourier Transform - {title}")
                            
            # case : one subplot
            else:
                plt.plot(domain, self.samples)
                plt.ylim(self.SAMPLE_MIN, self.SAMPLE_MAX)
                plt.ylabel("samples")
                plt.xlabel("seconds")
                plt.title(title)
                
            plt.show()
    
        def export(self, filename, force=False):
            """Generate a wav file from the sample rate and samples. 
            If the array of samples is not of type np.int16, scale it before exporting.
    
            Parameters:
                filename (str): The name of the wav file to export the sound to.
            """
            samples = None
                
            # rescale?
            if self.samples.dtype != np.int16 or force:
                scaled_samples = np.int16( (self.samples * float(self.SAMPLE_MAX))
                                       / max(abs(self.samples)) )
                samples = scaled_samples
            else:
                samples = self.samples
                                          
            wavfile.write(filename, self.rate, samples)
            
        def get_note(self, sound, tol=5):
            """Given a frequency, identify the note it corresponds with."""
            
            # find the note
            for note in self.notes.keys():
                if abs(sound - self.notes[note]) < tol:
                    return note
            
        def __add__(self, other):
            """Combine the samples from two SoundWave objects.
    
            Parameters:
                other (SoundWave): An object containing the samples to add
                    to the samples contained in this object.
            
            Returns:
                (SoundWave): A new SoundWave instance with the combined samples.
    
            Raises:
                ValueError: if the two sample arrays are not the same length.
            """
            
            # componentwise add two sounds
            if len(self.samples) != len(other.samples):
                raise ValueError("DUMB!")
            
            return SoundWave(self.rate, self.samples + other.samples)
    
        def __rshift__(self, other):
            """Concatentate the samples from two SoundWave objects.
    
            Parameters:
                other (SoundWave): An object containing the samples to concatenate
                    to the samples contained in this object.
    
            Raises:
                ValueError: if the two sample rates are not equal.
            """
            
            # concatenate two sound waves
            if self.rate != other.rate:
                raise ValueError("DUMB!")
                
            return SoundWave(self.rate, np.hstack([self.samples, other.samples]))
        
        def __mul__(self, other):
            """Convolve the samples from two SoundWave objects using circular convolution.
            
            Parameters:
                other (SoundWave): An object containing the samples to convolve
                    with the samples contained in this object.
            
            Returns:
                (SoundWave): A new SoundWave instance with the convolved samples.
    
            Raises:
                ValueError: if the two sample rates are not equal.
            """
            if self.rate != other.rate:
                raise ValueError("Rates not equal")
            else:
                if len(self.samples) < len(other.samples):
                    self.samples = np.pad(self.samples,(0,len(other.samples) - len(self.samples)),'constant')
                elif len(self.samples) > len(other.samples):
                    other.samples = np.pad(other.samples,(0,len(self.samples) - len(other.samples)),'constant')
                #convolve arrays
                convolution = ifft(fft(self.samples) * fft(other.samples))
                return SoundWave(self.rate, convolution)
    
        def __pow__(self, other):
            """Convolve the samples from two SoundWave objects using linear convolution.
            
            Parameters:
                other (SoundWave): An object containing the samples to convolve
                    with the samples contained in this object.
            
            Returns:
                (SoundWave): A new SoundWave instance with the convolved samples.
    
            Raises:
                ValueError: if the two sample rates are not equal.
            """
            if self.rate != other.rate:
                raise ValueError("Rates not equal")
            else:
                n = len(self.samples)
                m = len(other.samples)
                length = n + m - 1
                a = np.log2(length)
                a = math.ceil(a)
                
                self_sam = np.pad(self.samples,(0,2**a - n),'constant')
                other_sam = np.pad(other.samples,(0,2**a - m),'constant')
                
                #convolve arrays
                convolution = ifft(fft(self_sam) * fft(other_sam))
                
                #take first n+m-1 elements
                convolution = convolution[:length]
                
                return SoundWave(self.rate, convolution)    
    
        def clean(self, low_freq, high_freq):
            """Remove a range of frequencies from the samples using the DFT. 
    
            Parameters:
                low_freq (float): Lower bound of the frequency range to zero out.
                high_freq (float): Higher boound of the frequency range to zero out.
            """
            
            # take DFT of the sound
            c_k = fft(self.samples)
            n = len(self.samples)
             
            # compute k values to erase
            k_low = int( low_freq * n / self.rate )
            k_high = int( high_freq * n / self.rate )
    
            # delete the frequencies
            c_k[k_low:k_high] = 0
            c_k[n - k_high : n - k_low] = 0
            
            # invert DFT back into sound file
            c_k = ifft(c_k)
            
            # save the new sound
            self.samples = c_k
            
        def set_file(self, filename):
            """update file associated with SoundWave instance"""
            self.filename = filename
        
        def play(self, filename=None, stereo=False):
            """embed an audio player into the notebook"""
            if stereo:
                return IPython.display.Audio(rate=self.rate, data=self.samples.T)
            if filename is None:
                return IPython.display.Audio(rate=self.rate, data=self.samples)
            else:
                return IPython.display.Audio(filename)
            
        @staticmethod
        def play_file(filename):
            return IPython.display.Audio(filename)
            
        @classmethod
        def with_file(cls, filename):
            """instantiate a SoundWave object associated with a particular .wav file"""
            rate, samples = wavfile.read(filename)
            obj = cls(rate, samples)
            obj.set_file(filename)
            return obj
    

    1.2 Note Generator

    generate_note() generates samples from a soundwave with a specified freqency for a specified amount of time.

    In [4]:
    def generate_note(frequency, duration):
        """Generate an instance of the SoundWave class corresponding to 
        the desired soundwave. Uses sample rate of 44100 Hz.
        
        Parameters:
            frequency (float): The frequency of the desired sound.
            duration (float): The length of the desired sound in seconds.
        
        Returns:
            sound (SoundWave): An instance of the SoundWave class.
        """
        
        # create a SoundWave with the given frequency
        RATE = 44100
        num_samples = duration * RATE
        domain = np.linspace(0, num_samples / RATE, num_samples)
        samples = np.sin(2*np.pi * domain * frequency)
        
        return SoundWave(RATE, samples)
    
    # 2 second A note
    generate_note(SoundWave.notes['A'], 2).play()
    
    Out[4]:

    1.3 Special Operators

    Note that in the SoundWave class we define the operators + and >>.

  • + combines two SoundWave objects of the same rate by adding their sample arrays componentwise.
  • >> appends one the sample vector of one SoundWave object to another provided they share the same rate.
  • Below, I demonstrate this by generating a three-second A minor chord (A, C, and E), and play a 'Super Mario' jingle. :D

    In [5]:
    C_Note = generate_note(SoundWave.notes['C'], 3)
    E_Note = generate_note(SoundWave.notes['E'], 3)
    A_Note = generate_note(SoundWave.notes['A'], 3)
    C_Up = generate_note(SoundWave.notes['C'] + 440, 3)
    G = 783.99
    
    # Play A, C, and E together, the arpegio is below
    (C_Note + E_Note + A_Note).play()
    
    Out[5]:
    In [6]:
    # Mario :D
    
    E = SoundWave.notes['E']
    C = SoundWave.notes['C']
    G = SoundWave.notes['G']
    
    (
        generate_note(E, .15) >> generate_note(0, .1) >>
        generate_note(E, .15) >> generate_note(0, .2) >>
        generate_note(E, .15) >> generate_note(0, .2) >>
        generate_note(C, .15) >> generate_note(0, .2) >>
        generate_note(E, .15) >> generate_note(0, .2) >>
        generate_note(G, .5) >> generate_note(0, .5)
    ).play()
    
    /anaconda3/lib/python3.6/site-packages/ipykernel_launcher.py:16: DeprecationWarning: object of type <class 'float'> cannot be safely interpreted as an integer.
      app.launch_new_instance()
    
    Out[6]: