Digital Signal Processing Overview

R. Port, October 3, 2007

I. Digital Representation of a Waveform

A digital signal processing system takes a continuous sound wave as input and feeds it through an analog low-pass filter (an anti-aliasing filter) to remove all frequencies above half the sampling rate (see Nyquist's sampling theorem).  The Analog-to-Digital Converter (ADC) performs this filtering, samples the wave amplitude at equally spaced time intervals, and generates a simple list of ordered sample values in successive memory locations in the computer. The sample values, representing amplitudes, are encoded using some number of bits that determines how accurately the samples are measured.  The Processor is a computer that applies numerical operations to the sampled waves. In the figure below, it appears the processor has low-pass filtered the signal, thus removing the jumpy irregularities in the input wave. When the signal is converted back into an audio signal by the Digital-to-Analog Converter (DAC), there will be jagged irregularities (shown below on this page) that are quantization errors. These lie above the Nyquist frequency, so the new analog signal (back in real time) needs another analog low-pass filter on output, since everything above the Nyquist frequency is a noisy artefact.

[Figure: DSP system]

A. Sampling theorem: `Nyquist freq' = (sampling rate)/2

[Figure: aliasing; image unavailable: http://ccrma.stanford.edu/CCRMA/Courses/220a:1996/Lectures/2/Images/_03a_Aliasing.gif]
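
The sampling theorem can be checked numerically. The sketch below is only an illustration: the 10 kHz sampling rate, the 7 kHz and 3 kHz tones, and the helper name alias_frequency are assumptions chosen for the example, not values from these notes. It shows that a tone above the Nyquist frequency produces exactly the same samples as a lower-frequency tone, which is why the input must be low-pass filtered before sampling.

```python
import math

def alias_frequency(f_hz, sample_rate_hz):
    """Apparent frequency (between 0 and Nyquist) of a sinusoid at f_hz after sampling."""
    f = f_hz % sample_rate_hz            # sampling cannot tell f from f + k * sample_rate
    return min(f, sample_rate_hz - f)    # fold the result back below the Nyquist frequency

sample_rate = 10_000                     # Hz (illustrative value)
nyquist = sample_rate / 2                # 5000 Hz

# Sample a 7 kHz cosine (above Nyquist) and a 3 kHz cosine (below Nyquist).
n_samples = 20
tone_7k = [math.cos(2 * math.pi * 7000 * n / sample_rate) for n in range(n_samples)]
tone_3k = [math.cos(2 * math.pi * 3000 * n / sample_rate) for n in range(n_samples)]

# The two lists of samples agree to within rounding error: 7 kHz aliases to 3 kHz.
largest_difference = max(abs(a - b) for a, b in zip(tone_7k, tone_3k))
print(alias_frequency(7000, sample_rate), largest_difference)
```

Once sampled, nothing in the data distinguishes the two tones, so the aliased component cannot be removed afterwards.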
B. Quantization of amplitude (limited set of amplitude values).   When the sampled signal is converted back into real time, values are specified only at the sample points, so the output signal stays flat until the next sample comes along. These flat spots will be perceived by a listener as a high-frequency signal (above the Nyquist frequency), but it will be noise.  So the red curve below must be smoothed (low-pass filtered) into the green curve below in order to sound right.  If the sample rate is that of the commercial CD standard, this noise will be above the limits of human hearing, so your own ear can serve as the low-pass filter for the quantization noise.  Indeed, since loudspeakers do not normally produce sounds above 20 kHz, they too can serve as this low-pass output filter.

[Figure: quantization noise]
  • Quantization noise - the difference between the original and the quantized signal. The difference between the green signal curve and the red output curve is copied below in blue.
  • Output filters are needed (to smooth away quantization noise)
  • For the commercial CD audio standard (established in 1980), quantization is 16 bits, giving 2^16 = 65,536 amplitude values
  • Thus, 1 second of audio at CD quality requires 16 bits × 44.1k samples = 705,600 bits (for one channel). For speech research, one normally needs only about 5 kHz of bandwidth (a sampling rate of 10 kHz), which is about a quarter of the CD sampling rate.
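
The quantization figures above can be reproduced with a few lines of Python. This is a minimal sketch under stated assumptions: amplitudes are scaled to the range -1.0 to 1.0, the test signal is a 440 Hz sine, and the function name quantize is hypothetical. It rounds each sample to the nearest of 2^16 levels and measures the quantization noise as the difference between the quantized and original values.

```python
import math

def quantize(x, bits):
    """Round x (assumed to lie in -1.0..1.0) to the nearest of 2**bits evenly spaced levels."""
    levels = 2 ** bits
    step = 2.0 / levels                                    # spacing between adjacent levels
    code = round(x / step)                                 # nearest integer code
    code = max(-levels // 2, min(levels // 2 - 1, code))   # clip to the signed code range
    return code * step

sample_rate = 44_100      # CD sampling rate in Hz
bits = 16                 # CD quantization

# Illustrative test signal: 1000 samples of a 440 Hz sine at 90% of full scale.
signal = [0.9 * math.sin(2 * math.pi * 440 * n / sample_rate) for n in range(1000)]
quantized = [quantize(x, bits) for x in signal]
noise = [q - x for q, x in zip(quantized, signal)]         # quantization noise

levels = 2 ** bits                        # number of amplitude values
bits_per_second = bits * sample_rate      # storage for one second, one channel
max_error = max(abs(e) for e in noise)    # never more than half a quantization step
print(levels, bits_per_second, max_error)
```

Lowering bits to 8 in this sketch makes the step 256 times larger, and the noise grows by the same factor.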

II. Digital Filtering - numerical methods for filtering sampled signals.  Sample values at each point in time are identified with the integers, n.  The sampled amplitudes are identified here as X_n, and the a_n's will be the coefficients for modifying each amplitude.

[Figure: image unavailable: http://cara.gsu.edu/courses/MI_3110/filters1/fltimg3.jpg]

    A. Primitive low-pass digital filter - For example, a running average of 2 (or more) adjacent samples. Thus, take sample X_n and sample X_{n+1}, add their amplitudes, and divide by 2. This is equivalent to taking half the amplitude of sample X_n and adding half of X_{n+1}. (The mathematical term for such a `running average-like' calculation is convolution: apply the operation, move over one time step, apply it again. Repeat until you run out of samples.)  The number of coefficients used is the order of the filter, so the `running average of 2 adjacent samples' has order 2.
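
    The running average of 2 adjacent samples can be written directly from that description. The function name and the two test signals below are illustrative assumptions. A rapidly alternating input is flattened completely, while a slowly rising input passes through almost unchanged.

```python
def running_average_2(samples):
    """Average each sample with the next one: y[n] = 0.5 * x[n] + 0.5 * x[n+1]."""
    return [0.5 * samples[n] + 0.5 * samples[n + 1] for n in range(len(samples) - 1)]

jumpy = [0, 1, 0, 1, 0, 1]    # fastest possible alternation (high frequency)
ramp = [0, 2, 4, 6]           # slow, steady rise (low frequency)

smoothed_jumpy = running_average_2(jumpy)   # every output is 0.5
smoothed_ramp = running_average_2(ramp)     # midpoints of the ramp
print(smoothed_jumpy, smoothed_ramp)
```

    The output has one sample fewer than the input because the last sample has no successor to be averaged with.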

  • More generally, for a primitive smoothing (or low-pass) filter,
  • Y_0 = a_{-2} X_{-2} + a_{-1} X_{-1} + a_0 X_0 + ... + a_m X_m, where
    (1) X_0 is an input sample value, X_{-1} is the previous sample value, and Y_0 is an output sample value (the subscripts are time indices relative to the current sample);
    (2) the a's are filter coefficients that multiply the sample amplitudes, X. The coefficients generally sum to 1, although individual coefficients can sometimes be negative. If they sum to something greater than 1, the filter is also an amplifier; and
    (3) the number of coefficients is called the `order' of the filter (many texts instead define the order as one less than the number of coefficients).
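
    The general weighted sum can be sketched as a short convolution loop. The function name fir_filter, the first_offset parameter, and the choice to treat samples beyond the ends of the signal as zero are all assumptions made for this example. Feeding in a single nonzero sample (an impulse) returns the coefficients themselves, which is the filter's impulse response.

```python
def fir_filter(samples, coeffs, first_offset):
    """Weighted sum of neighbouring samples.

    y[n] = sum over k of coeffs[k] * x[n + first_offset + k].
    Samples outside the signal are treated as zero (an assumption of this sketch).
    """
    out = []
    for n in range(len(samples)):
        acc = 0.0
        for k, a in enumerate(coeffs):
            i = n + first_offset + k
            if 0 <= i < len(samples):
                acc += a * samples[i]
        out.append(acc)
    return out

# Five-point smoothing filter: a_{-2} .. a_{2} are all 0.2, so they sum to 1.
smoothing = [0.2, 0.2, 0.2, 0.2, 0.2]

impulse = [0, 0, 0, 1, 0, 0, 0]
impulse_response = fir_filter(impulse, smoothing, first_offset=-2)

# A constant signal keeps its level away from the edges, because the gain is 1.
steady = fir_filter([3.0] * 9, smoothing, first_offset=-2)
print(impulse_response, steady[4])
```

    Doubling every coefficient in this sketch doubles the output level, which is the amplifier case mentioned in (2).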
    [Figure: moving average demo; image unavailable: http://www.wavemetrics.com/Products/igorpro/dataanalysis/signalprocessing/smoothingpix/movingaveragedemo.png]
    [Figure: impulse response]
    B. Primitive high-pass filter
    Y_0 = a_0 X_0 - a_1 X_1
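
    A small sketch of this difference filter follows. The coefficient values a0 = a1 = 0.5 and the function name are assumptions chosen for illustration; the notes do not specify them. Subtracting adjacent samples cancels a steady level and keeps rapid changes, the opposite of the running average.

```python
def primitive_high_pass(samples, a0=0.5, a1=0.5):
    """Weighted difference of adjacent samples: y[n] = a0 * x[n] - a1 * x[n+1]."""
    return [a0 * samples[n] - a1 * samples[n + 1] for n in range(len(samples) - 1)]

steady = [4, 4, 4, 4]        # constant level (zero frequency)
jumpy = [1, -1, 1, -1]       # fastest possible alternation

filtered_steady = primitive_high_pass(steady)   # the constant level is removed
filtered_jumpy = primitive_high_pass(jumpy)     # the alternation passes at full size
print(filtered_steady, filtered_jumpy)
```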
    C. General properties

III. Fast Fourier Transform, a method for doing general Fourier analysis on sampled signals efficiently.  In order for the method to be applied, one must specify the number of sample points employed (that is, the time window for the spectrum) as a power of 2 (that is, 4, 8, 16, ..., 256, 512, 1024, etc.). Sampling at, e.g., 16 kHz means 1 ms = 16 samples, so a 256-point FFT looks at 16 ms and a 512-point FFT looks at 32 ms.  At the CD standard (44.1 kHz), 512 sample points is about 11.6 ms.
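
The relation between the number of FFT points, the sampling rate, and the length of the time window is simply window = points / rate, and it can be checked together with a small transform. The code below is a textbook recursive radix-2 FFT written for illustration, not the routine of any particular package; the 1000 Hz test tone and all names are assumptions. It also shows why the point count must be a power of 2 for this form of the algorithm: each step splits the samples into two equal halves.

```python
import cmath
import math

def fft(x):
    """Recursive radix-2 FFT; the number of points must be a power of 2."""
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("number of points must be a power of 2")
    if n == 1:
        return [complex(x[0])]
    even = fft(x[0::2])                  # transform of the even-numbered samples
    odd = fft(x[1::2])                   # transform of the odd-numbered samples
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out

def window_ms(n_points, sample_rate_hz):
    """Duration in milliseconds covered by n_points samples."""
    return 1000.0 * n_points / sample_rate_hz

sample_rate = 16_000
n_points = 256

# Illustrative input: a 1000 Hz sine, analysed over one 256-point window.
tone = [math.sin(2 * math.pi * 1000 * n / sample_rate) for n in range(n_points)]
magnitudes = [abs(c) for c in fft(tone)[: n_points // 2]]   # bins up to the Nyquist frequency
peak_bin = magnitudes.index(max(magnitudes))
peak_hz = peak_bin * sample_rate / n_points                 # each bin is sample_rate / n_points wide

print(window_ms(256, 16_000), window_ms(512, 16_000), window_ms(512, 44_100), peak_hz)
```

A longer window gives narrower frequency bins (sample_rate / n_points) at the cost of coarser time resolution.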

IV. Linear Predictive Coding (LPC). A method of speech coding that constructs a digital filter that models a short segment (called a frame) of a speech waveform using many fewer bits than the explicit waveform itself requires. The filter coefficients and other parameters are then used to resynthesize the original speech.

[Figure: LPC]
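
One common way to find such a filter is the autocorrelation method with the Levinson-Durbin recursion. The sketch below assumes that method; the notes do not say which procedure is intended. The frame here is not real speech but a synthetic decaying resonance generated by a known 2-coefficient recursion (an assumption made so the answer can be checked), and all function names are hypothetical. Each sample is predicted as a weighted sum of the previous samples, so a 400-sample frame is summarised by 2 coefficients plus an excitation.

```python
def autocorrelation(frame, max_lag):
    """r[lag] = sum of frame[n] * frame[n - lag] over the frame."""
    return [sum(frame[n] * frame[n - lag] for n in range(lag, len(frame)))
            for lag in range(max_lag + 1)]

def lpc(frame, order):
    """Levinson-Durbin recursion.

    Returns (coeffs, error), where the prediction of sample n is
    sum over k = 1..order of coeffs[k-1] * frame[n-k].
    """
    r = autocorrelation(frame, order)
    a = [0.0] * (order + 1)              # a[0] is unused
    err = r[0]
    for i in range(1, order + 1):
        k = (r[i] - sum(a[j] * r[i - j] for j in range(1, i))) / err
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        err *= 1.0 - k * k               # remaining prediction error energy
    return a[1:], err

def resynthesize(coeffs, excitation):
    """Drive the predictor filter with an excitation signal to rebuild a waveform."""
    out = []
    for n, e in enumerate(excitation):
        pred = sum(a * out[n - k] for k, a in enumerate(coeffs, start=1) if n - k >= 0)
        out.append(pred + e)
    return out

# Synthetic frame: x[n] = 1.3 * x[n-1] - 0.8 * x[n-2], started by a single impulse.
frame = [1.0, 1.3]
for n in range(2, 400):
    frame.append(1.3 * frame[n - 1] - 0.8 * frame[n - 2])

coeffs, error = lpc(frame, order=2)                      # close to [1.3, -0.8]
rebuilt = resynthesize(coeffs, [1.0] + [0.0] * 399)      # impulse excitation
worst = max(abs(a - b) for a, b in zip(rebuilt, frame))
print(coeffs, worst)
```

Real speech frames are not reproduced exactly this way; in this simplified setting the recovery works because the test frame was built from an all-pole recursion of the same order.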

    [Figure: image unavailable: http://svr-www.eng.cam.ac.uk/~ajr/SA95/img75.gif]