Sounds Logical

WaveWarp 2.0 Component


Spectral Transformers:

Tracking 3-Peak Detector

Functional Description
Determines the frequency and magnitude of the three dominant spectral peaks in the audio input (over a selected duration), and sends the information out via the control outputs. The spectrum is computed via the FFT, and the peaks are "tracked" in time by comparing successive spectral snapshots. This enables the actual location of the peaks to be determined, irrespective of the width of the FFT bins, thereby circumventing the limit on frequency resolution imposed by the finite length of the input data buffer. See also the Spectral 3-Peak Detector for the simpler implementation (which is subject to the resolution limitations of the FFT).
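To put a number on the resolution limit mentioned above, the following minimal sketch computes the spacing between FFT bins and the nearest bin centre for a given tone. The 44100 Hz sample rate, the 1024-point FFT and both function names are illustrative assumptions, not values taken from WaveWarp.

```python
def bin_width_hz(sample_rate_hz, fft_size):
    """Spacing between adjacent FFT bins, in Hz."""
    return sample_rate_hz / fft_size


def nearest_bin_frequency(freq_hz, sample_rate_hz, fft_size):
    """Centre frequency of the FFT bin closest to freq_hz."""
    width = bin_width_hz(sample_rate_hz, fft_size)
    return round(freq_hz / width) * width


if __name__ == "__main__":
    # Assumed example values: 44.1 kHz audio, 1024-point FFT, 440 Hz tone.
    print(bin_width_hz(44100, 1024))
    print(nearest_bin_frequency(440.0, 44100, 1024))
```

With these assumed values a plain bin search can only report the tone to the nearest bin centre, which is the error the tracking technique is meant to remove.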
Algorithm
Each successive input buffer is converted to the frequency domain using the windowed FFT with double overlapping. A simple search for the maximum magnitude value across all the FFT bins reveals the approximate location of the dominant peak (within the resolution of the FFT). Two further simple searches across the FFT bins on either side of the dominant peak reveal the approximate locations of the next two peaks. By comparing the approximate peak locations across successive FFT frames, the actual locations can be pinpointed, according to the "tracking phase vocoder" technique presented in [Moorer1] and [Moore] pp. 570-573.
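The steps above can be sketched roughly as follows. This is not the WaveWarp implementation: it uses one common phase-difference formulation of the phase vocoder frequency estimate, a naive DFT in place of a real FFT, a Hann window, and it reads "on either side" as one search below and one search above the dominant bin. All function names and the mask_bins argument (which anticipates the "Bin masking" setting in the parameter table) are hypothetical.

```python
import cmath
import math


def hann(n):
    """Hann window of length n (an assumed profile; the component offers a choice)."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / n) for i in range(n)]


def dft(frame):
    """Naive DFT of the non-negative-frequency bins (stand-in for a real FFT)."""
    n = len(frame)
    return [sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(frame))
            for k in range(n // 2)]


def three_peak_bins(magnitudes, mask_bins=0):
    """Dominant bin, then the largest bin below it and the largest bin above it.

    Bins within mask_bins of the dominant bin are skipped. A side with no
    bins left yields None.
    """
    def argmax(lo, hi):
        return max(range(lo, hi), key=magnitudes.__getitem__) if lo < hi else None

    dominant = argmax(0, len(magnitudes))
    below = argmax(0, dominant - mask_bins)
    above = argmax(dominant + mask_bins + 1, len(magnitudes))
    return dominant, below, above


def wrap_phase(p):
    """Wrap a phase value into the range [-pi, pi)."""
    return (p + math.pi) % (2 * math.pi) - math.pi


def refine(k, spec_a, spec_b, size, hop, sample_rate):
    """Estimate the frequency in bin k from its phase advance between two frames."""
    expected = 2 * math.pi * k * hop / size  # advance of a tone exactly at bin centre
    measured = cmath.phase(spec_b[k]) - cmath.phase(spec_a[k])
    deviation = wrap_phase(measured - expected)
    return (expected + deviation) * sample_rate / (2 * math.pi * hop)


def track_three_peaks(signal, size, sample_rate, mask_bins=2):
    """Return up to three (frequency_hz, magnitude) pairs from two overlapped frames."""
    hop = size // 2  # double overlapping
    win = hann(size)
    spec_a = dft([signal[i] * win[i] for i in range(size)])
    spec_b = dft([signal[hop + i] * win[i] for i in range(size)])
    mags = [abs(c) for c in spec_b]
    return [(refine(k, spec_a, spec_b, size, hop, sample_rate), mags[k])
            for k in three_peak_bins(mags, mask_bins) if k is not None]


if __name__ == "__main__":
    fs, size = 8000, 256  # assumed values; the bin width here is 31.25 Hz
    sig = [math.sin(2 * math.pi * 450 * n / fs)
           + 0.6 * math.sin(2 * math.pi * 1210 * n / fs)
           + 0.3 * math.sin(2 * math.pi * 200 * n / fs)
           for n in range(size + size // 2)]
    for freq, mag in track_three_peaks(sig, size, fs):
        print(round(freq, 1), round(mag, 1))
```

In this sketch the coarse search only knows the peaks to within one bin, while the phase comparison between the two overlapped frames recovers frequencies that lie between bin centres. The result lists the dominant peak first, followed by the lower-side and upper-side peaks, so the second and third entries are not sorted by magnitude.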

The behaviour of the tracking peak detection depends on the settings of the FFT, as summarised in the following table.

Parameter | Purpose
"Buffer length" slider | Adjusts the length of the input data buffer, which also defines the "Latency" (overall delay) of the process. The FFT buffer size is computed from the "Buffer length" rounded up to the nearest power of 2 (for efficient FFT computation). If "double padding" is selected, the FFT buffer size is doubled (after the rounding) to improve the smoothness of the spectrum between successive FFT bins (but without increasing the underlying frequency resolution). The input data buffer is windowed (using a selected profile), then extended to the length of the FFT buffer by padding with zeros (on either side). Successive input data buffers are overlapped by a factor of 2. The appropriate choice of "Buffer length" is a trade-off between frequency resolution (improved with longer buffers) and temporal resolution (degraded with longer buffers). For a given application, the most suitable choice usually depends on the characteristics of the audio signal. The buffer length should be long enough that the dominant peaks are located within separate FFT bins. Thereafter, the peak-tracking technique will narrow down their location inside the bins, so the buffer length does not need to be further increased.
"Input gain" slider | Adjusts the overall amplitude of the input signal, before computing the FFT.
"Bin masking" slider | Determines the "masking" zone (expressed in terms of the number of FFT bins) around each detected peak. The search for the next dominant peak will exclude all FFT bins within the masking zone. This reduces the possibility of false detection caused by "spectral smearing" in the vicinity of a peak (arising from the fact that the peak may not lie within a single bin, for a given FFT resolution).
"Window type" selection | Selects the profile of the windowing function applied to the input data.

For an introduction to the Discrete Fourier Transform and the FFT, see, for example, [St] sections 4.1 and 4.2. For further introductory information (with emphasis on audio applications), and for discussions on spectral measurements, zero-padding, windowing, and the Short Time Fourier Transform (STFT) for audio applications, see [Roa] pp. 1084-1112 and [Moore] pp. 61-111.
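The buffer handling described for the "Buffer length" setting (rounding up to a power of 2, optional double padding, windowing, zero-padding on either side, and overlap by a factor of 2) can be sketched as below. The function names, the Hann profile and the even split of the padding between the two sides are assumptions made for illustration; the source does not specify them.

```python
import math


def fft_size(buffer_length, double_padding=False):
    """Round buffer_length up to a power of 2, then double it if requested."""
    size = 1
    while size < buffer_length:
        size *= 2
    return size * 2 if double_padding else size


def window_and_pad(buffer, double_padding=False):
    """Apply a Hann window (assumed profile), then zero-pad both sides to the FFT size."""
    n = len(buffer)
    windowed = [x * (0.5 - 0.5 * math.cos(2 * math.pi * i / n))
                for i, x in enumerate(buffer)]
    pad = fft_size(n, double_padding) - n
    left = pad // 2  # assumption: padding is split as evenly as possible
    return [0.0] * left + windowed + [0.0] * (pad - left)


def frame_starts(total_samples, buffer_length):
    """Start indices of successive buffers overlapped by a factor of 2."""
    hop = buffer_length // 2
    return list(range(0, total_samples - buffer_length + 1, hop))


if __name__ == "__main__":
    print(fft_size(1000), fft_size(1000, double_padding=True))
    print(len(window_and_pad([1.0] * 1000)))
    print(frame_starts(4000, 1000))
```

Double padding in this sketch changes only the number of zeros appended, so it interpolates the spectrum more finely without adding information, which matches the description that the underlying frequency resolution is unchanged.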

Signal Implementations
Audio signals | Control signals | Description
Single input mono | 6 outputs | The first and second control outputs contain the frequency (in Hz) and the magnitude, respectively, of the dominant spectral peak of the mono input. The third and fourth control outputs contain the frequency (in Hz) and the magnitude, respectively, of the second (or third) dominant peak. The fifth and sixth control outputs contain the frequency (in Hz) and the magnitude, respectively, of the third (or second) dominant peak. Note that the second and third dominant peaks are not necessarily ordered in terms of the control output sequence (this reduces the computational burden).
Single input stereo | 6 outputs | The first and second control outputs contain the frequency (in Hz) and the magnitude, respectively, of the dominant spectral peak of the average of the stereo input channels. The third and fourth control outputs contain the frequency (in Hz) and the magnitude, respectively, of the second (or third) dominant peak. The fifth and sixth control outputs contain the frequency (in Hz) and the magnitude, respectively, of the third (or second) dominant peak. Note that the second and third dominant peaks are not necessarily ordered in terms of the control output sequence (this reduces the computational burden).
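As an illustration of the output layout described above, the sketch below averages a stereo pair to mono, flattens three peaks into six control values, and shows how a downstream consumer could impose a strict magnitude order on the second and third peaks if it needs one. The function names and the list-based representation of the control outputs are hypothetical and are not part of WaveWarp.

```python
def stereo_to_mono(left, right):
    """Average the two stereo channels sample by sample."""
    return [(l + r) / 2.0 for l, r in zip(left, right)]


def control_outputs(peaks):
    """Flatten three (frequency_hz, magnitude) pairs into six control values."""
    (f1, m1), (f2, m2), (f3, m3) = peaks
    return [f1, m1, f2, m2, f3, m3]


def ordered_by_magnitude(outputs):
    """Regroup six control values into pairs, sorting the last two by magnitude.

    The first pair is always the dominant peak, so only the remaining two
    pairs need reordering.
    """
    pairs = [(outputs[i], outputs[i + 1]) for i in range(0, 6, 2)]
    return [pairs[0]] + sorted(pairs[1:], key=lambda p: p[1], reverse=True)


if __name__ == "__main__":
    outs = control_outputs([(440.0, 0.9), (220.0, 0.2), (880.0, 0.5)])
    print(outs)
    print(ordered_by_magnitude(outs))
```

Sorting on the consumer side keeps the detector itself cheap, which is consistent with the stated reason for leaving the second and third peaks unordered.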
Related components:
Example DrawingBoards illustrating usage:

© 2007 Sounds Logical. All rights reserved.