Whenever the bit depth of the quantization is reduced (e.g. when re-saving a 24-bit WAV file as a 16-bit WAV file), it is generally recommended to apply dither to reduce the audible nonlinear amplitude distortion caused by the re-quantization. In essence, applying dither amounts to adding low-level random noise to the signal. When the noise amplitude is chosen to be comparable to the quantization step size, the dither linearizes the input-output characteristic of the quantizer, thereby increasing the effective resolution, albeit at the cost of added audible noise. The perceived additional noise can be minimized by incorporating an appropriately designed noise-shaping filter within the dithering process.
5.3.1--- Basic dither relations
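The input-output relationship for the quantizer can be expressed as follows, where x is the input, y is the output and Q is the quantization step size:
$$ y = Q \left( \left\lfloor \frac{x}{Q} \right\rfloor + \frac{1}{2} \right) $$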
(This is a combination of the floating-point-to-integer-to-floating-point conversions presented earlier.) The following figure illustrates the mapping for an 8-bit quantizer (the fact that the midpoint of the mapping is vertical rather than horizontal gives it the name "midriser"):

As is evident from the plot, the floor function renders this relationship nonlinear, leading to audible distortion, particularly for signals whose level is low enough to be comparable with the quantizer step size, Q. The purpose of the dither is to reduce the effects of this nonlinearity. Note that although the quantizer performs a deterministic nonlinear operation on the input, a convenient simplification when assessing its noise properties is to assume that, to first order, it simply adds a random noise to the input, x, to give the output, y:
$$ y = x + q $$
where the output error, q, is considered to be uniformly distributed over the range [-Q/2 : +Q/2] and thereby has a mean-square value of:
$$ \overline{q^2} = \frac{1}{Q} \int_{-Q/2}^{+Q/2} q^2 \, dq = \frac{Q^2}{12} $$
5.3.2--- Additive dither
The simplest dithering scheme, as sketched below, is to add a random noise, d, to the signal before it enters the quantizer. To be effective, the dither must be statistically independent of the signal.
The most basic dither is typically generated from a pseudo-random sequence, uniformly distributed over the range [-Q/2 : +Q/2], i.e. with a peak-to-peak amplitude equal to Q (or to 1 LSB). This type of dither is usually called "rectangular" owing to its uniform probability density function (pdf).
To first order, the output error of the rectangular-dithered quantizer can be considered as a stochastic noise with a mean-square value of:
$$ \frac{Q^2}{12} + \frac{Q^2}{12} = \frac{Q^2}{6} $$
due to the combination of the intrinsic quantizer error plus the dither.
ReSample includes the option to apply this basic additive rectangular dither.
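The first-order noise figures quoted above can be checked numerically. The following is a minimal sketch, not the ReSample implementation: it assumes a midriser quantizer built from the floor function, a 16-bit step size over the -1:+1 range, and a random test input. The function names and parameters are hypothetical, chosen only for illustration.

```python
import math
import random

def quantize_midriser(x, q):
    """Assumed midriser quantizer: snap x to the centre of its step of width q."""
    return q * (math.floor(x / q) + 0.5)

def mean_square_error(dither_fn, q, n=200_000, seed=1):
    """Estimate the mean-square output error (y - x) over a random test input."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        x = rng.uniform(-0.5, 0.5)              # test input, well inside -1:+1
        y = quantize_midriser(x + dither_fn(rng, q), q)
        total += (y - x) ** 2
    return total / n

Q = 2.0 / 2 ** 16                               # assumed 16-bit step over -1:+1

# No dither: the error should be close to Q^2/12.
no_dither = mean_square_error(lambda rng, q: 0.0, Q)
# Rectangular dither, 1 LSB peak-to-peak: the error should be close to Q^2/6.
rect_dither = mean_square_error(lambda rng, q: rng.uniform(-q / 2, q / 2), Q)

print(no_dither / (Q * Q / 12))                 # ratio near 1
print(rect_dither / (Q * Q / 6))                # ratio near 1
```

Both printed ratios should come out close to 1, which illustrates that the rectangular dither doubles the noise power relative to the undithered quantizer.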
5.3.3--- Triangular "highpass" additive dither
Another commonly used dither signal has a triangular pdf and a peak-to-peak amplitude of 2 LSB. Its use is motivated by the fact that it is theoretically optimal (it is the lowest-level dither that renders both the mean and the variance of the total error independent of the input signal) and, moreover, is simple to generate in the digital domain by summing (or differencing) two rectangular dither signals (each with a peak-to-peak amplitude of 1 LSB). In fact, the preferred method in audio applications is to create the triangular dither sequence by differencing successive values of a rectangular dither sequence. This automatically highpass-filters the dither signal, which, depending on the sample rate, can reduce the perceived additive noise without affecting the underlying action of the dither on the quantizer.
To first order, the output error of the triangular-dithered quantizer can be considered as a stochastic noise with a mean-square value of:
$$ \frac{Q^2}{12} + \frac{Q^2}{6} = \frac{Q^2}{4} $$
due to the combination of the intrinsic quantizer error plus the dither.
ReSample includes the option to apply this highpass triangular additive dither.
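The differencing construction described above can be sketched as follows. This is an illustrative sketch under stated assumptions (midriser quantizer, 16-bit step size over the -1:+1 range, random test input), not the ReSample code, and the helper names are hypothetical. It checks the dither power, the negative correlation between adjacent dither samples that gives the highpass character, and the total output error.

```python
import math
import random

def highpass_tpdf(n, q, seed=1):
    """Triangular dither made by differencing successive rectangular values."""
    rng = random.Random(seed)
    prev = rng.uniform(-q / 2, q / 2)
    out = []
    for _ in range(n):
        cur = rng.uniform(-q / 2, q / 2)
        out.append(cur - prev)                  # 2 LSB peak-to-peak, triangular pdf
        prev = cur
    return out

def quantize_midriser(x, q):
    """Assumed midriser quantizer with step size q."""
    return q * (math.floor(x / q) + 0.5)

Q = 2.0 / 2 ** 16                               # assumed 16-bit step over -1:+1
d = highpass_tpdf(200_000, Q)

power = sum(v * v for v in d) / len(d)          # dither power, about Q^2/6
lag1 = sum(a * b for a, b in zip(d, d[1:])) / (len(d) - 1)

rng = random.Random(2)
err = 0.0
for dk in d:
    x = rng.uniform(-0.5, 0.5)                  # random test input
    err += (quantize_midriser(x + dk, Q) - x) ** 2
err /= len(d)

print(power / (Q * Q / 6))                      # ratio near 1
print(lag1 / power)                             # near -0.5 (adjacent samples anti-correlated)
print(err / (Q * Q / 4))                        # ratio near 1
```

The lag-one correlation of about -0.5 reflects the first-difference (highpass) shaping of the dither, while the total error power remains at roughly Q^2/4.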
5.3.4--- Dither with noise-shaping
The perceived noise due to the dither can be reduced by employing an error-feedback loop around the quantizer, with an appropriately designed "noise-shaping" filter in the feedback path. The key to the technique is to take psychoacoustical advantage of the human hearing curve by designing the noise-shaping filter to be, in effect, the "inverse" of this curve, thereby "moving" the noise into less audible regions of the spectrum.
The general structure of the noise-shaping quantizer is sketched below where H(z) represents the noise-shaping filter (adapted from ref [6] ):

Note that the feedback path incorporates a single unit delay, irrespective of the filter design. This is to eliminate the possibility of an algebraic loop (which would render the network non-computable).
Assuming triangular (2 LSB) dither, the output error of the quantizer with noise-shaping feedback can, to first order, be considered as a stochastic noise with a mean-square value of (adapted from ref [6] ):
$$ \frac{Q^2}{4} \cdot \frac{2}{f_s} \int_0^{f_s/2} \left| 1 - z^{-1} H(z) \right|^2 \, df $$
where the quantity
$$ \left| 1 - z^{-1} H(z) \right|^2 , \qquad z = e^{j 2 \pi f / f_s} $$
represents the frequency-dependent noise gain factor due to the feedback noise-shaping filter. The other symbols are defined as follows:
Symbol | Definition
f | frequency (in hertz)
f_s | sample rate (in hertz)
z | the complex variable
Again, the total residual noise is due to the combination of the intrinsic quantizer error plus the dither, but this time scaled by the feedback filter response. Therefore the total noise depends on the design of the filter.
Recalling that the central purpose of the noise-shaping filter is to reduce the perceived noise, the quantity of primary interest is the weighted noise power, which is computed by taking into account the human hearing threshold curve, denoted W(f), i.e.:
$$ \frac{Q^2}{4} \cdot \frac{2}{f_s} \int_0^{f_s/2} W(f) \left| 1 - z^{-1} H(z) \right|^2 \, df $$
The goal of the design of the noise-shaping filter is to find a filter H(z) which minimizes the above integral. Note that, generally speaking, any practical filter that lowers the weighted noise will tend to increase the unweighted noise, so a tradeoff is usually required when choosing the filter.
Reference [6] presents a range of filters, both FIR (non-recursive) and IIR (recursive), designed to minimize the weighted noise (based on their modified E-weighting curves to represent the human audibility function).
The FIR (non-recursive) filters from ref [6] have been implemented here (since these were found to yield flatter noise spectra than the IIR filters). Specifically, ReSample includes the feedback noise-shaping algorithm for two classes of filter design:
1. Simple delay (i.e. H(z)=1).
2. Arbitrary FIR filter. Any FIR filter may be imported. The default filter is the following five-coefficient FIR filter (specified in ref [6] ): [2.033 -2.165 1.959 -1.590 0.6149]
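To illustrate the tradeoff between weighted and unweighted noise, the sketch below evaluates a noise gain factor for the two filter classes listed above. It assumes that the gain takes the form |1 - z^-1 H(z)|^2 with z = exp(j*2*pi*f/f_s), which follows from a feedback path containing a single unit delay ahead of H(z); the 44.1 kHz sample rate, the probe frequencies and the function name are arbitrary choices made for illustration.

```python
import cmath
import math

H_DEFAULT = [2.033, -2.165, 1.959, -1.590, 0.6149]   # five-coefficient FIR from the text

def noise_gain(h, f, fs):
    """Assumed noise gain |1 - z^-1 H(z)|^2 at z = exp(j*2*pi*f/fs)."""
    w = 2 * math.pi * f / fs
    # H(z) = h[0] + h[1] z^-1 + ..., so z^-1 H(z) involves delays 1, 2, ...
    fb = sum(c * cmath.exp(-1j * w * (n + 1)) for n, c in enumerate(h))
    return abs(1 - fb) ** 2

fs = 44100.0                                         # assumed sample rate
for f in (1000.0, 4000.0, 12000.0, 20000.0):
    gain_db = 10 * math.log10(noise_gain(H_DEFAULT, f, fs))
    print(f, round(gain_db, 1))                      # gain in dB at each frequency

# Average gain over 0..fs/2 (midpoint rule). For an FIR filter this equals
# 1 + sum of squared coefficients, i.e. the unweighted noise power increase.
n = 4096
avg = sum(noise_gain(H_DEFAULT, (k + 0.5) * fs / (2 * n), fs) for k in range(n)) / n
print(avg, 1 + sum(c * c for c in H_DEFAULT))

# Simple delay, H(z) = 1: gain is 0 at DC and 4 at the Nyquist frequency.
print(noise_gain([1.0], 0.0, fs), noise_gain([1.0], fs / 2, fs))
```

Under these assumptions the five-coefficient filter raises the unweighted noise power by a factor of roughly 16.6 (about 12 dB), which is the price paid for moving the noise away from the frequencies where hearing is most sensitive.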
For the sake of completeness, the core of the ReSample FIR noise-shaper implementation can be summarized as follows:
1. Initialize the feedback variable.
2. For the current time step, k, do the following:
   a. Add dither to the current sample. The dither consists of the additive noise term minus the noise-shaping filtered feedback term. Note: evaluation of the feedback term involves the computation of a digital filter acting on the historical "error" stream (e).
   b. Convert to integer (and write to WAV file).
   c. Re-scale to a floating-point number within -1:+1.
   d. Use this to generate the feedback term for the next time step. Note: this uses the result of the filter computation from above (there is no need to compute it again).
3. Advance to the next time step.
Note that in all cases where dither is invoked, ReSample incorporates post-dither clipping to ensure that the -1:+1 floating-point range is not inadvertently exceeded due to the addition of the dither.    
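The noise-shaping steps and the post-dither clipping can be sketched in code as follows. This is a hedged illustration, not the actual ReSample source: the function name, the midriser integer conversion, the way clipping is applied, and the definition of the fed-back error (quantized output minus the shaped sample before dither) are all assumptions. For simplicity, the triangular dither here is the sum of two independent rectangular values rather than the highpass variant.

```python
import math
import random

H_DEFAULT = (2.033, -2.165, 1.959, -1.590, 0.6149)   # default FIR from the text

def noise_shaped_quantize(x, bits=16, h=H_DEFAULT, seed=1):
    """Illustrative error-feedback requantizer with triangular dither."""
    rng = random.Random(seed)
    q = 2.0 / 2 ** bits                      # step size over the -1:+1 range
    e_hist = [0.0] * len(h)                  # feedback history, most recent first
    codes, errors = [], []
    for xk in x:
        # Triangular dither, 2 LSB peak-to-peak (sum of two rectangular values).
        dither = rng.uniform(-q / 2, q / 2) + rng.uniform(-q / 2, q / 2)
        fb = sum(c * e for c, e in zip(h, e_hist))   # filtered feedback term
        v = xk - fb                          # sample minus shaped feedback
        u = max(-1.0, min(v + dither, 1.0 - q / 2))  # assumed post-dither clipping
        code = math.floor(u / q)             # convert to integer
        y = (code + 0.5) * q                 # re-scale to floating point
        e = y - v                            # error fed back at the next step
        e_hist = [e] + e_hist[:-1]
        codes.append(code)
        errors.append(y - xk)                # total output error, for inspection
    return codes, errors

fs = 44100                                   # assumed sample rate
x = [0.5 * math.sin(2 * math.pi * 1000 * k / fs) for k in range(100_000)]
codes, errors = noise_shaped_quantize(x)

Q = 2.0 / 2 ** 16
ratio = sum(e * e for e in errors) / len(errors) / (Q * Q / 4)
# With white dither this ratio should approach 1 + sum of squared coefficients.
print(ratio, 1 + sum(c * c for c in H_DEFAULT))
```

The filter output fb is computed once per step and reused when forming the error, mirroring the note in the summary above. The unweighted error power rises well above Q^2/4, as expected for a filter designed to minimize the weighted rather than the unweighted noise.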