M-Pack 1: WAV file processing: MATLAB function reference
|
| Version 1.2 | Requires MATLAB 6.0 (R12) or later |
|
|
|
Performs sample rate and/or bit-depth conversion on the input
WAV file and saves the results in the output WAV file.
The computations are performed chunk-by-chunk to avoid
the need for large RAM allocations when converting
large files. The following sample rates commonly used
in digital audio are supported (Hz): 8000, 11025,
16000, 22050, 32000, 44100, 48000, 96000, 192000.
Supports both WAVE_FORMAT_PCM and WAVE_FORMAT_EXTENSIBLE multichannel
uncompressed formats and any bit resolution between
2 and 32 inclusive. As an option, dither may be applied
before re-quantisation, with a choice from a variety
of standard dithering methods including noise-shaping
with a user-specified custom FIR shaping filter. |
|
| File format: | mex-file (dll) |
| Editable source code: | no |
| Utilises non-editable functions: | no |
| Platform: | PC/Windows |
| Required MATLAB Toolboxes: | none (except core MATLAB) |
| Demo version limitations: | 30 second WAV file length limit (then silence) |
|
|
|
| wavresample(InputWavFile,OutputWavFile,OutputFs); |
| wavresample(InputWavFile,OutputWavFile,OutputFs,OutputFormat); |
| wavresample(InputWavFile,OutputWavFile,OutputFs,OutputFormat,Scaling,ReadChunkSize); |
| wavresample(InputWavFile,OutputWavFile,OutputFs,OutputFormat,Scaling,ReadChunkSize,WaitbarActivate); |
|
|
|
| Inputs: |
| InputWavFile |
Name of the input WAV file. The file must have one
of the following supported sample rates (in Hz):
8000, 11025, 16000, 22050, 32000, 44100, 48000, 96000,
192000.
|
| OutputWavFile |
name of the output WAV file. Cannot be the same as
the input filename if both reside in the same directory.
|
| OutputFs |
Sample rate of the output WAV file. Any of the following
supported sample rates may be used (in Hz): 8000,
11025, 16000, 22050, 32000, 44100, 48000, 96000, 192000.
Defaults to the sample rate of the input WAV file if not specified.
|
| OutputFormat |
Optional argument. Must be a structure with the fields
described as follows (see also wavout):
- Format field (i.e. OutputFormat.Format): should be a string
with value 'pcm' (for WAVE_FORMAT_PCM) or 'ext'
(for WAVE_FORMAT_EXTENSIBLE).
- ChannelMask field (i.e. OutputFormat.ChannelMask): should contain
the "ChannelMask" variable (in decimal format) which
prescribes the multichannel speaker allocation for
WAVE_FORMAT_EXTENSIBLE. (See chnmsk2spkrlist
to read more about the ChannelMask property.)
- ValidBits field (i.e. OutputFormat.ValidBits): as described
in Notes below.
- ContainerBits field (i.e. OutputFormat.ContainerBits): as described in
Notes below.
- DitherMethod field (i.e. OutputFormat.DitherMethod): numerical value
indicating which type of dithering method is applied
before quantisation. Possible values are:
  - 0: no dither [default]
  - 1: rectangular PDF dither, with a peak-to-peak amplitude of 1*LSB
  - 2: triangular PDF dither, with a peak-to-peak amplitude of 2*LSB
  - 3: triangular PDF with first-order high-pass noise shaping
  - 4: triangular PDF with custom FIR noise shaping filter
- DitherGain field (i.e. OutputFormat.DitherGain): gain applied to the
amplitude of the dither [default value of 1], i.e. rectangular
PDF dither has a peak-to-peak amplitude of LSB*DitherGain, and
triangular PDF dither has a peak-to-peak amplitude of 2*LSB*DitherGain.
- NoiseShapeGain field (i.e. OutputFormat.NoiseShapeGain): gain applied
to the feedback path in the case where noise
shaping is activated (i.e. for DitherMethod values
greater than 2) [default value of 1].
- NoiseShapeFIR field (i.e. OutputFormat.NoiseShapeFIR): vector of coefficients
for the custom noise shaping filter (only valid for
DitherMethod = 4). An arbitrary filter of any order
may be specified. The default value is the following
fifth-order filter: [2.033 -2.165 1.959 -1.590 0.6149],
taken from ref [2], p. 851 (note: designed for 44.1 kHz).
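The two basic dither types listed above (DitherMethod 1 and 2) can be illustrated in plain MATLAB. This is a minimal sketch, not the code inside the compiled mex function: it assumes the signal is expressed in units where one LSB equals 1, and it forms the triangular PDF dither as the sum of two independent rectangular PDF sources, which is a standard construction assumed here rather than one confirmed by this page. All variable names are illustrative.

```matlab
% Sketch of dither generation before re-quantisation (illustrative only;
% the compiled wavresample implementation is not available for inspection).
N          = 10000;          % number of samples
LSB        = 1;              % work in units of one least-significant bit
DitherGain = 1;              % same meaning as the DitherGain field

x = 20*sin(2*pi*(0:N-1)'/100);   % test signal spanning a few LSBs

% DitherMethod 1: rectangular PDF, peak-to-peak amplitude 1*LSB*DitherGain
dRect = DitherGain*LSB*(rand(N,1) - 0.5);

% DitherMethod 2: triangular PDF, peak-to-peak amplitude 2*LSB*DitherGain,
% formed as the sum of two independent rectangular PDF sources
dTri = DitherGain*LSB*((rand(N,1) - 0.5) + (rand(N,1) - 0.5));

% Add dither, then quantise to the nearest LSB
yRect = LSB*round((x + dRect)/LSB);
yTri  = LSB*round((x + dTri)/LSB);
```

The noise-shaped variants (DitherMethod 3 and 4) additionally feed the quantisation error back through a filter and are not sketched here.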
Notes: ValidBits refers to the number of bits used
in the quantisation of each sample, and ContainerBits
refers to the number of bits used to store each sample.
ValidBits can have any value from 2 up to and including
ContainerBits. Usually the value of ContainerBits
is the nearest integer multiple of 8 at or above the value
of ValidBits, but this does not need to be the case
(e.g. a 2-bit quantised signal can be stored in a
32-bit container, though this would be highly wasteful
in terms of disk space!)
If OutputFormat is omitted, the output format retains the
attributes of the input.
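As a small illustration of the ValidBits/ContainerBits relationship described in the Notes, the following sketch builds an OutputFormat structure using the documented field names. The numeric choices (20 valid bits, triangular dither) are only example values.

```matlab
% Build an OutputFormat structure for 20-bit samples (field names as
% documented above; the numeric choices are only an example).
OutputFormat = struct();
OutputFormat.ValidBits     = 20;
% Usual choice: nearest multiple of 8 at or above ValidBits
OutputFormat.ContainerBits = 8*ceil(OutputFormat.ValidBits/8);
OutputFormat.DitherMethod  = 2;   % triangular PDF dither

% Sanity checks mirroring the documented constraints
assert(OutputFormat.ValidBits >= 2);
assert(OutputFormat.ValidBits <= OutputFormat.ContainerBits);
assert(OutputFormat.ContainerBits <= 32);
```

Here 20 valid bits are stored in a 24-bit container, the usual pairing for that resolution.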
|
| Scaling |
Optional argument. Must be a structure with fields described
as follows:
- Input field (i.e. Scaling.Input): [default 'full']. Describes
the scaling law applied when reading the input WAV
file. Identical to the scale argument in wavin. See
wavin for the full range of options.
- Output field (i.e. Scaling.Output): [default 'full'].
Describes the scaling law applied when writing the
output WAV file. Identical to the scale argument
in wavout. See wavout for the full range of options.
- Gain field (i.e. Scaling.Gain): an overall gain applied
to the signal. Useful for changing the overall signal
strength from input to output.
  - If Scaling.Gain is scalar, all channels are scaled by the same amount.
  - If Scaling.Gain has the same (or greater) number of elements as the
  number of channels, each output channel is scaled
  by the corresponding gain value.
  - If Scaling.Gain is non-scalar but has fewer elements than the
  number of channels, the gain vector is padded (using its last value)
  to have as many elements as there are channels,
  then each output channel is scaled by the corresponding
  gain value.
NOTE: for mono signals, the gain is ineffective
when the output scaling is selected as 'norm' or
'normc', since the normalization (associated with
'norm', 'normc') eliminates the effect of the gain.
Likewise, for multichannel signals, the gain is
ineffective when the output scaling is selected
as 'normc', since the per-channel normalization eliminates
the effect of the gain (though the gains are generally
effective in a relative sense when output scaling
is set to 'norm').
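The gain-expansion rule described above can be sketched in plain MATLAB as follows. This is an illustration of the documented behaviour, not the actual wavresample source; in particular, discarding surplus gain elements when the vector is longer than the channel count is an assumption, and nChannels, Gain, and y are example values.

```matlab
% Sketch of the documented gain-expansion rule (illustrative; not the
% actual wavresample source). nChannels and Gain are example values.
nChannels = 4;
Gain      = [1 0.5];            % fewer elements than channels

g = Gain(:).';                   % force row vector
if numel(g) == 1
    g = repmat(g, 1, nChannels);                      % scalar: same gain everywhere
elseif numel(g) < nChannels
    g = [g, repmat(g(end), 1, nChannels - numel(g))]; % pad with last value
else
    g = g(1:nChannels);                               % assumed: extra elements unused
end

% Apply per-channel gains to a samples-by-channels matrix
y      = ones(3, nChannels);
scaled = y .* repmat(g, size(y,1), 1);
```

With the example values, the two-element gain vector is padded to [1 0.5 0.5 0.5] before being applied.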
|
| ReadChunkSize |
Optional argument [default value 65536]. Represents the length
(in samples per channel) of each successive chunk
to be processed. Can have an arbitrary value, but
if too small, computational efficiency will be compromised
since the file I/O operations are relatively slow.
If too large, memory requirements may be excessive.
(Note: the value is overridden if the file length
happens to be smaller!)
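To make the trade-off concrete, the sketch below shows how a file might be traversed chunk-by-chunk for a given ReadChunkSize. It is only an illustration of the bookkeeping, under the assumption that chunks are consecutive and non-overlapping; the actual mex function must also carry filter state between chunks, which is not shown. nSamples is an example value.

```matlab
% Sketch of chunk-by-chunk traversal (illustrative; not the mex source).
nSamples      = 200000;     % samples per channel in the file (example)
ReadChunkSize = 65536;      % documented default

chunkLen = min(ReadChunkSize, nSamples);   % overridden for short files
nChunks  = ceil(nSamples/chunkLen);

first = 1;
lens  = zeros(1, nChunks);
for k = 1:nChunks
    last    = min(first + chunkLen - 1, nSamples);
    lens(k) = last - first + 1;   % a real converter would read and filter here
    first   = last + 1;
end
```

A smaller ReadChunkSize increases nChunks, and with it the number of file reads and writes, which is why very small values slow the conversion.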
|
| WaitbarActivate |
Optional argument. If the argument is present (even if empty),
progress is displayed via a waitbar, which can
be customized via the following optional fields:
- Handle field (i.e. WaitbarActivate.Handle): the handle of
an externally-generated waitbar figure. If omitted,
a new locally-generated waitbar is created (then
deleted at the end of the function).
- Label field (i.e. WaitbarActivate.Label): string containing
the label of the waitbar.
If this input argument is omitted, no waitbar will be
used.
|
| Note:
any of the following input arguments: OutputFormat,
Scaling, ReadChunkSize, and WaitbarActivate may be substituted
by [ ] to force their respective default value(s).
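A usage sketch of the [ ] placeholder convention follows. The file names 'in.wav' and 'out.wav' are hypothetical placeholders, and the call is guarded so that it only runs when wavresample and the input file are actually available.

```matlab
% Usage sketch; 'in.wav' and 'out.wav' are placeholder file names.
Scaling      = struct();
Scaling.Gain = 0.5;            % halve the signal amplitude

% Positional arguments: OutputFormat and ReadChunkSize take their
% defaults via [], and the empty 7th argument activates the waitbar.
args = {'in.wav', 'out.wav', 48000, [], Scaling, [], []};

% Only call if M-Pack 1 is installed and the input file exists
if exist('wavresample', 'file') && exist('in.wav', 'file')
    wavresample(args{:});
end
```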
|
|
| |
|
|
|
Uses
hard-wired multi-stage FIR re-sampling (anti-aliasing
/ anti-imaging) filters written directly in C/C++
and implemented in MATLAB within a compiled mex function
(dll). This mex function calls the wavin
and wavout m-functions to
perform the data I/O.
All multi-stage filters have been designed to meet the
following low-pass composite specifications as applied
to the input-output data:
- passband attenuation: <=0.01 dB;
- stopband attenuation: >=120 dB
where
the composite low-pass edge frequency is chosen in
accordance with the given conversion ratio (i.e. at
half the output sample rate for down-samplers to avoid
"aliasing", and at half the input sample
rate for up-samplers to avoid "imaging").
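The edge-frequency rule above amounts to placing the composite low-pass edge at half the lower of the two sample rates. The sketch below computes that edge, together with the rational conversion ratio in lowest terms. Expressing the ratio as L/M is a common convention assumed here for illustration; this page does not state how the individual stages are factored.

```matlab
% Conversion ratio and composite low-pass edge (illustrative sketch).
FsIn  = 44100;               % input sample rate (Hz)
FsOut = 48000;               % output sample rate (Hz)

g = gcd(FsIn, FsOut);        % greatest common divisor of the two rates
L = FsOut/g;                 % overall upsampling factor
M = FsIn/g;                  % overall downsampling factor

% Edge at half the output rate when down-sampling (anti-aliasing),
% half the input rate when up-sampling (anti-imaging)
Fc = min(FsIn, FsOut)/2;
```

For 44.1 kHz to 48 kHz the ratio reduces to 160/147, with the edge at 22.05 kHz.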
For
convenience, any delays introduced by the filtering
are automatically removed.
For
efficiency, all filters in all stages have been implemented
in polyphase form.
See ref [1] for more information
on multi-stage sample rate conversion and polyphase
representation.
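The efficiency gain of the polyphase form can be seen in a small interpolation example. The sketch below is not the wavresample filter code: the interpolation factor and the simple linear-interpolation FIR are example choices. It shows that filtering the low-rate input with each polyphase branch gives the same output as zero-stuffing followed by filtering at the high rate, while avoiding multiplications by the inserted zeros.

```matlab
% Polyphase interpolation sketch (illustrative; not the wavresample source).
L = 4;                        % interpolation factor (example value)
h = [1 2 3 4 3 2 1]/4;        % simple linear-interpolation FIR (example)
x = randn(50, 1);             % input signal, column vector
N = length(x);

% Direct form: insert L-1 zeros between samples, then filter at the high rate
xu = zeros(L*N, 1);
xu(1:L:end) = x;
yDirect = filter(h, 1, xu);

% Polyphase form: each branch filters the low-rate input with every L-th tap
yPoly = zeros(L*N, 1);
for k = 1:L
    ek = h(k:L:end);                  % k-th polyphase component of h
    yPoly(k:L:end) = filter(ek, 1, x);
end

maxDiff = max(abs(yDirect - yPoly));  % should be at rounding-error level
```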
The output WAV file will generally differ in length from
the input WAV file, in accordance with the ratio of
sample rates (and bit resolutions, if changed).
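As a rough guide, the expected output length can be estimated from the sample-rate ratio and the output container size. The sketch below is an approximation under stated assumptions: the exact rounding used by wavresample at the end of the file is not documented here, and all numeric values are examples.

```matlab
% Estimate output length and audio data size (approximate sketch).
nIn           = 441000;   % input samples per channel (10 s at 44.1 kHz)
FsIn          = 44100;    % input sample rate (Hz)
FsOut         = 48000;    % output sample rate (Hz)
nChannels     = 2;        % number of channels (example)
ContainerBits = 24;       % output container size in bits (example)

% Samples per channel scale with the sample-rate ratio
nOutApprox = round(nIn*FsOut/FsIn);

% Audio data size in bytes, excluding the WAV header
dataBytes = nOutApprox*nChannels*ContainerBits/8;
```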
See
the M-Pack 1 overview
for a detailed discussion of the sample rate conversion,
WAV format, quantization, dithering, and noise-shaping
techniques used in this function.
Note
that wavresample
uses wavin
and wavout
for the WAV file I/O and thereby does not preserve
any peripheral header information (e.g. in the '.info'
field) beyond the basic '.fmt' (audio format information).
Ref [1]: Ronald E. Crochiere and Lawrence R. Rabiner,
"Multirate Digital Signal Processing", Prentice-Hall, 1983.
Ref [2]: Stanley P. Lipshitz, John Vanderkooy, and Robert A. Wannamaker,
"Minimally Audible Noise Shaping", J. Audio
Eng. Soc., Vol. 39, No. 11, November 1991.
|
|
|
|
| |
audioresample
|
resampling
of data in the MATLAB workspace (similar to the resample
function in the Signal Processing Toolbox from The MathWorks) |
|
|
|
| The
following examples are contained in the m-script file
entitled xmplwavresample.m |
| |
| Ex.1 |
Convert the sample rate of a 44.1 kHz WAV file ('wavewarp.wav',
human voice recording) to 48 kHz, writing the result
to mywav.wav
(in directory ..\WAVFiles),
keeping all other file attributes the same as the original: |
| |
| |
wavresample('wavewarp.wav','..\WAVfiles\mywav.wav',48000); |
| |
|
| |
Load
the WAV files and plot the original data versus the
resampled data (using the appropriately defined time
vectors). |
| |
|
| |
[y44,fs44]=wavin('..\WAVfiles\wavewarp.wav'); |
| |
[y48,fs48]=wavin('..\WAVfiles\mywav.wav'); |
| |
t44=0:1/44100:0.5; %time vector for 44.1kHz sample rate |
| |
t48=0:1/48000:0.5; %time vector for 48kHz sample rate |
| |
plot(t44,y44(1:length(t44)),t48,y48(1:length(t48)),'r'); |
| |
|
 |
| |
In
the plot, the resampled data is almost indistinguishable
from the original. Likewise, they sound identical, as
can be conveniently verified using the winplaywav
command to play the WAV files directly from disk: |
| |
winplaywav('..\WAVfiles\wavewarp.wav',2);
%'sync' playback |
| |
winplaywav('..\WAVfiles\mywav.wav'); |
| |
| Ex.2 |
Same
as Ex 1 but invoking the progress-tracking waitbar by
including the 7th argument (even if empty): |
| |
| |
wavresample('wavewarp.wav',...
|
| |
'..\WAVfiles\mywav.wav',48000,[],[],[],[]);
|
| |
| Ex.3 |
Now convert the original 44.1 kHz file to 8 kHz: |
| |
wavresample('wavewarp.wav','..\WAVfiles\mywav.wav',8000);
|
| |
|
| |
Plot
the results, this time zooming in to observe the detail
using a staircase plot to emphasize the difference in
sample rates: |
| |
[y8,fs8]=wavin('..\WAVfiles\mywav.wav'); %load the 8kHz result |
| |
t44=0:1/44100:0.01; %time vector for 44.1kHz sample rate |
| |
t8=0:1/8000:0.01; %time vector for 8kHz sample rate |
| |
stairs(t44,y44(1:length(t44))); |
| |
hold on; stairs(t8,y8(1:length(t8)),'r'); hold off; |
| |
|
 |
| |
In the plot, the reduced sample rate (i.e. 8 kHz versus
44.1 kHz) is clearly evident. Moreover, the 8 kHz data
sounds "muffled" due to the effects of the
anti-aliasing filter employed by wavresample, which
removes signal content above 4 kHz (half the output sample rate).
(This can be conveniently verified using the winplaywav
command to play the WAV files directly from disk, as
demonstrated in Ex 1.) |
| |
|
| Ex.4 |
Same
as in previous example but, in addition to the sample
rate reduction, perform a simultaneous re-quantization
from 16- to 8-bits with the application of simple rectangular
dither (with a peak-to-peak amplitude of 1*LSB): |
| |
| |
OutputFormat.ValidBits=8; |
| |
OutputFormat.DitherMethod=1;
|
| |
wavresample('wavewarp.wav',...
|
| |
'..\WAVfiles\mywav.wav',8000,OutputFormat);
|
| |
|
 |
| |
| Ex.5 |
Multichannel example. Convert the 44.1 kHz 4-channel WAV file
('..\WAVFiles\4channel.wav',
human voice recording) to 192 kHz, writing the result
to mywav.wav
(in directory ..\WAVfiles),
keeping all other file attributes the same as the original.
Also, for the purposes of illustration, override the
default read chunk size with a value of 8192, and invoke
the progress-tracking waitbar (by specifying the last
input argument as present though empty). The conversion
will take considerably longer due to the smaller read
chunk size: |
| |
| |
wavresample('..\WAVFiles\4channel.wav',... |
| |
'..\WAVfiles\mywav.wav',192000,[],[],8192,[]);
|
| |
| Ex.6 |
As a "roundtrip" test, convert the 44.1 kHz voice
track up to 192 kHz then back down to 44.1 kHz and compare
with the original. The plot below shows the results (zoomed-in).
As expected, the "roundtrip" resampled data
is virtually indistinguishable from the original. This
is due to the fact that the original signal is bandlimited
to 22.05 kHz (i.e. the Nyquist frequency corresponding to
the sample rate of 44.1 kHz), so the process of upsampling
to 192 kHz and back down again has no discernible effect. |
| |
| |
wavresample('wavewarp.wav','mywav.wav',192000); |
| |
wavresample('mywav.wav','mywav2.wav',44100); |
| |
|
 |
| |
|
