Frank Klemm's Dither and Noise Shaping Page
Dithering
The noise (original analog signal minus quantized signal) is normally
uncorrelated in time and uncorrelated with the signal. This is because the
value of the error (-0.5 <= error <= +0.5, in units of one quantization step)
cannot be predicted from the previous error or from the signal itself.
This means that the noise is steady and »white«: the power
density is (nearly) constant over frequency and there are no peaks in it. This is very
important, because the human ear can very easily detect
- conspicuous spikes in the spectrum
- changing noise spectra, especially »moving« spikes
For especially low level signals the noise is not »white« and
you can find »moving« spikes in the spectrum. Searching for such
problematic signals shows that they fall into
more general classes:
- low level signals
- low frequency signals (the steps from sample to sample are very small), even at medium and high levels!
- signals with little noise (very tonal signals)
- predictable signals
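The difference between a harmless and a problematic signal can be made visible by measuring how well one error sample predicts the next one. The following is only a minimal Python sketch, not the procedure used for this page. The helper names, the 44.1 kHz sample rate and the two test signals (a loud 1 kHz tone with some added noise, and a quiet 20 Hz tone that is only 10 quantization steps high) are assumptions chosen for illustration.

```python
import math
import random

FS = 44100  # assumed sample rate in Hz


def quantize(x):
    """Round a sample given in quantization steps to the nearest step."""
    return math.floor(x + 0.5)


def correlation(a, b):
    """Correlation coefficient of two equally long sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def lag1_error_correlation(signal):
    """Correlation between each quantization error and the following one."""
    error = [x - quantize(x) for x in signal]
    return correlation(error[:-1], error[1:])


rng = random.Random(1)
n_samples = 20000

# Loud and noisy: 1 kHz tone, 1000 steps high, plus noise of about 30 steps.
loud = [1000 * math.sin(2 * math.pi * 1000 * n / FS) + rng.gauss(0, 30)
        for n in range(n_samples)]

# Quiet, tonal and low in frequency: 20 Hz tone, only 10 steps high.
quiet = [10 * math.sin(2 * math.pi * 20 * n / FS) for n in range(n_samples)]

r_loud = lag1_error_correlation(loud)
r_quiet = lag1_error_correlation(quiet)
print(f"loud, noisy signal: {r_loud:+.3f}")
print(f"quiet 20 Hz tone:   {r_quiet:+.3f}")
```

A value near 0 means the error behaves like white noise. A value near 1 means the error is largely predictable from its predecessor, which is the situation the classes above have in common.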
To demonstrate the problem I have chosen a special synthetic sound. It
consists of frequencies between 20 and 200 Hz and I called it
»Submarine«. At the end of this page there is a piece of music
to show that this is a problem in real music, too.
First the signal in the time and in the frequency domain:
As you can see the signal has a maximum level around -1 dB, not
something around -90 dB as you might have expected. The dips and spikes
are not real; they are a graphical subsampling artifact of Cooledit.
This is the spectrum from 0 . . . 500 Hz. The full spectrum range
(0 . . . 22.05 kHz) doesn't resolve the 5 spectral lines:
Now we quantize the signal down to 10 bit. This simulates
the quantization from a 22 bit DAC down to a 16 bit DAC at levels below -36 dB,
so everyone can test this with a normal cheap 16 bit sound card by setting the
full scale SPL to something around 100 . . . 105 dB - 36 dB.
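Reducing the word length by simple rounding can be sketched as follows. This is a minimal Python illustration and not the tool used to produce the plots on this page. The function names `requantize` and `level_offset_db` are made up for this example, and the samples are assumed to be signed integers. The second function shows where the 36 dB come from: 6 bits are dropped, and each bit is worth about 6 dB.

```python
import math


def requantize(sample, from_bits=16, to_bits=10):
    """Round a signed integer sample onto the coarser to_bits grid."""
    step = 1 << (from_bits - to_bits)           # 64 for 16 -> 10 bit
    q = math.floor(sample / step + 0.5) * step  # nearest multiple of step
    lowest = -(1 << (from_bits - 1))
    highest = (1 << (from_bits - 1)) - step     # largest level on the coarse grid
    return max(lowest, min(highest, q))


def level_offset_db(from_bits, to_bits):
    """Level difference in dB that corresponds to the dropped bits."""
    return 20 * math.log10(2 ** (from_bits - to_bits))


for s in (31, 32, 100, -32768, 32767):
    print(f"{s:7d} -> {requantize(s):7d}")

print(f"16 -> 10 bit: {level_offset_db(16, 10):.1f} dB")
print(f"22 -> 16 bit: {level_offset_db(22, 16):.1f} dB")
```

Both conversions drop the same number of bits, so the 10 bit test signal on a 16 bit card behaves like a 16 bit signal on a 22 bit converter, only 36 dB louder.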
Urghhh, what's that??? This doesn't look very nice and it sounds even worse . . .
What happened? Let's look at the first second of this signal:
You can see the noise signal is a triangle signal with rising frequency:
as the amplitude rises, the frequency rises. Frequency components above half
the sampling frequency are mirrored down and up, so you can hear a lot of
tones. Yes, this is digital audio, not short wave radio ;-)
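Where a mirrored component ends up can be computed directly. The following Python sketch is an illustration only. The function name `folded_frequency` and the 3 kHz fundamental are hypothetical, and the example assumes an ideal triangle wave, which contains only odd harmonics.

```python
def folded_frequency(f, fs):
    """Frequency at which a component at f appears after sampling at fs."""
    f = f % fs                      # the sampled spectrum repeats every fs
    return fs - f if f > fs / 2 else f


fs = 44100
fundamental = 3000                  # Hz, hypothetical momentary frequency

for k in range(1, 16, 2):           # odd harmonics 1, 3, ..., 15
    f = k * fundamental
    print(f"{f:6d} Hz -> {folded_frequency(f, fs):6d} Hz")
```

With these values the harmonics at 27, 33 and 39 kHz land at 17.1, 11.1 and 5.1 kHz, and the one at 45 kHz lands at 900 Hz. When the fundamental rises, some of these mirrored tones move down while others move up.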
The first PCM ADCs and DACs had 14 bit, and due to their errors they
had effectively less than 14 bit. So digital recordings from the late 70's and
the early 80's really sound awful. This is the reason for topics like
"Bad Digital Noise", "Records sound better than CDs", "Consumers need more
than 16 bit/44.1 kHz".
Let's add some noise before quantization:
As you can see, adding some noise before quantization removes all
ugly quantization effects by making the signal non-deterministic.
The disadvantage is that dithering always reduces the Signal-to-Noise-Ratio by about 3 . . . 4 dB.
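The effect of dithering and its price can be reproduced with a few lines of Python. This is a sketch under assumptions, not the processing used for the plots: the dither here is uniform noise of one quantization step peak to peak (the page does not say which kind of noise was used), and the test signal is a hypothetical 20 Hz tone that is 10 steps high. Triangular dither is a common alternative and costs somewhat more noise.

```python
import math
import random

FS = 44100  # assumed sample rate in Hz


def quantize(x):
    """Round a sample given in quantization steps to the nearest step."""
    return math.floor(x + 0.5)


def correlation(a, b):
    """Correlation coefficient of two equally long sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def mean_power(seq):
    return sum(v * v for v in seq) / len(seq)


rng = random.Random(2)
signal = [10 * math.sin(2 * math.pi * 20 * n / FS) for n in range(20000)]

# Simple rounding.
plain_error = [x - quantize(x) for x in signal]

# Dithering: add uniform noise of one step before rounding.
dithered_error = [x - quantize(x + rng.uniform(-0.5, 0.5)) for x in signal]

r_plain = correlation(plain_error[:-1], plain_error[1:])
r_dithered = correlation(dithered_error[:-1], dithered_error[1:])
noise_increase_db = 10 * math.log10(
    mean_power(dithered_error) / mean_power(plain_error))

print(f"error predictability, simple rounding: {r_plain:+.3f}")
print(f"error predictability, dithering:       {r_dithered:+.3f}")
print(f"noise increase by dithering:           {noise_increase_db:.1f} dB")
```

The predictability of the error drops to roughly zero, and the total noise power rises by a few dB in exchange.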
Enhanced Dithering
Enhanced dithering computes the entropy of the signal and only adds noise (entropy) when
the signal's entropy falls below a critical value.
The advantage is that this doesn't reduce the Signal-to-Noise ratio when this is
not necessary. Multiple consecutive quantizations add a dithering signal only
once, not multiple times.
Noise Shaping
The human ear has a different sensitivity for different frequencies
(ATH: Absolute Threshold of Hearing). And
it's possible to intentionally correlate the noise in time so it has a
frequency spectrum which looks like the ATH:
Noise shaping always increases the absolute unweighted power of the
noise, but reduces the audible weighted noise. The effect depends on the ATH
in the range from 0 . . . fs/2. Typical maximum values you can reach
are:
Sample frequency | Audible SNR increase | Technical SNR increase |
8 kHz | 4 dB | -20 dB |
12 kHz | 3 dB | -20 dB |
16 kHz | 3 dB | -17 dB |
22 kHz | 4 dB | -10 dB |
32 kHz | 5 dB | -8 dB |
44 kHz | 15 dB | -29 dB |
48 kHz | 18 dB | -29 dB |
56 kHz | 23 dB | -27 dB |
64 kHz | 27 dB | -25 dB |
72 kHz | 30 dB | -23 dB |
96 kHz | 36 dB | -20 dB |
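The basic mechanism behind noise shaping is error feedback: the rounding error of one sample is subtracted from the next sample before it is rounded. The Python sketch below uses the simplest possible first-order feedback, which only pushes the noise towards high frequencies. It does not follow the ATH curve, and it is not the filter behind the values in the table, which presumably needs a higher order. The test signal and the block averaging used as a crude low-pass are assumptions made for this illustration.

```python
import math
import random


def quantize(x):
    """Round a sample given in quantization steps to the nearest step."""
    return math.floor(x + 0.5)


def plain_rounding(signal):
    return [quantize(x) for x in signal]


def noise_shaped(signal):
    """First-order error feedback quantizer."""
    out = []
    prev_error = 0.0
    for x in signal:
        v = x - prev_error          # input corrected by the previous error
        y = quantize(v)
        prev_error = y - v          # error made in this step
        out.append(y)
    return out


def mean_power(seq):
    return sum(v * v for v in seq) / len(seq)


def block_means(seq, size=32):
    """Averages over blocks: a crude low-pass that keeps low frequencies."""
    return [sum(seq[i:i + size]) / size
            for i in range(0, len(seq) - size + 1, size)]


rng = random.Random(3)
signal = [rng.gauss(0, 30) for _ in range(32000)]  # hypothetical noisy input

results = {}
for name, method in (("plain rounding", plain_rounding),
                     ("noise shaping", noise_shaped)):
    error = [y - x for x, y in zip(signal, method(signal))]
    results[name] = (mean_power(error), mean_power(block_means(error)))
    total, low = results[name]
    print(f"{name:15s} total noise power {total:.4f}, "
          f"low frequency part {low:.6f}")
```

The total (unweighted) noise power goes up, while the part that remains after low-pass filtering goes down. A filter matched to the ATH makes the same trade, only with the noise moved to where the ear is least sensitive.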
You can see that below 15 kHz the noise is reduced, while above 15 kHz the noise is
increased. In the range from 2 . . . 5 kHz the noise is minimal. Using 96 kHz
makes this much more interesting, because you have the range from 22 . . .
48 kHz for additional noise, so you can reach an SNR of about 135 dB with 16
bit audio or 98 dB with 10 bit audio:
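These two values can be checked with a short calculation. The sketch below assumes the textbook rule of 6.02 dB per bit plus 1.76 dB for a full-scale sine against white rounding noise; the page does not say which reference was actually used. The 36 dB are the audible SNR increase listed for 96 kHz in the table.

```python
def sine_snr_db(bits):
    """Textbook SNR of a full-scale sine against white rounding noise."""
    return 6.02 * bits + 1.76


gain_96k = 36  # dB, audible SNR increase listed for 96 kHz

for bits in (16, 10):
    base = sine_snr_db(bits)
    print(f"{bits} bit: {base:.1f} dB + {gain_96k} dB = {base + gain_96k:.1f} dB")
```

Under this assumption the results are roughly 134 dB for 16 bit and 98 dB for 10 bit, close to the values quoted above.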
What does the signal look like?
Noise shaping and Dithering
Noise shaped signals also have the problem of noise-signal correlation, like
non-dithered quantization. But you can still combine noise shaping and
dithering. Then you have the advantage of noise shaping together with the properties of
dithering: a constant noise, but increased by 3 . . . 4 dB.
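One way to combine the two can be sketched in Python as follows. This is an assumption about the arrangement, not a description of the software used here: the dither is added just before rounding inside a first-order error feedback loop, and the total error including the dither is fed back. The constant input of 0.2 quantization steps is a hypothetical worst case chosen to make the problem obvious.

```python
import math
import random


def quantize(x):
    """Round a sample given in quantization steps to the nearest step."""
    return math.floor(x + 0.5)


def noise_shaped(signal, rng=None):
    """First-order error feedback; adds uniform dither if rng is given."""
    out = []
    prev_error = 0.0
    for x in signal:
        v = x - prev_error
        dither = rng.uniform(-0.5, 0.5) if rng is not None else 0.0
        y = quantize(v + dither)
        prev_error = y - v          # total error, dither included, is fed back
        out.append(y)
    return out


def has_period(seq, period):
    """True if the sequence repeats exactly after `period` samples."""
    return all(a == b for a, b in zip(seq, seq[period:]))


signal = [0.2] * 1000               # constant level of 0.2 steps

plain = noise_shaped(signal)
dithered = noise_shaped(signal, random.Random(4))

print("without dither, repeats every 5 samples:", has_period(plain, 5))
print("with dither, repeats every 5 samples:   ", has_period(dithered, 5))
print("mean level without dither:", sum(plain) / len(plain))
print("mean level with dither:   ", sum(dithered) / len(dithered))
```

Without dither the output is a fixed pattern that depends on the signal level, which is heard as a tone. With dither the pattern is broken up, while the mean level is still preserved.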
Listening Examples:
The first example is the very beginning of Máire Brennan's »Na Paisti« taken from the album »Perfect Time«.
The song begins with a deep tone that grows louder, which is known as a source of ugly quantization noise.
You can download the quantized WAVE files (10 bit, 48 or 96 kHz) or the result converted back to 44.1 kHz and encoded with MPEGplus.
- 16 bit / 44.1 kHz, original file
(178 KB, MP+)
(1.15 MB, zipped WAV)
- 10 bit / 48 kHz, simple rounding
(207 KB, MP+)
(518 KB, zipped WAV)
- 10 bit / 48 kHz, only dithering
(205 KB, MP+)
(601 KB, zipped WAV)
- 10 bit / 48 kHz, only noise shaping
(233 KB, MP+)
(792 KB, zipped WAV)
- 10 bit / 48 kHz, dithering+noise shaping
(233 KB, MP+)
(895 KB, zipped WAV)
- 10 bit / 96 kHz, noise shaping
(185 KB, MP+)
(1.42 MB, zipped WAV)
Last modified: 2001-04-28