Dither and Noise Shaping

Frank Klemm's Dither and Noise Shaping Page

Dithering

The noise (original analog signal minus quantized signal) is normally uncorrelated in time and uncorrelated with the signal. This is because the value of the error (-0.5 <= error <= +0.5, in units of one quantization step) cannot be predicted from the previous error or from the signal itself. This means that the noise is constant and »white«: the power density is (nearly) constant and there are no peaks in it. This is very important, because the human ear can very easily detect:

conspicuous spikes in the spectrum
changing noise spectra, especially »moving« spikes

For especially low level signals the noise is not »white« and you can find »moving« spikes in the spectrum. Searching for such problematic signals shows that there are more general classes:

low level signals
low frequency signals (the steps from sample to sample are very small), even at medium and high levels!
signals with little noise (very tonal signals)
predictable signals

To demonstrate the problem I have chosen a special synthetic sound. It consists of frequencies between 20 and 200 Hz and I called it »Submarine«. At the end of this page there is a piece of music to show that this is also a problem in real music. First the signal in the time and in the frequency domain:

As you can see, the signal has a maximum level of around -1 dB, not something around -90 dB as you might have expected. The dips and spikes are not real; they are a graphical subsampling artifact of Cooledit.

This is the spectrum from 0 . . . 500 Hz. The full spectrum range (0 . . . 22.05 kHz) doesn't resolve the 5 spectral lines:

Now we quantize the signal down to 10 bit. This simulates the quantization from 22 bit down to a 16 bit DAC at levels below -36 dB, so everyone can test this with a normal cheap 16 bit sound card by setting the full scale SPL to something around 100 . . . 105 dB - 36 dB.
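The reduction to 10 bit can be sketched as follows. This is only a minimal illustration, assuming signed 16 bit integer samples and plain rounding to the nearest step; the function name `requantize` and its parameters are made up for this example, and clipping at full scale is not handled.

```python
import math

def requantize(sample, from_bits=16, to_bits=10):
    """Round a signed integer sample to a coarser grid, keeping the original scale."""
    step = 1 << (from_bits - to_bits)          # 64 for 16 bit -> 10 bit
    return math.floor(sample / step + 0.5) * step

# Dropping 6 bits raises the quantization step by a factor of 64.
level_shift_db = 20 * math.log10(1 << 6)

print(requantize(100), requantize(31), requantize(32))
print(f"{level_shift_db:.1f} dB")
```

The factor of 64 corresponds to roughly 36 dB, which is where the -36 dB figure in the text comes from.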

Urghhh, what's that??? This doesn't look very nice and it sounds even worse . . .
What happened? Let's look at the first second of this signal:

You can see that the noise signal is a triangle signal with rising frequency: as the amplitude rises, the frequency rises. Frequency components above half the sampling frequency are mirrored down and up, so you can hear a lot of tones. Yes, this is digital audio, not short wave radio ;-)
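Why the error frequency follows the signal can be seen with a simple ramp instead of the »Submarine« sound. The sketch below is an assumption-laden toy: it quantizes a linear ramp (slope given in quantization steps per sample) and counts how often the quantized value jumps to a new level. Each jump restarts the error waveform, so the count is the number of error cycles. The helper name `error_cycles` is hypothetical.

```python
import math

def error_cycles(slope, n_samples):
    """Count how often the quantized value of a ramp steps to a new level."""
    steps = 0
    previous = math.floor(0.5)                 # quantized value at n = 0
    for n in range(1, n_samples):
        level = math.floor(slope * n + 0.5)    # round to nearest step
        if level != previous:
            steps += 1
            previous = level
    return steps

slow = error_cycles(0.01, 1000)   # gentle slope
fast = error_cycles(0.02, 1000)   # slope twice as steep
print(slow, fast)
```

Doubling the slope doubles the number of error cycles in the same time span. A louder low-frequency tone has steeper slopes, so its error contains higher frequencies, which is consistent with the rising frequency described above.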

The first PCM ADCs and DACs had 14 bit, and due to their errors they effectively had less than 14 bit. So digital recordings from the late 70s and early 80s really sound awful. This is the reason for topics like "Bad Digital Noise", "Records sound better than CDs", "Consumers need more than 16 bit/44.1 kHz".

Let's add some noise before quantization:

As you can see, adding some noise before quantization removes all the ugly quantization effects by making the signal non-deterministic.

The disadvantage is that dithering always reduces the Signal-to-Noise-Ratio by about 3 . . . 4 dB.
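Both the benefit and the cost of dithering can be checked numerically. The sketch below assumes rectangular (uniform) dither one quantization step wide; the page does not say which dither distribution was used, so this is an illustration, not the author's method. All names are made up for the example.

```python
import math
import random

def quantize(x):
    """Round to the nearest integer step (1 step = 1 LSB)."""
    return math.floor(x + 0.5)

def error_power(signal, dither_width, rng):
    """Mean squared error (in LSB^2) after adding uniform dither and quantizing."""
    total = 0.0
    for x in signal:
        d = rng.uniform(-dither_width / 2, dither_width / 2)
        total += (quantize(x + d) - x) ** 2
    return total / len(signal)

rng = random.Random(1)  # fixed seed, only to make the run repeatable

# Cost: noise power with and without dither on a signal spanning many steps.
signal = [rng.uniform(-100.0, 100.0) for _ in range(200_000)]
plain = error_power(signal, 0.0, rng)
dithered = error_power(signal, 1.0, rng)
penalty_db = 10 * math.log10(dithered / plain)

# Benefit: a constant input of 0.3 LSB vanishes without dither (always 0),
# but survives as the average of the dithered output.
n = 100_000
undithered_mean = sum(quantize(0.3) for _ in range(n)) / n
dithered_mean = sum(quantize(0.3 + rng.uniform(-0.5, 0.5)) for _ in range(n)) / n

print(f"penalty: {penalty_db:.2f} dB, means: {undithered_mean} vs {dithered_mean:.3f}")
```

With this rectangular dither the noise power roughly doubles, i.e. a penalty close to 3 dB. A wider (for example triangular) dither costs somewhat more.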

Enhanced Dithering

Enhanced dithering computes the entropy of the signal and only adds noise (entropy) when the signal's entropy falls below a critical value.

The advantage is that this doesn't reduce the Signal-to-Noise ratio when it is not necessary. Multiple consecutive quantizations only add a dithering signal once, not multiple times.

Noise Shaping

The human ear has a different sensitivity for different frequencies (ATH: Absolute Threshold of Hearing). And it's possible to intentionally correlate the noise in time so it has a frequency spectrum which looks like the ATH:

[Noise shaping frequency response]
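The basic mechanism of correlating the noise in time is error feedback: the error made on one sample is subtracted from the next sample before quantizing. The sketch below uses the simplest possible first-order feedback, which only pushes noise towards high frequencies. It is not the ATH-shaped filter shown in the figure (its coefficients are not given here), and the block-average measurement is just a crude stand-in for a real spectrum analysis. All names are made up for the example.

```python
import math
import random

def quantize(x):
    """Round to the nearest integer step (1 step = 1 LSB)."""
    return math.floor(x + 0.5)

def shape_first_order(signal):
    """Quantize with first-order error feedback; returns integer output samples."""
    out = []
    err = 0.0                  # quantization error of the previous sample
    for x in signal:
        v = x - err            # subtract the previous error
        y = quantize(v)
        err = y - v            # error made on this sample
        out.append(y)
    return out

def power(values):
    return sum(v * v for v in values) / len(values)

def low_band_power(values, block=16):
    """Power after averaging over blocks, a crude low-pass measurement."""
    means = [sum(values[i:i + block]) / block
             for i in range(0, len(values) - block + 1, block)]
    return power(means)

rng = random.Random(2)  # fixed seed, only to make the run repeatable
signal = [rng.uniform(-100.0, 100.0) for _ in range(48_000)]

plain_err = [quantize(x) - x for x in signal]
shaped_err = [y - x for y, x in zip(shape_first_order(signal), signal)]

print("total power:", power(plain_err), power(shaped_err))
print("low band   :", low_band_power(plain_err), low_band_power(shaped_err))
```

The total error power goes up, while the low-frequency part of the error goes down. A real noise shaper uses a higher-order filter so that the minimum of the noise falls where the ear is most sensitive.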

Noise shaping always increases the absolute unweighted power of the noise, but reduces the audible weighted noise. The effect depends on the ATH in the range from 0 . . . fs/2; typical maximum values you can reach are:

Sample frequency	Audible SNR gain	Technical SNR gain
8 kHz	4 dB	-20 dB
12 kHz	3 dB	-20 dB
16 kHz	3 dB	-17 dB
22 kHz	4 dB	-10 dB
32 kHz	5 dB	-8 dB
44 kHz	15 dB	-29 dB
48 kHz	18 dB	-29 dB
56 kHz	23 dB	-27 dB
64 kHz	27 dB	-25 dB
72 kHz	30 dB	-23 dB
96 kHz	36 dB	-20 dB

You can see that below 15 kHz the noise is reduced, while above 15 kHz the noise is increased. In the range from 2 . . . 5 kHz the noise is minimal. Using 96 kHz makes this much more interesting, because you have the range from 22 . . . 48 kHz for additional noise, so you can reach an SNR of about 135 dB with 16 bit audio or 98 dB with 10 bit audio:
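The two SNR figures can be roughly reproduced from the 96 kHz row of the table. The sketch below assumes the common rule of thumb for a full-scale sine (6.02 dB per bit plus 1.76 dB); the page does not state which formula was used, so treat this as a plausibility check only.

```python
def sine_snr_db(bits):
    """Theoretical SNR of a full-scale sine after ideal quantization to `bits` bits."""
    return 6.02 * bits + 1.76

audible_gain_96k = 36  # dB, taken from the 96 kHz row of the table

for bits in (16, 10):
    total = sine_snr_db(bits) + audible_gain_96k
    print(f"{bits} bit: {total:.1f} dB")
```

Under this assumption the results land within about 1 dB of the figures quoted in the text.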

What does the signal look like?

Noise shaping and Dithering

Noise shaped signals also have the problem of noise-signal correlation, just like non-dithered quantization. But you can still combine noise shaping and dithering. Then you get the advantage of noise shaping together with the properties of dithering: a constant noise floor, although one that is 3 . . . 4 dB higher.
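A small experiment shows both halves of this statement. The sketch below feeds a constant input of 0.2 quantization steps into a first-order error-feedback quantizer, once without dither and once with rectangular dither one step wide. Both the simple first-order filter and the dither distribution are assumptions for illustration, and all names are made up for the example.

```python
import math
import random

def quantize(x):
    """Round to the nearest integer step (1 step = 1 LSB)."""
    return math.floor(x + 0.5)

def shape(signal, rng=None):
    """First-order error feedback; adds 1 LSB wide uniform dither if rng is given."""
    out = []
    err = 0.0
    for x in signal:
        v = x - err
        d = rng.uniform(-0.5, 0.5) if rng is not None else 0.0
        y = quantize(v + d)
        err = y - v            # dither is part of the error, so it is shaped too
        out.append(y)
    return out

constant = [0.2] * 1000
plain = shape(constant)                       # no dither
dithered = shape(constant, random.Random(3))  # fixed seed for repeatability

print(plain[:10])
print(dithered[:10])
print(sum(plain) / len(plain), sum(dithered) / len(dithered))
```

Without dither the output repeats the same short pattern forever, which is exactly the kind of signal-correlated noise that shows up as tones. With dither the pattern is broken up, while the average of the output still tracks the input.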

Listening Examples:

The first example is the very beginning of Máire Brennan's »Na Paisti« taken from the album »Perfect Time«.

The song begins with a deep tone that grows louder, which is a known source of ugly quantization noise.

You can download the quantized WAVE files (10 bit, 48 or 96 kHz) or the result converted back to 44.1 kHz and encoded with MPEGplus.

16 bit / 44.1 kHz, original file
(178 KB, MP+)
(1.15 MB, zipped WAV)
10 bit / 48 kHz, simple rounding
(207 KB, MP+)
(518 KB, zipped WAV)
10 bit / 48 kHz, only dithering
(205 KB, MP+)
(601 KB, zipped WAV)
10 bit / 48 kHz, only noise shaping
(233 KB, MP+)
(792 KB, zipped WAV)
10 bit / 48 kHz, dithering+noise shaping
(233 KB, MP+)
(895 KB, zipped WAV)
10 bit / 96 kHz, noise shaping
(185 KB, MP+)
(1.42 MB, zipped WAV)

Last modified: 2001-04-28