Clipping of CDs and lossy coded audio

Frank Klemm's Clipping Page


"Several golden ears in the pro audio industry tend to believe that the best
 sound in pop/rock music generally was produced between 1982 and 1995.
 Despite higher resolution in converters and DSP, lower jitter and probably a
 better overall understanding of digital media, we seem to be on a declining
 rather than inclining sound quality slope these years; even though people
 buying records and film may not be aware of it.
 Obviously there could be many reasons for this we cannot directly
 influence: Trends, basic recording and microphone placement skills, more
 semi-pro equipment being used, shorter production times and therefore less
 attention to detail etc. But if the public do not care, why should we?
 Because pride in our industry, craftsmanship and conservation of talent
 tell us to be concerned. And because more bits, more resolution and more
 channels can only be justified by the end quality and listener involvement
 going up."

                          --- Søren H. Nielsen and Thomas Lund (2000) ---


Analog FM radio achieved its promise in the 1970s, when stations in many
communities offered a wide range of excellent-sounding broadcasts. The
ratings race that followed, with its emphasis on making each station the
loudest one on the dial, turned the FM band into a sea of homogenous noise.

Contents:

        Why audio is clipped and how can I hear clipping?
        How can I detect clipping?
        Examples and explanation
        Tables

Why audio is clipped and how can I hear clipping?

In the past, the reason for limiting and clipping was to reduce the background noise of magnetic tapes and vinyl disks. At the end of the 1970s these media had a signal-to-noise ratio (SNR) between 50 dB and 60 dB. This SNR must be shared by 3 items:

Headroom
Dynamic range of the music
Footroom

Studio recordings use a headroom of about 20 dB to reliably avoid distortion. You can't repeat a recording of a famous concert.

To avoid hearing noise in silent passages you need a footroom of at least 10 dB for analog recordings (the first analog-to-digital converters needed about 23 dB).

If your recording device has only 50 dB SNR and you need 20 dB headroom and 10 dB footroom, you have only 20 dB of usable dynamic left. This is far too little! For good playback you need at least 40...45 dB of usable dynamic.

(Note that in home audio the full A-weighted SNR is reported as "dynamic", while in professional audio SNR minus headroom minus footroom is reported as "usable dynamic", with the SNR measured using the IEC-601 filter. The difference between the two figures is around 40 dB!)

To reduce the noise, the headroom was cut to values between 10 and 14 dB. The audio engineer's task was to reduce the peak levels in a way that was not obviously audible. It was still audible, but that was much better than 6 . . . 10 dB more noise.

This changed with the Compact Disc. With high-quality ADCs and DACs you can reach an SNR of 98 dB (much more than a high-end vinyl record player). Using some tricks you can achieve SNRs of up to 113 dB. This is enough for 20 dB headroom, 60 dB usable dynamic and 20 dB footroom. So there is no need for limiting and clipping.
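The SNR budget used in this section is plain subtraction, which the following minimal sketch writes out. The function name `usable_dynamic_db` is made up for illustration; the numbers are the ones quoted in the text (the 10 dB headroom case is the lower end of the "10 to 14 dB" range mentioned above).

```python
def usable_dynamic_db(snr_db, headroom_db, footroom_db):
    """Usable dynamic left after reserving headroom and footroom (all in dB)."""
    return snr_db - headroom_db - footroom_db

# Analog medium from the text: 50 dB SNR, 20 dB headroom, 10 dB footroom.
tape = usable_dynamic_db(50, 20, 10)

# Same medium with the headroom cut to 10 dB to gain usable dynamic.
tape_reduced_headroom = usable_dynamic_db(50, 10, 10)

# CD with 113 dB SNR, 20 dB headroom and 20 dB footroom.
cd = usable_dynamic_db(113, 20, 20)

print(tape, tape_reduced_headroom, cd)  # 20 30 73
```

With 73 dB left over, the CD case comfortably exceeds the 60 dB of usable dynamic asked for above, without touching the headroom.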

What methods of clipping are possible?

Hard clipping: Values above the upper limit or below the lower limit are set to that limit
Soft clipping: An S-shaped transfer characteristic
Dynamic limiting: A linear transfer characteristic that is changed dynamically
Combinations of the three types above

All methods reduce the dynamic of the music and, in addition, cause:

Hard clipping: high-order distortion (k3, k5, k7, k9, k11, ...) in the overdriven parts
Soft clipping: low-order distortion (k3) at higher levels, already far below the maximum level
Dynamic limiting: less distortion than the others, but pumping and signal modulation

In the past, people discussed the differences between k2 and k3 (second- and third-order harmonic distortion). With hard clipping we introduce high-order distortion that was previously known only from PLL FM receivers (TV).
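The difference between hard and soft clipping can be illustrated with a small sketch that clips one period of an overdriven sine and measures a few harmonics. This is only an illustration under assumptions: the page does not say which S-shaped curve is meant, so `tanh` is used as a stand-in, and the function names, the limit of 1.0 and the overdrive factor of 1.5 are chosen freely.

```python
import math

def hard_clip(x, limit=1.0):
    """Set every value beyond +/- limit to the limit itself."""
    return max(-limit, min(limit, x))

def soft_clip(x, limit=1.0):
    """S-shaped characteristic; tanh is an assumed example curve."""
    return limit * math.tanh(x / limit)

def harmonic_amplitude(signal, k):
    """Amplitude of harmonic k, for a signal holding exactly one fundamental period."""
    n = len(signal)
    re = sum(s * math.cos(2 * math.pi * k * i / n) for i, s in enumerate(signal))
    im = sum(s * math.sin(2 * math.pi * k * i / n) for i, s in enumerate(signal))
    return 2.0 * math.hypot(re, im) / n

N = 1024
# One period of a sine that overdrives the limit of 1.0 by a factor of 1.5.
sine = [1.5 * math.sin(2 * math.pi * i / N) for i in range(N)]
hard = [hard_clip(s) for s in sine]
soft = [soft_clip(s) for s in sine]

print(" k   hard clip   soft clip")
for k in (2, 3, 5, 7, 9, 11):
    print(f"{k:2d}   {harmonic_amplitude(hard, k):9.5f}   {harmonic_amplitude(soft, k):9.5f}")
```

Both curves are symmetric around zero, so the even harmonics (k2) should vanish and only odd ones remain. For the soft curve the harmonics should fall off quickly with rising order, while the hard-clipped signal keeps noticeably more energy in the high orders. Note that the soft curve also changes samples well below the limit, which matches the remark about distortion far below the maximum level.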

Media	SNR	headroom	usable dynamic	footroom	distortion
Low quality analog tape (LH + Ghettoblaster)	48 dB	8 dB	30 dB	10 dB	5%
High quality analog tape (metal + HQ tape deck)	68 dB	16 dB	42 dB	10 dB	3%
High quality analog tape (metal + HQ tape deck + Dolby S)	88 dB	16 dB	60 dB	12 dB	1.5%
AM radio (local station)	40 dB	6 dB	24 dB	10 dB	5%
FM radio (local station)	68 dB	16 dB	42 dB	12 dB	0.8%
vinyl disk	65 dB	16 dB	39 dB	12 dB	1%
CD-DA (early 14 bit recordings)	82 dB	16 dB	43 dB	23 dB	<0.1%
CD-DA (current 16 bit recordings without dithering)	98 dB	20 dB	55 dB	23 dB	<0.01%
CD-DA (current 16 bit recordings with dithering)	95 dB	20 dB	65 dB	10 dB	<0.01%
CD-DA (high bit recordings with advanced transfer)	113 dB	20 dB	83 dB	10 dB	<0.01%

You can always reduce the headroom: this immediately increases the usable dynamic, but also the distortion.

How can I detect clipping?

Most people search for runs of several consecutive samples at the maximum possible level. This is a very simple method that cannot find all kinds of clipping. There are more sophisticated statistical methods. The easiest is a test for normal distribution.
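The simple method mentioned first can be sketched in a few lines. This is a hypothetical helper, not a tool from this page: the name `count_full_scale_runs`, the 16-bit full-scale value and the minimum run length of 3 are assumptions, and the sample list is made up.

```python
def count_full_scale_runs(samples, full_scale=32767, min_run=3):
    """Count runs of at least min_run consecutive samples at +/- full scale."""
    runs = 0
    run_length = 0
    for s in samples:
        if abs(s) >= full_scale:
            run_length += 1
        else:
            if run_length >= min_run:
                runs += 1
            run_length = 0
    # A run may extend to the very last sample.
    if run_length >= min_run:
        runs += 1
    return runs

samples = [0, 12000, 32767, 32767, 32767, 32767, 20000,
           -32768, -32768, -32768, 0, 32767, 100]
print(count_full_scale_runs(samples))  # 2

# The same kind of flat top at a lower level goes unnoticed.
print(count_full_scale_runs([0, 27500, 27500, 27500, 27500, 0]))  # 0
```

The second call shows the weakness: clipping that does not reach full scale, such as the example at about +/-27500 discussed further down, is not found unless the threshold is lowered by hand.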

You plot a line into a diagram:

y is the level
x is the probability that the level is below y
The y-axis is divided into equidistant intervals

The x-axis is scaled by a special function, the inverse Gaussian (normal) distribution function, and is labeled in units called sigma

deviation	probability	deviation	probability
0.0 sigma	50.000000%	-0.0 sigma	50.000000%
0.5 sigma	69.146246%	-0.5 sigma	30.853754%
1.0 sigma	84.134475%	-1.0 sigma	15.865525%
1.5 sigma	93.319280%	-1.5 sigma	6.680720%
2.0 sigma	97.724987%	-2.0 sigma	2.275013%
2.5 sigma	99.379033%	-2.5 sigma	0.620967%
3.0 sigma	99.865010%	-3.0 sigma	0.134990%
3.5 sigma	99.976737%	-3.5 sigma	0.023263%
4.0 sigma	99.996833%	-4.0 sigma	0.003167%
4.5 sigma	99.999660%	-4.5 sigma	0.000340%
5.0 sigma	99.999971%	-5.0 sigma	0.000029%

A normally distributed signal generates a straight line in this diagram. This is independent of the frequency content, because temporal dependencies between samples are not used.
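The construction of such a diagram can be sketched with the Python standard library. This is an assumed reading of the description above, not the program used for the plots on this page: the function names, the `(i + 0.5) / n` estimate of the probability, the noise level of 1000 and the clip level of 2000 are all choices made for the example.

```python
import random
from statistics import NormalDist

def probability_plot_points(samples):
    """Return (sigma, level) pairs: x in sigma units, y the sample level."""
    ordered = sorted(samples)
    n = len(ordered)
    std_normal = NormalDist()  # mean 0, standard deviation 1
    points = []
    for i, level in enumerate(ordered):
        # Empirical probability that the level is below this sample;
        # the +0.5 offset keeps it strictly between 0 and 1.
        p = (i + 0.5) / n
        points.append((std_normal.inv_cdf(p), level))
    return points

def fitted_slope(points):
    """Least-squares slope of level versus sigma for a line through the origin."""
    return sum(s * y for s, y in points) / sum(s * s for s, _ in points)

random.seed(1)
noise = [random.gauss(0.0, 1000.0) for _ in range(20000)]
clipped = [max(-2000.0, min(2000.0, x)) for x in noise]

noise_points = probability_plot_points(noise)
clipped_points = probability_plot_points(clipped)

print("slope for noise (level per sigma):", round(fitted_slope(noise_points)))
print("largest level, noise:  ", round(noise_points[-1][1]))
print("largest level, clipped:", round(clipped_points[-1][1]))
```

For the unclipped noise the points should lie close to a straight line whose slope is the RMS level, here about 1000 per sigma. For the clipped copy the line runs flat at +/-2000 beyond roughly 2 sigma. `NormalDist().cdf(x)` gives the probabilities listed in the table above, for example about 84.13% for 1.0 sigma.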

Test:

Okay. Seems to work very well.

Examples and explanation

First some plots.
On the y-axis you see the amplitude (scaled for 16 bit), on the x-axis the probability scaled in sigma.
What is a sigma?
If you have noise, "1 sigma" is exactly the RMS level of this noise. The theory says that noise should generate a straight line in this diagram. The theory seems to work for noise . . . You see some stairs at the ends, especially for the short recordings.

Now let's look at a demonstration CD by Philips/Du Pont Optical from 1985:

The difference is that the dynamic of this recording (noise has no dynamic) bends the curves in the middle. They are distorted in the middle, a little like an S. The outer ranges are still linear, with the small stairs at the ends.

Now some music with less dynamic: Deep Purple, an old recording from the 1980s:

The S shape is much less pronounced. Now we come to the mathematical part. Extend the lines to the x-axis and determine where they intersect it. For the pink line this is -5.5 and +5.3, for the brown line -5.5 and +5.2.

At the end there is a table for estimating the number of clipped samples. These intersections correspond to 35...60 clippings per hour. The slight saturation has nothing to do with soft clipping. This is an old recording, and you are probably seeing the onset of saturation in the analog audio tape. If you think 35...60 clippings per hour is bad, see the next diagrams . . .

Soft clipping/limiting looks different. See this recording of a band known for their ear-damaging concerts:

"Another Ring of Fire" and "One way out" are soft-clipped.
Let's estimate the clippings of "Little Beggarman". The intersections are -3.8 and 3.6, which gives about 20 clippings per second. The number of falsified samples in the soft-clipped songs is in the same region. 20 clippings per second is really terrible. Can it get worse?

We see hard clipping at +/-2.6 sigma. That is 800 . . . 900 clippings per second. These are clippings at 2%. And we see another nasty effect:
Clipping below full scale (FS) at +/-2.5 sigma, at levels of about +/-27500. Even though this is not full scale, it amounts to around 1100 clippings per second. 2.5%.

Can it get even worse? The only examples I found are historic recordings of Walter Ulbricht:

These heavy distortions (and also the distorted frequency response) give historic recordings their typical sound. Maybe in 2020 we will have the same quality: distorted in the same way, but with only bass and treble and no mids . . .

Tables

sigma	probability of clipping [ppm]	every n-th sample is clipped	clippings
0.0	1000000	1.00	88200 / sec
0.1	920344	1.09	81174 / sec
0.2	841480	1.19	74219 / sec
0.3	764177	1.31	67400 / sec
0.4	689156	1.45	60784 / sec
0.5	617075	1.62	54426 / sec
0.6	548506	1.82	48378 / sec
0.7	483927	2.07	42682 / sec
0.8	423710	2.36	37371 / sec
0.9	368120	2.72	32468 / sec
1.0	317310	3.15	27987 / sec
1.1	271332	3.69	23931 / sec
1.2	230139	4.35	20298 / sec
1.3	193600	5.17	17076 / sec
1.4	161513	6.19	14245 / sec
1.5	133614	7.48	11785 / sec
1.6	109598	9.12	9667 / sec
1.7	89130	11.22	7861 / sec
1.8	71860	13.92	6338 / sec
1.9	57433	17.41	5066 / sec
2.0	45500	21.98	4013 / sec
2.1	35728	27.99	3151 / sec
2.2	27806	35.96	2453 / sec
2.3	21448	46.62	1892 / sec
2.4	16395	60.99	1446 / sec
2.5	12419	80.52	1095 / sec
2.6	9322	107.27	822 / sec
2.7	6933	144.22	612 / sec
2.8	5110	195.68	451 / sec
2.9	3731	267.98	329 / sec
3.0	2699	370.40	238 / sec
3.1	1935	516.74	171 / sec
3.2	1374	727.66	121 / sec
3.3	966	1034	85.276 / sec
3.4	673	1484	59.434 / sec
3.5	465	2149	41.036 / sec
3.6	318	3143	28.067 / sec
3.7	215	4638	19.016 / sec
3.8	144	6911	12.762 / sec
3.9	96.193	10396	8.484 / sec
4.0	63.342	15787	5.587 / sec
4.1	41.315	24204	3.644 / sec
4.2	26.691	37465	2.354 / sec
4.3	17.080	58549	1.506 / sec
4.4	10.825	92378	57.286 / min
4.5	6.795	147160	35.961 / min
4.6	4.225	236691	22.358 / min
4.7	2.602	384377	13.768 / min
4.8	1.587	630256	8.397 / min
4.9	0.958	1043442	5.072 / min
5.0	0.573	1744278	3.034 / min
5.1	0.340	2944177	1.797 / min
5.2	0.199	5017850	1.055 / min
5.3	0.116	8635379	36.770 / hour
5.4	0.067	15.0e6	21.160 / hour
5.5	0.038	26.3e6	12.059 / hour
5.6	0.021	46.7e6	6.806 / hour
5.7	0.012	83.5e6	3.804 / hour
5.8	0.007	150.8e6	2.106 / hour
5.9	0.004	275.1e6	1.154 / hour
6.0	0.002	506.8e6	0.627 / hour
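The table values can be recomputed with a short sketch. It assumes, as the 88200 / sec entry for sigma 0.0 suggests, that both channels of 44.1 kHz stereo are counted and that a sample counts as clipped when it lies beyond either +sigma or -sigma of a normal distribution. The function names are made up for illustration.

```python
import math

SAMPLES_PER_SECOND = 2 * 44100  # assumption: stereo CD audio, both channels counted

def clipping_probability(sigma):
    """Probability that a normally distributed sample lies beyond +/- sigma."""
    return math.erfc(sigma / math.sqrt(2.0))

def table_row(sigma):
    """Return (sigma, probability in ppm, every n-th sample, clippings per second)."""
    p = clipping_probability(sigma)
    return (sigma, p * 1e6, 1.0 / p, p * SAMPLES_PER_SECOND)

for sigma in (1.0, 2.6, 3.8, 5.3):
    s, ppm, nth, per_sec = table_row(sigma)
    print(f"{s:.1f}  {ppm:12.3f} ppm  every {nth:12.2f}th sample  "
          f"{per_sec:12.5f} / sec  {per_sec * 3600:12.3f} / hour")
```

For 1.0 sigma this should reproduce about 317310 ppm and 27987 clippings per second, and for 5.3 sigma about 36.8 clippings per hour, in line with the rows above.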

Elektor Article (German)
Last modified: 2001-11-28