"Several golden ears in the pro audio industry tend to believe that the best sound in pop/rock music generally was produced between 1982 and 1995. Despite higher resolution in converters and DSP, lower jitter and probably a better overall understanding of digital media, we seem to be on a declining rather than inclining sound quality slope these years; even though people buying records and film may not be aware of it. Obviously there could be many reasons for this we cannot directly influence: Trends, basic recording and microphone placement skills, more semi-pro equipment being used, shorter production times and therefore less attention to detail etc. But if the public do not care, why should we? Because pride in our industry, craftsmanship and conservation of talent tell us to be concerned. And because more bits, more resolution and more channels can only be justified by the end quality and listener involvement going up." --- Søren H. Nielsen and Thomas Lund (2000) ---
Analog FM radio achieved its promise in the 1970s, when stations in many communities offered a wide range of excellent-sounding broadcasts. The ratings race that followed, with its emphasis on making each station the loudest one on the dial, turned the FM band into a sea of homogenous noise.
Contents:
Why audio is clipped and how can I hear clipping?
How can I detect clipping?
Examples and explanation
Tables
Studio recordings use a headroom of about 20 dB to reliably avoid distortion, because you can't repeat the recording of a famous concert.
To avoid hearing noise in quiet passages you need a footroom of at least 10 dB for analog recordings (the first analog-to-digital converters needed about 23 dB).
If your recording device has only 50 dB SNR and you need 20 dB headroom and 10 dB footroom, only 20 dB of usable dynamic range remain. This is far too little! Good playback needs at least 40...45 dB of usable dynamic range.
(Note that in home audio the full A-weighted SNR is reported as "dynamic", while in professional audio SNR - headroom - footroom is reported as "usable dynamic" and the SNR is measured with the IEC-601 filter. The difference is around 40 dB!)
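The budget described above is plain subtraction, and a minimal sketch makes the arithmetic explicit. The function name `usable_dynamic` is only an illustrative choice, not something from a particular tool.

```python
def usable_dynamic(snr_db, headroom_db, footroom_db):
    """Usable dynamic range in dB: SNR minus headroom minus footroom."""
    return snr_db - headroom_db - footroom_db


# The example from the text: 50 dB SNR, 20 dB headroom, 10 dB footroom.
print(usable_dynamic(50, 20, 10), "dB usable")
```

With these numbers only 20 dB remain, well below the 40...45 dB the text asks for.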
To reduce the noise, the headroom was reduced to values between 10 and 14 dB. The audio engineer had the task of reducing the peak levels in a way that is not obviously audible. It is still audible, but that was much better than 6...10 dB more noise.
This changes with the Compact Disc. With high-quality ADCs and DACs you can reach 98 dB SNR (much more than a high-end vinyl record player). Using some tricks you can achieve SNRs of up to 113 dB. This is enough for 20 dB headroom, 60 dB usable dynamic range and 20 dB footroom, so there is no need for limiting and clipping.
What methods of clipping are possible?
All methods reduce the dynamic range of the track and . . .
Media | SNR | headroom | usable dynamic | footroom | distortion |
Low quality analog tape (LH + Ghettoblaster) | 48 dB | 8 dB | 30 dB | 10 dB | 5% |
High quality analog tape (metal + HQ tape deck) | 68 dB | 16 dB | 42 dB | 10 dB | 3% |
High quality analog tape (metal + HQ tape deck + Dolby S) | 88 dB | 16 dB | 60 dB | 12 dB | 1.5% |
AM radio (local station) | 40 dB | 6 dB | 24 dB | 10 dB | 5% |
FM radio (local station) | 68 dB | 16 dB | 42 dB | 12 dB | 0.8% |
vinyl disk | 65 dB | 16 dB | 39 dB | 12 dB | 1% |
CD-DA (early 14 bit recordings) | 82 dB | 16 dB | 43 dB | 23 dB | <0.1% |
CD-DA (current 16 bit recordings without dithering) | 98 dB | 20 dB | 55 dB | 23 dB | <0.01% |
CD-DA (current 16 bit recordings with dithering) | 95 dB | 20 dB | 65 dB | 10 dB | <0.01% |
CD-DA (high bit recordings with advanced transfer) | 113 dB | 20 dB | 83 dB | 10 dB | <0.01% |
Most people search for runs of several consecutive samples at the maximum possible level. This is a very simple method that cannot find all kinds of clipping. There are more sophisticated statistical methods; the easiest is a normal distribution test.
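For comparison, the simple approach mentioned above can be sketched in a few lines. This is only an illustration, assuming 16-bit integer samples and a hypothetical helper name `count_clipped_runs`; it counts runs of at least `min_run` consecutive samples stuck at positive or negative full scale, and by construction it misses soft clipping and clipping below full scale.

```python
def count_clipped_runs(samples, full_scale=32767, min_run=2):
    """Count runs of >= min_run consecutive samples at +/- full scale."""
    runs = 0
    run_length = 0
    run_sign = 0
    for s in samples:
        # +1 at positive full scale, -1 at negative full scale, else 0.
        sign = (s >= full_scale) - (s <= -full_scale)
        if sign != 0 and sign == run_sign:
            run_length += 1
        else:
            if run_length >= min_run:
                runs += 1
            run_sign = sign
            run_length = 1 if sign != 0 else 0
    if run_length >= min_run:
        runs += 1
    return runs


example = [0, 32767, 32767, 32767, 100, -32768, -32768, 5, 32767, 0]
print(count_clipped_runs(example))
```

A single sample at full scale is not counted, because an isolated peak can legitimately touch the maximum level.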
You plot a line into a diagram:
deviation | probability | deviation | probability |
0.0 sigma | 50.000000% | -0.0 sigma | 50.000000% |
0.5 sigma | 69.146246% | -0.5 sigma | 30.853754% |
1.0 sigma | 84.134475% | -1.0 sigma | 15.865525% |
1.5 sigma | 93.319280% | -1.5 sigma | 6.680720% |
2.0 sigma | 97.724987% | -2.0 sigma | 2.275013% |
2.5 sigma | 99.379033% | -2.5 sigma | 0.620967% |
3.0 sigma | 99.865010% | -3.0 sigma | 0.134990% |
3.5 sigma | 99.976737% | -3.5 sigma | 0.023263% |
4.0 sigma | 99.996833% | -4.0 sigma | 0.003167% |
4.5 sigma | 99.999660% | -4.5 sigma | 0.000340% |
5.0 sigma | 99.999971% | -5.0 sigma | 0.000029% |
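The values in this table are the cumulative probabilities of the standard normal distribution. As a sketch, they can be recomputed with the error function from Python's standard library; the helper name `normal_cdf` is just an illustrative choice.

```python
import math


def normal_cdf(sigma):
    """Cumulative probability of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(sigma / math.sqrt(2.0)))


for k in range(11):
    sigma = 0.5 * k
    print(f"{sigma:3.1f} sigma | {100 * normal_cdf(sigma):.6f}% | "
          f"{-sigma:4.1f} sigma | {100 * normal_cdf(-sigma):.6f}% |")
```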
Test:
Okay. Seems to work very well.
First some plots.
On the y-axis you see the amplitude (scaled for 16 bit), on the x-axis the
probability scaled in sigma.
What is a sigma?
For noise, "1 sigma" is exactly the RMS level of that noise. Theory
says that noise should produce a straight line in this diagram:
Theory seems to work for noise . . . You see some stairs at the ends,
especially for the short recordings.
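The straight-line behaviour of noise can be checked numerically. The following is a minimal sketch, not the tool used for the plots here: it generates Gaussian noise with an assumed RMS level of 3000, sorts the samples, pairs each one with its normal quantile in sigma, and fits a straight line. The plotting position `(i + 0.5) / n` and the helper names are assumptions made for the illustration.

```python
import random
from statistics import NormalDist


def probability_plot_points(samples):
    """Return (sigma, amplitude) pairs: sorted samples vs. normal quantiles."""
    ordered = sorted(samples)
    n = len(ordered)
    std_normal = NormalDist()
    # (i + 0.5) / n keeps the quantiles finite at both ends.
    return [(std_normal.inv_cdf((i + 0.5) / n), amp)
            for i, amp in enumerate(ordered)]


def fit_line(points):
    """Least-squares fit of amplitude = slope * sigma + offset."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


rng = random.Random(1)   # fixed seed so the run is repeatable
rms = 3000.0             # assumed RMS level of the test noise
noise = [rng.gauss(0.0, rms) for _ in range(20000)]
slope, offset = fit_line(probability_plot_points(noise))
print(f"slope per sigma: {slope:.0f}, offset: {offset:.0f}")
```

The fitted slope should come out close to the RMS level, which is what "1 sigma is the RMS level" means in this diagram.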
Now let's look at a demonstration CD by Philips/Du Pont Optical from 1985:
The difference is that the dynamics of this recording (noise has no dynamics)
shape the curves in the middle. They are distorted in the middle, a little
bit like an S. The outer ranges are still linear, with the
small stairs at the ends.
Now some music with less dynamic range: Deep Purple, an old recording from the
1980s:
The S shape is much less pronounced. Now we come to the mathematical
part. Extend the straight outer parts of the lines until they reach full
scale and read off the sigma values at the intersections. For the pink line
they are -5.5 and +5.3, for the brown line -5.5 and +5.2.
The table at the end lets you estimate the number of clipped
samples: these values correspond to 35...60 clippings per hour. The slight
saturation has nothing to do with soft clipping. This is an old recording,
and you are probably seeing the onset of saturation in the analog audio tape.
If you think 35...60 clippings per hour is bad, see the next diagrams . . .
Soft clipping/limiting looks different. See this recording of a band known
for their ear-damaging concerts:
"Another Ring of Fire" and "One way out" are soft-clipped.
Let's estimate the clippings of "Little Beggarman". The intersections are
-3.8 and 3.6, which gives about 20 clippings per second. The number of
falsified samples in the soft-clipped songs is in the same range.
20 clippings per second is really terrible. Can it get even worse?
We see hard clipping at +/-2.6. That is 800 . . . 900 clippings per
second, or about 2%. And we see another nasty effect:
non-full-scale clipping at +/-2.5, at levels of about +/-27500. Even though
it is not at full scale, this amounts to around 1100 clippings per second, or 2.5%.
Can it get any worse? I only found historic recordings of Walter Ulbricht:
These heavy distortions (and also the distorted frequency response) give
historic recordings their typical sound. Maybe in 2020 we will have the same
quality: distorted in the same way, but with only bass and treble and no mids . . .
sigma | probability of clipping [ppm] | every n-th sample is clipped | clippings |
0.0 | 1000000 | 1.00 | 88200 / sec |
0.1 | 920344 | 1.09 | 81174 / sec |
0.2 | 841480 | 1.19 | 74219 / sec |
0.3 | 764177 | 1.31 | 67400 / sec |
0.4 | 689156 | 1.45 | 60784 / sec |
0.5 | 617075 | 1.62 | 54426 / sec |
0.6 | 548506 | 1.82 | 48378 / sec |
0.7 | 483927 | 2.07 | 42682 / sec |
0.8 | 423710 | 2.36 | 37371 / sec |
0.9 | 368120 | 2.72 | 32468 / sec |
1.0 | 317310 | 3.15 | 27987 / sec |
1.1 | 271332 | 3.69 | 23931 / sec |
1.2 | 230139 | 4.35 | 20298 / sec |
1.3 | 193600 | 5.17 | 17076 / sec |
1.4 | 161513 | 6.19 | 14245 / sec |
1.5 | 133614 | 7.48 | 11785 / sec |
1.6 | 109598 | 9.12 | 9667 / sec |
1.7 | 89130 | 11.22 | 7861 / sec |
1.8 | 71860 | 13.92 | 6338 / sec |
1.9 | 57433 | 17.41 | 5066 / sec |
2.0 | 45500 | 21.98 | 4013 / sec |
2.1 | 35728 | 27.99 | 3151 / sec |
2.2 | 27806 | 35.96 | 2453 / sec |
2.3 | 21448 | 46.62 | 1892 / sec |
2.4 | 16395 | 60.99 | 1446 / sec |
2.5 | 12419 | 80.52 | 1095 / sec |
2.6 | 9322 | 107.27 | 822 / sec |
2.7 | 6933 | 144.22 | 612 / sec |
2.8 | 5110 | 195.68 | 451 / sec |
2.9 | 3731 | 267.98 | 329 / sec |
3.0 | 2699 | 370.40 | 238 / sec |
3.1 | 1935 | 516.74 | 171 / sec |
3.2 | 1374 | 727.66 | 121 / sec |
3.3 | 966 | 1034 | 85.276 / sec |
3.4 | 673 | 1484 | 59.434 / sec |
3.5 | 465 | 2149 | 41.036 / sec |
3.6 | 318 | 3143 | 28.067 / sec |
3.7 | 215 | 4638 | 19.016 / sec |
3.8 | 144 | 6911 | 12.762 / sec |
3.9 | 96.193 | 10396 | 8.484 / sec |
4.0 | 63.342 | 15787 | 5.587 / sec |
4.1 | 41.315 | 24204 | 3.644 / sec |
4.2 | 26.691 | 37465 | 2.354 / sec |
4.3 | 17.080 | 58549 | 1.506 / sec |
4.4 | 10.825 | 92378 | 57.286 / min |
4.5 | 6.795 | 147160 | 35.961 / min |
4.6 | 4.225 | 236691 | 22.358 / min |
4.7 | 2.602 | 384377 | 13.768 / min |
4.8 | 1.587 | 630256 | 8.397 / min |
4.9 | 0.958 | 1043442 | 5.072 / min |
5.0 | 0.573 | 1744278 | 3.034 / min |
5.1 | 0.340 | 2944177 | 1.797 / min |
5.2 | 0.199 | 5017850 | 1.055 / min |
5.3 | 0.116 | 8635379 | 36.770 / hour |
5.4 | 0.067 | 15.0e6 | 21.160 / hour |
5.5 | 0.038 | 26.3e6 | 12.059 / hour |
5.6 | 0.021 | 46.7e6 | 6.806 / hour |
5.7 | 0.012 | 83.5e6 | 3.804 / hour |
5.8 | 0.007 | 150.8e6 | 2.106 / hour |
5.9 | 0.004 | 275.1e6 | 1.154 / hour |
6.0 | 0.002 | 506.8e6 | 0.627 / hour |
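The rows of this table can be recomputed from the two-sided tail probability of the normal distribution. The sketch below assumes 88200 samples per second (44100 Hz times two channels), which is inferred from the 0.0 sigma row rather than stated in the text; the function names are illustrative only.

```python
import math

# Assumption: 44100 Hz x 2 channels, as implied by the 0.0 sigma row.
SAMPLES_PER_SECOND = 88200


def clipping_probability(sigma):
    """Two-sided probability that a Gaussian sample lies beyond +/- sigma."""
    return math.erfc(sigma / math.sqrt(2.0))


def clippings_per_second(sigma, samples_per_second=SAMPLES_PER_SECOND):
    """Expected number of clipped samples per second at this intersection."""
    return clipping_probability(sigma) * samples_per_second


for sigma in (2.6, 3.8, 5.3):
    p = clipping_probability(sigma)
    per_sec = clippings_per_second(sigma)
    print(f"{sigma:.1f} | {p * 1e6:.3f} ppm | every {1 / p:.0f}th sample | "
          f"{per_sec:.3f} / sec = {per_sec * 3600:.1f} / hour")
```

Using `math.erfc` instead of `1 - erf` keeps the result accurate for the very small probabilities at 5 sigma and beyond.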