
Limits of Lossless Compression...?

Hi,
As lossless audio coding gets more and more popular, there is a question: what is the limit of lossless compression?
Will we get compression down to 20-25% of the original size at some point? To compare with lossy coding: there, perceptual models are the main engines behind high compression, but here we don't have perceptual models, only mathematical models to improve the compression.
I think the question of compression is important, because this is the main reason lossless audio coders are less popular.
Again, I am not saying that we can reach compression equal to lossy coding, but what is the limit?
We are just close to 50% right now.
We can also discuss possible directions to move in; I mean, discover the loopholes and proceed accordingly. In particular, one could compare the compressive power of entropy coding against new, innovative schemes for better prediction that yield a smaller residual signal and thus better compression. You could also throw light on vector quantization, codebook-based prediction, etc.

The main point here is to discuss the limits of lossless compression. Will we get it down to 20-25% of the original size?

thanks

Reply #1
It seems unlikely, but I wouldn't say it's entirely impossible. I would, however, imagine it being very processor intensive and requiring new approaches to the maths involved in compression. One problem with such a huge amount of compression is that with each bit a file grows in size, the number of possible data combinations it can represent doubles. Considering that, a smaller file has far fewer possible waveforms it could represent.
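
To make that counting argument concrete, here is a tiny sketch (plain Python, with arbitrary illustrative sizes; not from any codec): there are simply more possible long files than short ones, so no lossless scheme can shrink every input.

Code:
# Pigeonhole argument: there are 2**n distinct bitstrings of length n,
# but far fewer distinct outputs of length <= m when m < n, so no
# lossless scheme can shrink every n-bit input.
n, m = 16, 8
inputs = 2 ** n
outputs = sum(2 ** k for k in range(m + 1))  # every bitstring of length 0..m
print(inputs, outputs)  # 65536 vs 511: most 16-bit inputs cannot fit in 8 bits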
Acid8000 aka. PhilDEE

Reply #2
Quote
It seems unlikely, but I wouldn't say it's entirely impossible. I would, however, imagine it being very processor intensive and requiring new approaches to the maths involved in compression.

Thanks Acid8000 for starting it off and for your views.
What I am concerned with here is the compression. Today, increased processing capabilities and advances in memory technology take care of the other things, so even at the cost of complexity we can get better compression. Or at least we would like to discuss: if we don't have the constraints of complexity and memory, how much can we compress, now or in the future? What is the limit?
And also, how can we achieve it? What could be the possible paths to follow?
thanks

Reply #3
Quote
What I am concerned with here is the compression. Today, increased processing capabilities and advances in memory technology take care of the other things, so even at the cost of complexity we can get better compression. Or at least we would like to discuss: if we don't have the constraints of complexity and memory, how much can we compress, now or in the future? What is the limit?
And also, how can we achieve it? What could be the possible paths to follow?


If the compression is to be truly lossless, then from information theory there will be some theoretical limit beyond which it is impossible to compress. This limit is dependent upon the amount of information in a particular audio file.

The interesting question here is how far current lossless schemes are from this theoretical limit.

Reply #4
A signal's entropy should be quite easy to compute, IIRC. If so, it would be interesting to see the ratio between actual compression and theoretical maximum compression in the output when you encode something. (Small hint to the WavPack and FLAC developers.)
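
As a rough illustration of what such a report could look like, here is a minimal sketch, assuming Python with NumPy and a simple order-0 (sample-histogram) entropy model. It ignores the inter-sample correlation that real codecs exploit, so it is only a crude estimate, not the true limit.

Code:
import numpy as np

def order0_entropy_bits(samples):
    # Histogram of sample values -> empirical symbol probabilities.
    _, counts = np.unique(samples, return_counts=True)
    p = counts / counts.sum()
    return float(-np.sum(p * np.log2(p)))  # bits per sample

# Stand-in for one second of 16-bit audio; replace with real samples.
samples = np.random.randint(-2048, 2048, 44100).astype(np.int16)
h = order0_entropy_bits(samples)
print(f"{h:.2f} bits/sample vs. 16 raw -> ratio about {h / 16:.0%}")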

Reply #5
Quote
The interesting question here is how far current lossless schemes are from this theoretical limit.

Yes, this is the most interesting question, and we need to find ways to move towards that limit.
If we know where we are, it gives us a hint on how to proceed.

Reply #6
Well, you'll never be able to compress a normal WAV file down to, say, one bit losslessly. A single bit can only represent two distinct pieces of information, whereas the WAV could be any of several million variants. So the limit is how many distinct pieces of information a given number of bits can hold.

Edit: Note, the above is an obscenely extreme example.
Acid8000 aka. PhilDEE

Reply #7
Quote
A signal's entropy should be quite easy to compute, IIRC.

How? So far I've only seen equations that give a measure of the entropy.

But indeed, the entropy in a signal determines its maximum compression ratio.

Reply #8
Quote
A signal's entropy should be quite easy to compute, IIRC. If so, it would be interesting to see the ratio between actual compression and theoretical maximum compression in the output when you encode something. (Small hint to the WavPack and FLAC developers.)

Although it's easy to measure entropy, there are many different kinds of entropy to measure. For example, I could give you a string of 10,000 digits which any calculation would show to be extremely random. However, there could easily be a hidden generator that the entropy calculators wouldn't catch. What if my string were the 10,000 digits of pi starting from digit 50,000? That information could be losslessly encoded extremely well, provided the compression scheme knows about pi.
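
A quick sketch of that point, assuming Python with the mpmath library (an assumption; any arbitrary-precision source of pi digits would do): the digits look essentially random to an order-0 entropy estimate, about 3.32 bits per digit, yet the few lines below regenerate every one of them.

Code:
from collections import Counter
from math import log2
from mpmath import mp

mp.dps = 60050                       # more decimal precision than we need
digits = mp.nstr(mp.pi, 60020)[2:]   # decimal digits of pi with "3." stripped
chunk = digits[50000:60000]          # 10,000 digits starting at digit 50,000

p = [c / len(chunk) for c in Counter(chunk).values()]
print(-sum(x * log2(x) for x in p))  # ~3.32 bits/digit: "looks random"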

The result of this is that I can create a codec which makes any chosen song as small as I want, but in order to make one song one byte smaller, at least 256 other songs have to get larger. I could make a codec which calculates the difference between the input song and "Stairway To Heaven" by Led Zeppelin. If the input song is "Stairway To Heaven", it can be encoded as one bit! All the other songs would suffer, but if you find yourself making lots of encodes of that one song, it would be worth it! (It would be more worth it to use a playlist and get your brain checked, but that's beside the point...)

This relates to what Acid8000 just posted: assuming your filesystem can handle it, you can have exactly one WAV file compress down to one bit losslessly, but it wouldn't be too smart. The problem with this approach is that your codec would have to contain "Stairway To Heaven" so it knows what to play. However, if each file contained a list of indexes into a multi-gigabyte "most common" database, then it would be possible to make a handful of songs extremely small. (Hey! That sounds a lot like MIDI!)
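
A toy sketch of that idea (plain Python; the reference content is hypothetical, and it saves a byte rather than a bit for simplicity): the one built-in track compresses to almost nothing, and every other input necessarily grows.

Code:
# A "codec" whose only trick is one track baked into the decoder.
REFERENCE = b"...imagine the raw PCM of Stairway To Heaven here..."

def encode(data: bytes) -> bytes:
    if data == REFERENCE:
        return b"\x00"        # the one lucky input: a single byte
    return b"\x01" + data     # every other input grows by one byte

def decode(blob: bytes) -> bytes:
    return REFERENCE if blob[:1] == b"\x00" else blob[1:]

assert decode(encode(REFERENCE)) == REFERENCE
assert decode(encode(b"any other song")) == b"any other song"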

What I'm trying to say (it's past my bedtime) is that entropy measurements won't be of any use unless they work similarly to the target codec. But that implies you already know how the new codec will work, so it would be easier to simply use the new codec itself as the entropy measurement!

(Does any of that make sense? It's very late... I'll deal with readability in the morning  )
"We demand rigidly defined areas of doubt and uncertainty!" - Vroomfondel, H2G2

Reply #9
Quote
How? So far I've only seen equations that give a measure of the entropy.

But indeed, the entropy in a signal determines its maximum compression ratio.


Eh? Did I mix up quantities with measures or something?

It wasn't really relevant anyway, since the goal of the predictor in the various compression algorithms is to lower the entropy of the "residual" as much as possible, so that it can be compressed effectively by the entropy coder.
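
A minimal illustration of that split, assuming Python with NumPy and a synthetic sine wave: even the trivial predictor "guess the previous sample" (real codecs use far better ones) leaves a residual with noticeably lower order-0 entropy than the raw signal.

Code:
import numpy as np

def entropy_bits(x):
    _, counts = np.unique(x, return_counts=True)
    p = counts / counts.sum()
    return float(-np.sum(p * np.log2(p)))

t = np.arange(44100)
signal = (8000 * np.sin(2 * np.pi * 440 * t / 44100)).astype(np.int32)
residual = np.diff(signal)  # trivial predictor: predict x[n] = x[n-1]

print(entropy_bits(signal), entropy_bits(residual))  # residual codes cheaper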

Still, it would be interesting to hear what David, Josh, Ghido, etc. have to say about this topic...

Reply #10
Quote
(Does any of that make sense? It's very late... I'll deal with readability in the morning  )



Yes, very clear. And when I came back just now, I realized it too. I blame my first post here on temporary insanity.

Reply #11
The problem with lossless audio coding is that you are trying to encode a noisy source with a noiseless compression scheme. Current predictive coding solutions are suboptimal in this case. A solution would be a high-order PPM coder, but the sample alphabet is too large and memory consumption would be enormous. IMO an improvement of about 10% is possible.
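
To put rough numbers on the memory problem (a back-of-the-envelope sketch, not a statement about any particular implementation): with 16-bit samples the alphabet has 65,536 symbols, so the number of possible order-k contexts grows as 65536^k.

Code:
# Naive context-count estimate for order-k PPM over 16-bit samples.
ALPHABET = 2 ** 16
for k in (1, 2, 3):
    print(k, ALPHABET ** k)  # order 2 alone is ~4.3e9 possible contexts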

Reply #12
Quote
A solution would be a high-order PPM coder, but the sample alphabet is too large and memory consumption would be enormous.

Hmm, it is interesting what you say.
I took a quick look at the Lossless audio compression test to check this out somewhat:
WinRAR with PPM and 7z PPMd (which I believe is a similar idea) achieved mediocre results. A great compressor (but suited especially to text), durilca, is clearly worse than FLAC and WavPack there, even with 900 MB of memory.

Yet SBC (an all-round archiver) achieves great results with PCM (does anybody know what algorithm it uses for this?).

Is it just that all-round compressors are not tuned for PCM compression in order to maintain versatility, or am I missing something in the test? And what exactly does high-order PPM mean?

Reply #13
Quote
WinRAR with PPM and 7z PPMd (which I believe is a similar idea) achieved mediocre results. A great compressor (but suited especially to text), durilca, is clearly worse than FLAC and WavPack there, even with 900 MB of memory.


These programs are mostly optimized for byte-aligned data, e.g. text and executables, but for sound you have to collect contexts over an even wider range. That's why I wrote "enormous".

Quote
Yet, SBC (all-round archiver) achieves great results with PCM (anybody knows what algorythm it uses for this?)


To my knowledge SBC uses the Burrows-Wheeler transform, with some predictors to pre-filter multimedia data.
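
For reference, a minimal sketch of the Burrows-Wheeler transform itself (plain Python, using the naive rotation sort; real implementations use suffix arrays): it permutes the input so that symbols occurring in similar contexts cluster together, which is what makes the follow-up modelling effective.

Code:
def bwt(data: bytes) -> bytes:
    # Sort every rotation of the input and keep the last column.
    rotations = sorted(data[i:] + data[:i] for i in range(len(data)))
    return bytes(r[-1] for r in rotations)

print(bwt(b"banana"))  # b'nnbaaa': equal symbols end up adjacent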

An example of high-order PPM contexts suited for sound are the latest PAQ sources... but you can extend the initial ideas.

Reply #14
Bonjour,

Here is a question from an end user, based on some experience of composing with sampling, and stimulated by this topic.
Compression could be based on re-using sounds that are already in the data of one sound track, marked with a code name and recalled following the rhythm of the loops. Why not use the same principle for a whole music database?
A certain quantity of music uses sampling and exactly the same sounds. Also consider that sound fashions change quickly but are used en masse at a certain time. A hierarchy for classifying these sounds, like the one we have for our programs, should shorten access.
Would it be possible to use this approach to reduce a music database?
Thanks

Reply #15
Quote
Compression could be based on re-using sounds that are already in the data of one sound track, marked with a code name and recalled following the rhythm of the loops. Why not use the same principle for a whole music database?

This idea of audio "compression" is quite old; just have a look at MOD and MIDI files, for instance.

I have no idea if anybody has ever tried to write a program that converts normal audio tracks (e.g. from CDs) into MOD or MIDI files in order to reduce the amount of data. This conversion seems to be a very complex task, and I doubt that it will ever be useful for lossless audio compression...

Reply #16
Quote
I have no idea if anybody has ever tried to write a program that converts normal audio tracks (e.g. from CDs) into MOD or MIDI files in order to reduce the amount of data. This conversion seems to be a very complex task, and I doubt that it will ever be useful for lossless audio compression...


Yes, this has been considered several times, especially when you want a sample from a song (say, a voice) but don't have the master.
The problem (separating out the independent sounds of an audio signal) is still unsolved AFAIK.

Reply #17
Unless the complexity of music increases faster than available storage space, your "optimum compression ratio" will never be reached... not even close. Why? Because it's inefficient.

In an optimal world, we would first optimize the use of existing resources to death before using up more resources, but unfortunately, that's not how the world goes. The flaw in your logic is that you only take the increase of CPU and RAM resources into account, but ignore the increase in storage space. There is not just one resource which determines efficiency, but two: processing speed and storage space. But wait, there's even a third resource: development effort.

And because of this, the "optimum" will not be the algorithm which compresses as much as possible... instead, it will be the one which compresses well enough for current and near-future storage space while requiring the least processing time and development effort.

Lossless not being more popular is indeed because of its higher use of storage space... but long before the root of the problem (compression ratio) is solved, storage space will have multiplied, rendering the issue void.
I am arrogant and I can afford it because I deliver.

Reply #18
Quote
But indeed, the entropy in a signal determines its maximum compression ratio.


Right, Claude Shannon elegantly pointed that out for us in some of his formalisms for information theory, minus all of the other "exotic mathematical transforms" we have today.

Information Theory: Section III

Quote
Or at least we would like to discuss: if we don't have the constraints of complexity and memory, how much can we compress, now or in the future? What is the limit?


The general nth-order model is described in his paper.
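
For reference, a sketch of that definition in Shannon's notation (from "A Mathematical Theory of Communication"): the nth-order approximation conditions each symbol on the block of n-1 symbols preceding it,

F_n = -\sum_{i,j} p(B_i, j) \log_2 p(j \mid B_i)

where B_i ranges over blocks of n-1 symbols and j over the next symbol; F_n decreases toward the source's true entropy rate as n grows.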
budding I.T professional

Reply #19
You can add bandwidth as another factor influencing the use (and sale) of lossless music.

Reply #20
Quote
And because of this, the "optimum" will not be the algorithm which compresses as much as possible... instead, it will be the one which compresses well enough for current and near-future storage space while requiring the least processing time and development effort.


Hmm, interesting. I never thought about the practical aspects.
Acid8000 aka. PhilDEE

 

Reply #21
Well, firstly, SBC AFAIK compresses WAVs using an adaptive method similar to other lossless encoders, rather than using Burrows-Wheeler or PPM or whatever it uses for other files.

I'm very dubious that PPM could contribute a 10% gain in compression, as one poster commented. Firstly, some adaptive predictors can cope pretty well with noise. Secondly, I have played with Burrows-Wheeler and similar methods, and, as many others have found, they're not really suited to audio compression. Still, I'd be interested to hear more elaboration on the comment that 'an example of high-order PPM contexts suited for sound are the latest PAQ sources...'

As others have pointed out, there's no way to easily determine what the maximum compression ratio would be, for various obvious reasons.

Regarding storage space: when my wristwatch holds a few terabytes of information, then it may cease to be an issue for lossless audio. Until then, any extra compression won't hurt.

Reply #22
Quote
Still, I'd be interested to hear more elaboration on the comment that 'an example of high-order PPM contexts suited for sound are the latest PAQ sources...'


PAQ ?

Reply #23
Well, the PAQ link you posted leads to a site with absolutely no information about how it works, except an instruction to look at the source code comments. Not really what I was asking for, sorry.

The site also links to some comparisons showing PAQ performing nowhere near adaptive lossless audio compressors. So you're going to have to do a lot better than that to convince me there's any potential for such techniques to improve lossless audio compression by around 10%, as you claimed.

Reply #24
Quote
So you're going to have to do a lot better than that to convince me there's any potential for such techniques to improve lossless audio compression by around 10%, as you claimed.


First, I didn't claim that PAQ is any better than the current state of the art in lossless audio. I claimed that the techniques it uses, e.g. context mixing, different probability models and secondary symbol estimation, could be used for audio as well. PAQ is still optimized for byte data. I also wrote that there is a massive computational overhead if you use these techniques on audio.

Second, I don't understand why someone would try a BWT on audio; that is senseless. Why are the state-of-the-art image and text compressors Markov-model based? You can still compress a picture well enough with an adaptive predictor and some error-feedback coder, but you discard a lot of information. I think you don't have a deep enough understanding to challenge my personal opinion.
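
For anyone curious what "context mixing" looks like in practice, here is a minimal sketch (plain Python, not the actual PAQ code): each model emits a probability that the next bit is 1, the predictions are combined in the logistic domain, and the mixing weights are nudged toward whichever models proved right.

Code:
import math

def stretch(p): return math.log(p / (1 - p))
def squash(x): return 1 / (1 + math.exp(-x))

def mix(probs, weights):
    # Combine per-model bit predictions in the logistic domain.
    return squash(sum(w * stretch(p) for w, p in zip(weights, probs)))

def update(probs, weights, bit, lr=0.02):
    # Nudge each weight toward the models that predicted the observed bit.
    err = bit - mix(probs, weights)
    return [w + lr * err * stretch(p) for w, p in zip(weights, probs)]

weights = [0.5, 0.5]
models = [0.9, 0.3]            # two hypothetical "P(next bit = 1)" outputs
p = mix(models, weights)       # blended probability for the arithmetic coder
weights = update(models, weights, bit=1)
print(p, weights)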