
Temporal redundancy removal in compressed audio

Hi!

If you were looking for a paper about improving Vorbis compression by using
arithmetic coding and exploiting temporal redundancies (i.e. repeating sound),
you might have a look here:

http://web.interware.hu/rudas

Three Ogg Vorbis files were compressed losslessly, with size reductions
between 2% and 8% and compression speed about 1/500 of realtime (i.e. slow).
The theoretical upper bound of the compression gain for real music is estimated
to be between 10% and 20%; further improvements to the methods are possible.
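As a toy sketch of the general idea of removing temporal redundancy (this is not the actual algorithm from the paper, which works on the transform coefficients with arithmetic coding), one can think of exact repeats of earlier blocks being replaced by back-references:

Code
import hashlib

def dedupe_blocks(blocks):
    """Replace repeated blocks with back-references to their first occurrence.

    Toy illustration only: exact repeats of earlier data are coded as
    index references instead of literals.
    """
    seen = {}          # block digest -> index of first occurrence
    out = []
    for i, block in enumerate(blocks):
        key = hashlib.sha1(block).digest()
        if key in seen:
            out.append(("ref", seen[key]))   # repeat: emit a back-reference
        else:
            seen[key] = i
            out.append(("lit", block))       # first sighting: emit literal
    return out

# A stream with a literally repeated section (e.g. a looped sample).
stream = [b"intro", b"riff", b"riff", b"verse", b"riff"]
print(dedupe_blocks(stream))
# [('lit', b'intro'), ('lit', b'riff'), ('ref', 1), ('lit', b'verse'), ('ref', 1)]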

bye
Denes

Temporal redundancy removal in compressed audio

Reply #1
I assume decompression is also slower than realtime?

Temporal redundancy removal in compressed audio

Reply #2
Quote
I assume decompression is also slower than realtime?

I would not think that decompression would be slow, because the computation time is spent on finding the temporal redundancies during encoding.

 

Temporal redundancy removal in compressed audio

Reply #3
What have we learned from that? (Some may have known this already.)
Extra-long-term temporal prediction isn't really useful or practical.

Kudos to the author for going through the hassle of implementing that stuff, though.

An area I think one could try to explore instead:
(short-term) temporal and inter-channel prediction of the floor curves and of the residue codebook selection side info.
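To make the temporal half of that concrete, here is a toy sketch with made-up floor values (the real Vorbis floor representation is different, and the entropy estimate is illustrative only); it just shows that frame-to-frame deltas of slowly varying curves are cheaper to entropy-code than the raw values:

Code
import numpy as np

# Toy stand-ins for per-frame floor curves (dB values per band).
# Real floor data lives inside the Vorbis packets; this is made up.
rng = np.random.default_rng(1)
n_frames, n_bands = 200, 32
base = 40 + 10 * np.sin(np.linspace(0, 3, n_bands))
floors = np.round(base + rng.normal(0, 1.5, (n_frames, n_bands))).astype(int)

def entropy_bits(values):
    """Empirical zeroth-order entropy of a symbol stream, in bits/symbol."""
    _, counts = np.unique(values, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())

# Temporal prediction: code each floor as a delta from the previous frame.
deltas = np.diff(floors, axis=0)

print(f"raw floors:   {entropy_bits(floors.ravel()):.2f} bits/value")
print(f"frame deltas: {entropy_bits(deltas.ravel()):.2f} bits/value")

For stationary audio the deltas cluster tightly around zero, so a short-term predictor plus an entropy coder should beat coding each frame's side info independently.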

Let me quote this part:
Quote
However, the most striking observation is that the MDCT transform, as used currently, is not well suited for finding and exploiting temporal redundancy. The most likely cause of this is the MDCT's lack of translational invariance.

Possibly transforms that are closer to being translation invariant are necessary; these could be:
- a close model of the ear (not necessarily critically sampled or with the property of perfect reconstruction)
- wavelet packets or similar transforms
- MDCT and MDST, perhaps tuning the angle with respect to the amplitude


While I agree with the first paragraph, I fail to see how a wavelet packet transform is any better than the MDCT when it comes to shift invariance. With MDCT+MDST one can approximate a time-shifted version in the frequency domain for a small shift, though. But the angle needs to be adjusted with respect to the frequency, not the amplitude. Anyhow, I wouldn't encourage anyone to try that.
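To illustrate that point, a small numpy sketch (the window shape, phase convention, and frame size are my own choices, nothing from the paper): combining MDCT and MDST into one complex coefficient per bin makes a delay of d samples look approximately like a phase rotation of pi*(k+0.5)*d/N in bin k, with the approximation degrading as d grows, because the analysis window moves along with the signal.

Code
import numpy as np

N = 512                          # MDCT bins per frame; frames are 2N samples
n = np.arange(2 * N)
k = np.arange(N)

# Sine window, the usual MDCT analysis window shape.
w = np.sin(np.pi * (n + 0.5) / (2 * N))

# One common phase convention for the MDCT/MDST kernels.
phase = np.pi / N * np.outer(k + 0.5, n + 0.5 + N / 2)

def mclt(frame):
    """Complex MDCT+MDST coefficients (an 'MCLT') of one windowed frame."""
    wx = w * frame
    return np.cos(phase) @ wx - 1j * (np.sin(phase) @ wx)

rng = np.random.default_rng(0)
x = rng.standard_normal(4 * N)   # long test signal
X0 = mclt(x[:2 * N])             # frame starting at sample 0

# A delay of d samples should rotate bin k by about pi*(k+0.5)*d/N;
# the relation is only approximate because the window moves too.
for d in (1, 4, 32):
    Xd = mclt(x[d:d + 2 * N])    # frame starting d samples later
    Xd_pred = X0 * np.exp(1j * np.pi * (k + 0.5) * d / N)
    err = np.linalg.norm(Xd - Xd_pred) / np.linalg.norm(Xd)
    print(f"shift d={d:3d}: relative prediction error {err:.3f}")

With the MDCT alone (the real part only) there is no such simple relation, since the shifted real coefficients also depend on the missing imaginary part; that matches the quoted observation about the MDCT's lack of translational invariance.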

Let's not forget the amount of memory a decoder would need to keep the previously seen packets around so it can reconstruct the audio data under such extra-long-term prediction.


Sebi

edit: grammar, typos