MP3 Bytes per frame?
Reply #5 – 2005-03-03 08:37:42
It is correct: the side info area of the frame contains the number of bits used for audio data (called "part2_3 length" in the reference decoder).
So the number of bits used by the frame is simply the sum of all the part2_3 lengths? This is the crux of my question...
I suggest that you have a look at the sources of a working decoder to find out all the details. This is usually more helpful than just reading some explanations, at least it is for me.
I've been brooding over LAME's bitstream.c to try to figure out the structure of a frame, but I'm kind of stuck with this problem.
If you allow the use of the bit reservoir, then the size of the audio data can be smaller than, equal to, or larger than the size of a single frame. If you force the encoder not to use the bit reservoir (using the --nores option), then the size of the encoded audio data is always less than or equal to the maximum space inside a single frame. How many bits are actually used depends on the data itself and on the encoder's quality settings.
Yup, which is why this will probably do nothing for VBR and low-bitrate CBR files. However, for high-bitrate CBR files, there may be stretches of data which are incapable of filling up the frame. If this goes on for long enough, part of the free space may be pushed past the 511-byte limit for the reservoir, rendering it inaccessible. (If that makes any sense... ) Plus, the bit reservoir is completely useless for CBR 320 files, as the maximum amount of data must not be more than can fit in a 320 kbps frame. To check which files would benefit the most from this, I figured out how much space was just padding. In an APS file, the savings would only be 0.08%. In a CBR 128 file, the savings would be 0.1%. However, for API you might be able to save 10% due to the wasted space.
Yes, that's right. The area beyond the audio data is called "ancillary data", not "junk". It's only there to fill up the fixed size of the frame.
I haven't ruled out the possibility of it being simple padding, but it doesn't really look like padding. Most LAME padding says stuff like "LAME 3.96.1UUUUUUU...", or is just a bunch of nulls. However, for one file, the extra data is:
31 05 35 14 CC B8 E4 DC 80 A1 85 B1 C1 A1 84 A4 00 00 00 00 00 00 00 00 ... (*)
which doesn't spell anything, and doesn't have any nice binary properties (like "U"). It looks like it actually means something, but I can't figure out what.
Btw. padding as mentioned by Jud varies the frame sizes only by one byte, in order to match the bitrate setting for the stream. IMHO that is not related to your question.
Yup. I figured out how to parse the frame header, and how to detect padding, CRC, and all the other tidbits about the frame. The problem now is the data within the frame...
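For reference, here is a rough Python sketch of the bookkeeping discussed above: it computes the total frame size from the header fields and sums the part2_3 lengths found in the side info. It assumes MPEG-1 Layer III only (MPEG-2/2.5 frames use a different side info layout), and the function names frame_length_bytes, read_bits and main_data_usage are made up for illustration; they are not taken from LAME or the reference decoder. Treat it as a starting point, not a verified parser.

```python
def frame_length_bytes(bitrate_kbps, sample_rate_hz, padding):
    """Total size in bytes of one MPEG-1 Layer III frame, header included."""
    return 144 * bitrate_kbps * 1000 // sample_rate_hz + padding


def read_bits(data, bit_pos, count):
    """Return `count` bits of `data` starting at bit offset `bit_pos`, MSB first."""
    value = 0
    for i in range(bit_pos, bit_pos + count):
        value = (value << 1) | ((data[i >> 3] >> (7 - (i & 7))) & 1)
    return value


def main_data_usage(frame):
    """Return (main_data_begin in bytes, sum of part2_3 lengths in bits).

    `frame` holds one MPEG-1 Layer III frame, starting at the 4-byte header.
    """
    has_crc = (frame[1] & 0x01) == 0      # protection bit 0: 16-bit CRC follows
    mono = (frame[3] >> 6) == 3           # channel mode 3: single channel
    channels = 1 if mono else 2
    side = frame[4 + (2 if has_crc else 0):]

    # Side info starts with the 9-bit back-pointer into the bit reservoir.
    main_data_begin = read_bits(side, 0, 9)

    # Skip private bits (5 mono, 3 stereo) and 4 scfsi bits per channel.
    pos = 9 + (5 if mono else 3) + 4 * channels

    # Two granules per frame; each granule/channel block is 59 bits long
    # and begins with the 12-bit part2_3_length field.
    total_bits = 0
    for _ in range(2 * channels):
        total_bits += read_bits(side, pos, 12)
        pos += 59
    return main_data_begin, total_bits
```

Note that the summed part2_3 lengths describe the main data belonging to this frame's granules, which may start up to main_data_begin bytes before the frame itself. So comparing the sum against the space left in the frame (frame length minus header, CRC and the 17- or 32-byte side info) only gives the unused space directly when the reservoir is not in play.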