I'm trying to convert floating point audio data from jack to integer audio data for FLAC.
I'm getting sound, but it's distorted somehow. i have a flac encoding example: out.flac (right-click -> save as) (http://www-personal.umich.edu/~burkemw/out.flac)
does it sound like there has been an effect applied to the sound, maybe two? what are they? and if anyone has suggestions on my code let me know.
i've been trying different algorithms (in C), but they have been a variation of the following:
float inSample;
FLAC__int32 outSample;
if (inSample > 1.0) { inSample = 1.0; }
else if (inSample < -1.0) { inSample = -1.0; }
if (inSample >= 0) {
outSample = (FLAC__int32) lrintf(inSample * 32767.0);
} else {
outSample = (FLAC__int32) lrintf(inSample * 32768.0);
}
I doubt that the difference will be audible, but you should not use two different scale factors for positive and negative values. There may, however, be a problem with the difference in how positive and negative values are truncated to integer. You might try adding 1.0 to the values, multiplying by 32767.0, converting to unsigned integer, then subtracting 32767.
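For anyone who wants to try that offset idea, here is a minimal sketch of how it might look. The function name float_to_int16_offset is made up for illustration, and the clamp is carried over from the original post. Note that with these numbers the output only spans -32767..32767.

```c
#include <stdint.h>

/* Hypothetical helper: shift [-1, 1] up to [0, 2], scale, truncate as an
   unsigned value, then shift back down. */
int32_t float_to_int16_offset(float in)
{
    if (in > 1.0f)  in = 1.0f;
    if (in < -1.0f) in = -1.0f;

    /* (in + 1.0f) is never negative, so truncation always rounds downward,
       regardless of the sign of the original sample. */
    uint32_t shifted = (uint32_t)((in + 1.0f) * 32767.0f);

    return (int32_t)shifted - 32767;
}
```

This only changes how values are rounded; it would not explain distortion as obvious as the one in the posted file.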
yeah, i've played with the numbers a lot, but it's just a scale factor, essentially volume.
your algorithm certainly changes the sound, but it's still bad.
In the FLAC file, the left channel is muted for 2000 samples out of each 4000 samples. The lines of code you posted certainly don't cause that.
It sounds like your assumption about the samples' interleaving is false.
You get a block of N samples x[0..N-1] where the first N/2 samples correspond to the left channel and the second half to the right channel. However, you interpret these as channel-interleaved samples (x[even index] for the left channel and x[odd index] for the right channel). This would explain the pitch shift upwards by one octave and the noticeable blocking artefacts.
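The layout difference described above can be sketched as follows. This assumes the block has already been converted to integers and holds all left samples followed by all right samples; the function name planar_to_interleaved and the use of int32_t (standing in for FLAC__int32) are illustrative choices, not taken from the original code.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: planar holds 2 * frames samples, laid out as
   L0 L1 ... L(frames-1) R0 R1 ... R(frames-1).
   out receives the same samples as L0 R0 L1 R1 ...
   out must have room for 2 * frames samples and must not alias planar. */
void planar_to_interleaved(const int32_t *planar, int32_t *out, size_t frames)
{
    const int32_t *left  = planar;
    const int32_t *right = planar + frames;

    for (size_t i = 0; i < frames; i++) {
        out[2 * i]     = left[i];   /* even index: left channel  */
        out[2 * i + 1] = right[i];  /* odd index:  right channel */
    }
}
```

Skipping this step and handing the planar block to code that expects interleaved data puts consecutive left samples into alternating channels, which is the kind of mix-up described above.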
regarding float->int conversion:
- Don't use two different scale factors. It's supposed to be 32768
- Read up on what dithering is for and how it's done
Cheers!
SG
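As a rough illustration of the dithering suggestion: one common approach is to add triangular (TPDF) noise of about one LSB amplitude before rounding. The sketch below is only that, a sketch. The names uniform01 and float_to_int16_tpdf are invented here, rand() is used purely to keep it short, and the 32768 scale follows the advice above.

```c
#include <stdint.h>
#include <stdlib.h>

/* Uniform random value in [0, 1). rand() is a placeholder generator. */
double uniform01(void)
{
    return (double)rand() / ((double)RAND_MAX + 1.0);
}

/* Hypothetical helper: scale to 16-bit range, add TPDF dither, round to
   nearest, and clamp. */
int16_t float_to_int16_tpdf(float f)
{
    /* The difference of two uniform values has a triangular distribution
       over (-1, 1), measured in LSBs. */
    double noise = uniform01() - uniform01();
    double v = (double)f * 32768.0 + noise;

    /* Clamp before rounding so the result always fits in 16 bits. */
    if (v > 32767.0)  v = 32767.0;
    if (v < -32768.0) v = -32768.0;

    /* Round to nearest; the cast itself truncates toward zero. */
    return (int16_t)(v < 0.0 ? v - 0.5 : v + 0.5);
}
```

Noise shaping would be a further step on top of this and is not shown.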
hey guys, thanks for your help.
my assumptions regarding the interleaving weren't wrong, but my implementation of the interleaving certainly was. I was in the process of posting my interleaving code and realized my mistake. wouldn't have been able to do it without you though.
for others happening across this i am using the following clip/scale block:
if (inSample > 1.0) { inSample = 1.0; }
else if (inSample < -1.0) { inSample = -1.0; }
outSample = (FLAC__int32) lrintf(inSample * 32767.0);
I recently had to decide how to do this exact conversion because I needed to write something up in the WavPack documentation, and so I came up with the methods I like best:
To convert from float to int:
if (float >= 1.0)
    int = 32767;
else if (float <= -1.0)
    int = -32768;
else
    int = floor (float * 32768.0);
The reasons I like this are (compared to some other methods):
- the floating range of ± 1.0 is evenly distributed to the full range of output values
- the output sign is always equal to the input sign
- zero in gives zero out
- the scaling is an easy multiply (as opposed to 32767.0)
- no discontinuity at zero
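Since float and int are keywords in C, the pseudocode above needs real variable names to compile. A minimal compilable sketch might look like this; the function name float_to_int16 is an assumption, and floor() is written out by hand only to avoid depending on the math library.

```c
#include <stdint.h>

/* Hypothetical helper: clamp, scale by 32768, round toward negative
   infinity. NaN input is not handled. */
int16_t float_to_int16(float f)
{
    if (f >= 1.0f)  return 32767;
    if (f <= -1.0f) return -32768;

    float scaled = f * 32768.0f;

    /* The cast truncates toward zero; step down by one when that rounded a
       negative value up, which gives the same result as floor(). */
    int32_t i = (int32_t)scaled;
    if ((float)i > scaled)
        i--;

    return (int16_t)i;
}
```

For example, 0.5 maps to 16384 and -0.5 maps to -16384, while anything at or beyond full scale is clipped to 32767 or -32768.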
To convert the other way, simply use:
float = int / 32768.0;
Again:
- the output sign is always equal to the input sign
- zero in gives zero out
- the scaling is an easy divide (as opposed to 32767.0)
- no discontinuity at zero
This never gives +1.0, but I don't think that's important because float data is not supposed to be clipped anyway (that's one of its advantages). Only when audio is in the integer domain does it need to be clipped.
The other thing I like about these two methods is that if you convert integer to float back to integer you always get the same value back (assuming there's enough resolution, of course).
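The round-trip claim is easy to check by brute force over all 16-bit values. The sketch below restates both conversions with made-up names (sample_to_float, sample_to_int, count_roundtrip_mismatches) so that it stands alone.

```c
#include <stdint.h>

/* Integer to float: divide by 32768, giving values in [-1.0, 1.0). */
float sample_to_float(int16_t i)
{
    return (float)i / 32768.0f;
}

/* Float to integer: clamp, multiply by 32768, round toward negative
   infinity (a hand-written floor). */
int16_t sample_to_int(float f)
{
    if (f >= 1.0f)  return 32767;
    if (f <= -1.0f) return -32768;

    float scaled = f * 32768.0f;
    int32_t i = (int32_t)scaled;
    if ((float)i > scaled)
        i--;
    return (int16_t)i;
}

/* Counts the 16-bit values that do not survive int -> float -> int. */
int count_roundtrip_mismatches(void)
{
    int mismatches = 0;
    for (int32_t i = -32768; i <= 32767; i++) {
        if (sample_to_int(sample_to_float((int16_t)i)) != i)
            mismatches++;
    }
    return mismatches;
}
```

With 32-bit floats every 16-bit integer divided by 32768 is exactly representable, so the count should come out as zero.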
Also, SebastianG is correct that you really should dither (and maybe noise shape) if you are concerned about quality.
Interesting! ...May I butt-in and ask a question?
Where does the scale factor come from? Is that from the WAV spec?
I've heard of IEEE 32-bit WAV files, and I've heard that many audio editors use 32-bit floating point values/operations. But, I have not heard about scaling.... I had sort-of assumed that 32767 would simply be converted to 32767.0 (allowing for more resolution & headroom).
Having the floating-point values directly converted from integers would work fine, and that's how CoolEdit works internally (and generates these weird incompatible formats). However, Microsoft (or someone before them) decided to use the range of +/- 1.0 for normalized 32-bit IEEE float audio in WAV files. In a way this makes more sense than some arbitrary range like +/- 32768.0 (because the range with 24-bit audio would be different) and 32-bit floats offer so much dynamic range that it really doesn't matter in a practical sense where you put the normalized values.
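To make the point about bit depth concrete: with a normalized range, the scale factor depends on the integer width, but the float range stays the same. A small sketch, assuming signed samples of 2 to 24 bits (so every value is exact in a 32-bit float); the function name int_to_normalized_float is hypothetical.

```c
#include <stdint.h>

/* Hypothetical helper: map a signed integer sample of the given bit depth
   to the normalized range [-1.0, 1.0). Assumes 2 <= bits <= 24. */
float int_to_normalized_float(int32_t sample, unsigned bits)
{
    /* 2^(bits-1): 32768 for 16-bit audio, 8388608 for 24-bit audio. */
    float scale = (float)((uint32_t)1 << (bits - 1));

    return (float)sample / scale;
}
```

Full-scale negative input gives -1.0 for both 16-bit and 24-bit samples, whereas a direct integer-to-float conversion would give -32768.0 in one case and -8388608.0 in the other.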
Thanks, bryant. Now, if I can just get this Hello World program working...