Topic: FLAC v1.4.x Performance Tests

Re: FLAC v1.4.x Performance Tests

Reply #525
Suggestion, if you don't mind running another round for the sake of the sport: Measure speed and size on the same corpus, for each of these runs on the "inexact" build, and post it here? 
flac -8per7 -A "subdivide_tukey(4);flattop"
flac -8er7 --lax -l15 -A "subdivide_tukey(4);flattop"
flac -8pr7 --lax -l15 -A "subdivide_tukey(4);flattop"
flac -8pr7 -A "subdivide_tukey(5);flattop"
I took "track 2" from each of my 38 and ran those 2x4 multiple times overnight, and -8pr7.
Sizes total to around 1 GB compressed. 
Table in size order first:
| Setting | Size (bytes) | Seconds | kiB saved over previous |
|---|---|---|---|
| WAVE | 1950649336 | | |
| Inexact, no "-p"/"-e" | 1032516267 | 68 | |
| Exact, no "-p"/"-e" | 1032260218 | 141 | 250 |
| Inexact -p | 1031759702 | 213 | 489 |
| Inexact -p "A5+" | 1031432685 | 664 | 319 |
| Exact -p | 1031406917 | 600 | 25 |
| Inexact -pe "A4+" | 1031256736 | 3809 | 147 |
| Exact -p "A5+" | 1031002120 | 2023 | 249 |
| Exact -pe "A4+" | 1030803335 | 14817 | 194 |
| Inexact -el15 "A4+" | 1030739842 | 961 | 62 |
| Exact -el15 "A4+" | 1030344916 | 2391 | 386 |
| Inexact -pl15 "A4+" | 1030151583 | 1431 | 189 |
| Exact -pl15 "A4+" | 1029752322 | 2235 | 390 |
Everything is run with -8r7 -j4 --no-padding in addition to the above, where "A4+" means -A "subdivide_tukey(4);flattop" and "A5+" bumps the 4 up to 5.
Times indicated are "Process time" (so clock times are about 1/4 of this), which is a fishy matter on this throttling CPU. I tried to run it hot first, and the figures are medians of four runs (because then it was morning and I got up).

Numbers indicate, starting out from the top:
* exact-rice appears to be "-p-alike" in cost/gains.
* exact-rice atop -p is much better than -pe, but we already knew that -pe is hopeless value for money, and I should likely have used fewer windowing functions on that one to keep numbers sane.
Anyway, you could argue then that an exact-rice option is better.
* For -l15, I am surprised by the -p vs -e timings, but it might be due to the order of runs and CPU throttling. Anyway, it is not that far from consistent with what is going on with -l12 as found in standard -8.

Breaking it down by genre, classical music benefits much more. Like, twice as much as the rest. Two files consistently benefit above 0.1 percent: Bruckner (that's vocal music) and Cage (percussion). Jordan Rudess (solo electric keyboard) also benefits that much, except for one run (-el15).

Re: FLAC v1.4.x Performance Tests

Reply #526
Or viewed with different glasses. Here I did not do any timing nor anything slower than -8pe.
But split into genre divisions:
total - classical - heavier - "other"

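Left panel: total, classical, heavier, other. Right panel: pct-points with baseline inexact -8.

| setting | total | classical | heavier | other | total (baseline) | classical (baseline) | heavier (baseline) | other (baseline) |
|---|---|---|---|---|---|---|---|---|
| inex -7 | 53.0 % | 40.7 % | 70.7 % | 58.3 % | -.032 pts | -.013 pts | -.029 pts | -.065 pts |
| ex -7 | .008 pts | .012 pts | .005 pts | .007 pts | -.024 pts | -.001 pts | -.024 pts | -.058 pts |
| inex -8 | .024 pts | .001 pts | .024 pts | .058 pts | 0 | 0 | 0 | 0 |
| inex -8e | .013 pts | .018 pts | .011 pts | .011 pts | .013 pts | .018 pts | .011 pts | .011 pts |
| ex -8 | .001 pts | -.010 pts | -.006 pts | .024 pts | .015 pts | .008 pts | .005 pts | .035 pts |
| ex -8e | .017 pts | .022 pts | .014 pts | .015 pts | .031 pts | .030 pts | .019 pts | .049 pts |
| inex -8p | .008 pts | .003 pts | .024 pts | -.001 pts | .039 pts | .033 pts | .043 pts | .048 pts |
| inex -8pe | .016 pts | .009 pts | .006 pts | .038 pts | .055 pts | .042 pts | .048 pts | .086 pts |
| ex -8p | .002 pts | .013 pts | .011 pts | -.021 pts | .057 pts | .055 pts | .059 pts | .065 pts |
| ex -8pe | .019 pts | .013 pts | .008 pts | .041 pts | .076 pts | .067 pts | .068 pts | .106 pts |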
* Sorted by "total" size.
* Left panel: the first row gives the compression ratio (lower is better); every following row gives how much, in percentage points, the setting gains on the one immediately above it. You see that exact -8 beats inexact -8e by 0.001 percentage points, but that is only due to the "other" section (and I guess, Jordan Rudess ...). And exact -8p narrowly beats inexact -8pe despite something going on in the "other" section.
* The right panel is just cumulative percentage points relative to inexact "-8", which is a natural "slowest reasonable" benchmark.

Seems that going "exact" is, ballpark:
* half as efficient as going "-7" to "-8".
* half as efficient as -p, when the baseline is -8 and not heavier.

Re: FLAC v1.4.x Performance Tests

Reply #527

flac -8per7 -A "subdivide_tukey(4);flattop"
flac -8er7 --lax -l15 -A "subdivide_tukey(4);flattop"
flac -8pr7 --lax -l15 -A "subdivide_tukey(4);flattop"
flac -8pr7 -A "subdivide_tukey(5);flattop"


It took almost two days..
| Params | AVG size | AVG ratio | AVG speed |
|---|---|---|---|
| -8 -per7 -A "subdivide_tukey(4);flattop" -j28 | 26 168 055 | 69,304% | 88,983x |
| -8 -er7 --lax -l15 -A "subdivide_tukey(4);flattop" -j28 | 26 153 386 | 69,265% | 306,023x |
| -8 -pr7 --lax -l15 -A "subdivide_tukey(4);flattop" -j28 | 26 140 688 | 69,231% | 212,109x |
| -8 -pr7 -A "subdivide_tukey(5);flattop" -j28 | 26 170 402 | 69,310% | 453,227x |

Also, I have found identical sizes for different parameters.

Re: FLAC v1.4.x Performance Tests

Reply #528
Yeah it takes time - and this is all with the inexact-rice, right? Using exact-rice would take even more - if you have the patience to do that for comparison? You could of course skip the one with "-pe".
(What you see now is that -p with the extra windowing function so nearly catches -pe.)

The observation that some files have identical sizes is not so strange, but I am a bit surprised at which ones.
Not surprising: 34276. One of these allows for prediction order up to 15 instead of merely 12, but it turns out it doesn't find that 13/14/15 improves compression.
More surprising: All those where -e gives the same as a -p with extra windowing. The latter allows more attempts, while the former brute-forces the fewer. So it must then be that it picks the right one by the cruder method (so that brute-forcing doesn't help) and that none of the extra windows is "the right one".

Note, the only ones that have - potentially - extra layers of complexity in the file (for decoding), are the -l15. The others don't work by "packing heavier with more complexity" - they work by
* making more attempts, each at similar complexity, and then cherrypicking the best,
and/or:
* spending more time fine-calculating which one is actually the best.
The "exact-rice" build does even more of that.


Re: FLAC v1.4.x Performance Tests

Reply #529
Don't know if you will take the CPU time to do that experiment, but here is one reason why brute-forcing the Rice parameter won't make too big waves. Look at https://ntrs.nasa.gov/api/citations/20100011224/downloads/20100011224.pdf , the second text, and the diagram on the second page.
The diagram shows the possible bounds for the best Rice parameter, given the mean of the (folded) residuals.
Example: if the average over the partition is slightly less than 11, you are around the tip of the arrow from the "Lower bound" text; lower bound there is 2, and upper bound is 4.
The difference upper bound minus lower bound will never exceed 2. And both the staircases are easy to compute.
If you choose according to the following rule:
* If the mean is < 2, choose 0 (i.e. unary coding!)
* Otherwise, stay 1 below the red staircase: from the mean µ, take its binary logarithm and round down to integer
- then that rule is mathematically guaranteed to produce a parameter that is no worse than "next to the best".

Meaning, if you choose that number, call it k: then given the mean µ, assumed to be >1 (otherwise k=0 is done deal):
* There is a chance you should have chosen k+1; that is, there is always some distribution with same mean µ such that k+1 would be better
* When the "roundoff error in k" is small (that is, the binary logarithm lb(µ) is "integer + very small positive"), there is a chance that you should have chosen k-1.

So if you want the best possible parameter for µ>1:
You always have to check "k+1".
When µ is between 4 and 5, you have to check k-1. Not for µ between 5 and 8=2^3. But between 8 and 11. Not between 11 and 16. But between 16 and 23. Not between 23 and 32. But between 32 and 47. Not between 47 and 64. But between 64 and 95. (If I got it right.)

I don't know how much the exact-rice build checks, and how much it speeds up by "refraining from checking values that will surely never be optimal".
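The rule above can be sketched in a few lines of Python. This is only an illustration of the bound, not what the flac encoder or the exact-rice build actually does. The function names and the sample partition are made up for the example, and the bit count assumes the usual Rice code layout: quotient in unary, one stop bit, then k remainder bits.

```python
def rice_bits(folded, k):
    """Bits needed to Rice-code non-negative integers with parameter k:
    unary quotient, one stop bit, and k remainder bits per value."""
    return sum((r >> k) + 1 + k for r in folded)


def rule_of_thumb_k(folded):
    """k = 0 if the mean is below 2, else floor(log2(mean)).
    floor(log2(mean)) equals the bit length of floor(mean) minus 1."""
    floor_mean = sum(folded) // len(folded)
    return 0 if floor_mean < 2 else floor_mean.bit_length() - 1


def best_k_near_rule(folded):
    """Check only k-1, k and k+1 around the rule-of-thumb value."""
    k = rule_of_thumb_k(folded)
    candidates = [c for c in (k - 1, k, k + 1) if c >= 0]
    return min(candidates, key=lambda c: rice_bits(folded, c))


def best_k_brute_force(folded, max_k=30):
    """Try every parameter from 0 to max_k (enough for values below 2**30)."""
    return min(range(max_k + 1), key=lambda c: rice_bits(folded, c))


# Hypothetical partition of folded residuals with mean 10.9 (slightly below 11).
partition = [0, 1, 3, 5, 8, 9, 12, 14, 20, 37]
print(rule_of_thumb_k(partition),
      best_k_near_rule(partition),
      best_k_brute_force(partition))
```

If the bound holds as described, the three-candidate search always reaches the same bit count as the brute-force search, which is why an exhaustive search has limited room to improve on a well-chosen estimate.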


Numbers indicate, starting out from the top:
* exact-rice appears to be "-p-alike" in cost/gains.
That was a stretch. If these timings are at all trustworthy, then
* Taking -8 as starting point, the best improvement per second is -p, which saves about 5 kB per extra second spent. That's fifty percent more than running exact-rice.
So the next choice to achieve better than -8 is -8p.
* Thus, taking -8p as starting point ... the best would likely be what I didn't test, say -A "subdivide_tukey(4)", one step up. But absent that:
* Exact-rice is, in terms of cost/gains, slightly better than going up to -A "subdivide_tukey(5);flattop": from -8p, it saves 0.9k/sec (vs 0.7). So marginal benefit per second just dropped quite a lot (as expected).
Say your next choice would be to forego the exact-rice build, because it is too inconvenient to have another switch. So you settled with 25 k bigger files. From then on, it is quite clear that the exact-rice build would be a better choice than -pe. But still the exact-rice won't save you more than approx 0.3k per extra second.

Bottom line: it is hardly worth it unless you are willing to go way past -8p.
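For anyone who wants to redo the arithmetic, here is a small sketch that recomputes the "kiB saved per extra second" figures from the sizes and process times in the table in reply #525. The dictionary keys are just shorthand labels chosen for this example; everything is on top of -8r7, and the numbers are copied from that table.

```python
# (compressed size in bytes, process time in seconds) from the table in reply #525
RUNS = {
    "inexact":        (1032516267, 68),
    "exact":          (1032260218, 141),
    "inexact -p":     (1031759702, 213),
    "inexact -p A5+": (1031432685, 664),
    "exact -p":       (1031406917, 600),
    "exact -p A5+":   (1031002120, 2023),
}


def kib_per_extra_second(start, end):
    """kiB saved per extra second of process time when moving from start to end."""
    size_start, sec_start = RUNS[start]
    size_end, sec_end = RUNS[end]
    return (size_start - size_end) / 1024 / (sec_end - sec_start)


for start, end in [
    ("inexact", "inexact -p"),
    ("inexact", "exact"),
    ("inexact -p", "exact -p"),
    ("inexact -p", "inexact -p A5+"),
    ("inexact -p A5+", "exact -p A5+"),
]:
    print(f"{start} -> {end}: {kib_per_extra_second(start, end):.1f} kiB/s")
```

This gives roughly 5.1 and 3.4 kiB/s from the -8 starting point, 0.9 and 0.7 from -8p, and 0.3 for the last step, in line with the figures quoted above.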

Re: FLAC v1.4.x Performance Tests

Reply #530
This is an AVX2, gcc 14.2.0, Exact Rice compile of current git if somebody wants to test.
Is troll-adiposity coming from feederism?
With 24bit music you can listen to silence much louder!


Re: FLAC v1.4.x Performance Tests

Reply #532
I can do that tomorrow. As always, compiling without asm optimization also plays a role.

Re: FLAC v1.4.x Performance Tests

Reply #533
Here we have current git, AVX2, gcc 14.2.0, Exact Rice and default. Versions with disable-asm-optimizations included.

Re: FLAC v1.4.x Performance Tests

Reply #534
No timing yet, but I pulled together some high resolution files to check if the exact Rice would fare any better or worse.

Differences at normal settings were < 0.005 percent, but I guess it is the heavy ones that are interesting, and then the savings are bigger here too. Ordered by size, the percentages are savings relative to "-8 as baseline", and relative to the one right above (that's a bit arbitrary, those numbers became small when you test many ...) Percent, not percentage points.

| bytes | wrt. -8 inex | wrt. previous | setting |
|---|---|---|---|
| 3398110588 | | | -8 inexact |
| 3397636700 | 0.014% | 0.014% | -8 exact |
| 3397403599 | 0.021% | 0.007% | -8r7 inexact |
| 3396894047 | 0.036% | 0.015% | -8r7 exact |
| 3395898734 | 0.065% | 0.029% | -8pr7 inexact |
| 3395258292 | 0.084% | 0.019% | -8pr7 exact, that saves 0.019 percent over inexact |
| 3394986818 | 0.092% | 0.008% | -8er7 inexact (-e beats -p only because hirez is weird) |
| 3394360387 | 0.110% | 0.018% | -8er7 exact |
I'm not doing -pe. It is known that -e isn't much needed on CDDA, but high resolution signals are much more a mixed bag, so:
-e improves 0.071 (=0.092-0.021) to 0.074 percent, that's by comparing -8er7 over -8r7
-p improves some 0.044 or 0.048 percent (that's -8pr7 over -8r7)
-r7 improves some 0.021 to 0.022
The "exact" build can improve 0.014 to 0.019 percent.
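The percentages can be recomputed from the byte counts alone. The sketch below does that with the sizes from the table above; it assumes savings are expressed relative to the larger of the two files being compared, which matches the posted figures after rounding.

```python
# (setting, total bytes) in size order, copied from the table above
SIZES = [
    ("-8 inexact",   3398110588),
    ("-8 exact",     3397636700),
    ("-8r7 inexact", 3397403599),
    ("-8r7 exact",   3396894047),
    ("-8pr7 inexact", 3395898734),
    ("-8pr7 exact",  3395258292),
    ("-8er7 inexact", 3394986818),
    ("-8er7 exact",  3394360387),
]


def percent_saved(smaller, larger):
    """Savings of `smaller` relative to `larger`, in percent (not percentage points)."""
    return 100.0 * (larger - smaller) / larger


baseline = SIZES[0][1]
for (name, size), (_, previous) in zip(SIZES[1:], SIZES):
    wrt_baseline = percent_saved(size, baseline)
    wrt_previous = percent_saved(size, previous)
    print(f"{name:14s} {wrt_baseline:.3f}%  {wrt_previous:.3f}%")
```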

Tiny numbers and likely very corpus-dependent. Here I took
* 26 min from Kimiko Ishizaka's free edition of Die Kunst der Fuge
* 27 min from HDTracks' 2022 sampler
* 25 min of four high resolution tracks from a couple of samplers from Aussie label Art As Catharsis (much post rock)
* 26 min from Cult of Luna: The Raging River
* 27 min from NIN: The Slip
* 26 min from Kayo Dot: Hubardo
* the full 18 minutes of The Tea Party: Tx20 EP (lots of intentional clipping on this one)
... and Anal Trump: That Makes Me Smart! The full 3 grindcore minutes: https://analtrump.bandcamp.com/album/that-makes-me-smart

Re: FLAC v1.4.x Performance Tests

Reply #535
@Porcus Yes it was inexact. Exact is in progress.

@Wombat thanx for your builds. They are really helpful.

Can you please tell me why all of your builds add an extra 7 bytes to every encoded file?
| In. Size | Out. Size | Compr. | Speed | Parameters | Binary | Version |
|---|---|---|---|---|---|---|
| 27 856 345 | 27 849 777 | 99,976% | 62,269 | -8 -epr8 -j16 | flac-win64-static-bed2e86b7cfb279b81a830595aa668d555b4ee57.exe | flac 1.4.3 |
| 27 856 345 | 27 842 039 | 99,949% | 28,187 | -8 -epr8 -j16 | git-aa414e46-AVX2-gcc14.2.0_exact_rice_no_asm.exe | flac git-aa414e46 20241229 |
| 27 856 345 | 27 842 039 | 99,949% | 28,632 | -8 -epr8 -j16 | git-aa414e46-AVX2-gcc14.2.0_exact_rice.exe | flac git-aa414e46 20241229 |
| 27 856 345 | 27 849 784 | 99,976% | 76,647 | -8 -epr8 -j16 | git-aa414e46-AVX2-gcc14.2.0_no_asm.exe | flac git-aa414e46 20241229 |
| 27 856 345 | 27 849 784 | 99,976% | 85,533 | -8 -epr8 -j16 | git-aa414e46-AVX2-gcc14.2.0.exe | flac git-aa414e46 20241229 |

Re: FLAC v1.4.x Performance Tests

Reply #536
It must be the git version naming, which is longer than the simple 1.4.3.
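A quick check of that explanation, as a sketch: it compares the length of the two version tokens reported in the "Version" column with the size difference seen between the 1.4.3 build and the default git build. It assumes that the vendor string stored in each file embeds this token and is otherwise equally long in both builds; that part is an assumption and was not verified against the files themselves.

```python
official_version = "1.4.3"        # reported as "flac 1.4.3"
git_version = "git-aa414e46"      # reported as "flac git-aa414e46 20241229"

# Extra bytes needed to store the longer version token.
extra_chars = len(git_version.encode("utf-8")) - len(official_version.encode("utf-8"))

# Output sizes reported for the same input and the same parameters.
size_official = 27_849_777
size_git = 27_849_784
observed_difference = size_git - size_official

print(extra_chars, observed_difference)
```

Both numbers come out as 7, so the longer version token is enough to account for the whole difference.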

Re: FLAC v1.4.x Performance Tests

Reply #537
Yeah. If you want to kill absolutely all metadata - including the vendor naming string, but everything else too - then
metaflac --remove-all --dont-use-padding
will get you an apple to an apple.

Re: FLAC v1.4.x Performance Tests

Reply #538
@hat3k in case you didn't follow all flac related threads over here: the versions with disable-asm-optimizations only did better with 16bit material but lose big time with hires stuff. Good for cdda related apps.

Re: FLAC v1.4.x Performance Tests

Reply #539
Looks like I haven't posted in this thread for more than a year. If there is no significant change in the encoding algorithm, the effect of -p is usually quite consistent among different materials. On the other hand, -e is kind of speculative unless you are familiar with the types of materials you are dealing with (e.g. fake hi-res, chiptune, test tones etc.)

Of course, for those who have already examined other settings like -b, -r, -l, -A or even -q, the playground is even bigger, but then the tweaks will become more and more content specific.

PS: to newcomers, I am not the OP of this thread, it was just a coincidence due to forum moderation, shortly after I found a bug in the CUETools.Flake encoder.

Re: FLAC v1.4.x Performance Tests

Reply #540
Here are the "exact" encoder results.

| Params | AVG Size | AVG Ratio | AVG Speed |
|---|---|---|---|
| -8 -per7 -A "subdivide_tukey(4);flattop" -j28 | 26 159 742 | 69,282% | 25,594x |
| -8 -er7 --lax -l15 -A "subdivide_tukey(4);flattop" -j28 | 26 146 322 | 69,246% | 145,857x |
| -8 -pr7 --lax -l15 -A "subdivide_tukey(4);flattop" -j28 | 26 133 794 | 69,213% | 150,357x |
| -8 -pr7 -A "subdivide_tukey(5);flattop" -j28 | 26 162 849 | 69,290% | 172,837x |




Here are the "inexact" encoder results (from my previous post):

| Params | AVG size | AVG ratio | AVG speed |
|---|---|---|---|
| -8 -per7 -A "subdivide_tukey(4);flattop" -j28 | 26 168 055 | 69,304% | 88,983x |
| -8 -er7 --lax -l15 -A "subdivide_tukey(4);flattop" -j28 | 26 153 386 | 69,265% | 306,023x |
| -8 -pr7 --lax -l15 -A "subdivide_tukey(4);flattop" -j28 | 26 140 688 | 69,231% | 212,109x |
| -8 -pr7 -A "subdivide_tukey(5);flattop" -j28 | 26 170 402 | 69,310% | 453,227x |




The versions with disable-asm-optimizations only did better with 16bit material but lose big time with hires stuff. Good for cdda related apps.
Yes, I've read it, but thanks anyway, also for the hint about the version info in the file. I will test your builds for sure.

Yeah. If you want to kill absolutely all metadata - including the vendor naming string, but everything else too - then
metaflac --remove-all --dont-use-padding
Nice idea. I will try to implement it in benchmark-H to achieve more accurate comparisons of the output file sizes.

Re: FLAC v1.4.x Performance Tests

Reply #541
Yeah. If you want to kill absolutely all metadata - including the vendor naming string, but everything else too - then
metaflac --remove-all --dont-use-padding
will get you an apple to an apple.
I’m a bit confused. The build from Wombat and the build from Git differ by 7 bytes. We suspect this is due to differing lengths of the VENDOR strings. How can we ensure that re-encoded FLAC files have identical sizes? I’m asking because, during automated comparisons, even a 7-byte difference can matter and might become decisive when determining the "smallest file size." Removing all metadata doesn’t seem like a valid approach, in my opinion, as it doesn’t reflect the actual compression ratio. I’m unsure how to resolve this for now.

Re: FLAC v1.4.x Performance Tests

Reply #542
Normally I suspect people use final versions to compress their files. For personal use you may compile versions with the wrong vendor string, of course.

Re: FLAC v1.4.x Performance Tests

Reply #543
It is easy to compensate. Make a one-sample file and compress with two versions. Read off difference. Correct the figures of one of the versions in your spreadsheet-or-whatever. Or both (comparing to an "official" build.)
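The compensation could look something like the sketch below. The helper names are made up for this example. The sizes and the 7-byte offset are the ones reported earlier in the thread; applying the same offset to the exact-rice build is an assumption, based on it reporting the same version string.

```python
def vendor_overhead(calibration_size, reference_calibration_size):
    """Overhead of one build relative to a reference build, read off from
    the same one-sample file encoded with both."""
    return calibration_size - reference_calibration_size


def rank_by_corrected_size(results, overhead):
    """Subtract each build's overhead, then sort from smallest to largest."""
    corrected = [(name, size - overhead.get(name, 0)) for name, size in results]
    return sorted(corrected, key=lambda item: item[1])


# Output sizes for the same input file, as posted in the thread.
results = [
    ("flac 1.4.3", 27_849_777),
    ("git-aa414e46", 27_849_784),
    ("git-aa414e46 exact_rice", 27_842_039),
]

# 7 bytes is the offset observed between the 1.4.3 and git builds.
overhead = {"git-aa414e46": 7, "git-aa414e46 exact_rice": 7}

for name, size in rank_by_corrected_size(results, overhead):
    print(f"{size:>12,d}  {name}")
```

After the correction, the default git build and 1.4.3 tie on this file, so only the exact-rice build shows a real compression difference.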

When comparing across codecs - or more precisely: across compressors with their file formats and features - then things become even more delicate.
* The reference flac executable will by default add padding. That has nothing to do with how well it compresses audio. But most of us want padding, to be able to do simple tag updates without writing the full thing. So to compare "real" .flac size with .wv size, there should be padding; WavPack doesn't need it, as the tags are at the end (and then it will just "write on at the end" if a tag change requires a bigger file).
And some formats can accept a tag at the beginning or end. Like MP3 ...
* Then flac does by default discard non-audio chunks - but can be forced to store them. ALAC and TTA have no provision to store those chunks. WavPack and TAK will store by default, but can discard. Monkey's-the-format doesn't require them, but as far as I understand, Monkey's-the-encoder will insist on keeping unless it is fed raw audio by pipe (I think OptimFROG does the same).
One file I downloaded from the artist: AIFF, with a picture of several MB, all black but high resolution. The "file compressors" will store that (and not try to compress it!). Is that fair when comparing compression? You can argue that this is what you get, so yes; but then on the other hand, if that is what you want, you shouldn't compare WavPack with TTA at all, as TTA is not useful, so they should be compared at what they both can do (compress audio).
And, WavPack can --import-id3 ... that gives you double up of the tags. Sure yes if that is what you want.

Easiest is to take a clean minimal-headered .wav. But even then, you "have to" take a stand on FLAC: using --no-padding to get its audio compression figures, or using its default operation? (Which better represents what the average user will (1) do, and (2) want?) Of course, eight kilobytes isn't much if your files are big (say one per CD image), and if it is already well known that file size is the all-important priority, then you don't go for FLAC.
At least not subset.