Topic: Multithreading (Read 80873 times)

Re: Multithreading

Reply #125
I don't really know.

This is the first time I've dealt with dlopen (used in the static OMP library).

At least the executable is working properly :)

Re: Multithreading

Reply #126
  • Probably not worth it
  • --mode peakset --blocksize-list 576,1152,1728,2304,2880,3456,4032,4608 --analysis-comp 8p --output-comp 8ep --queue 8192 --tweak 1 --merge 0
@cid42 so I tried really, really slow settings for fun on a 6-second file (attached).

Several times I got an "Error: Init failed" (this error is defined in common.c in flaccid) when using slow settings.
For example if I use this command:
Code: [Select]
flaccid --in 12_-_Napalm_Death_-_You_Suffer.flac --lax --out out.flac --preserve-flac-metadata --queue 8192 --workers 13 --tweak 1 --merge 0 --analysis-apod subdivide_tukey\(3\) --output-apod subdivide_tukey\(6\) --analysis-comp mepl32r15 --output-comp mepl32r15 --mode peakset --blocksize-list 256,512
Ideally I would use more blocksizes in the list (something like `seq -s, 256 256 5120`), but I shortened it to get the error faster (exact same output).
Here is the output:
Code: [Select]
(null)
Processed 1/3
Processed 3/3
Error: Init failed
With that same command, if I limit myself to fewer blocksizes (and a bigger step) with `seq -s, 512 512 5120`, it works completely fine and the output is created in 155 seconds (not too long if you want to test).

So I guessed flaccid only works with blocksizes >= 512. I then tried slightly different settings (only the `--queue` argument removed):
Code: [Select]
flaccid --in 12_-_Napalm_Death_-_You_Suffer.flac --lax --out out.flac --preserve-flac-metadata --workers 13 --tweak 1 --merge 0 --analysis-apod subdivide_tukey\(3\) --output-apod subdivide_tukey\(6\) --analysis-comp mepl32r15 --output-comp mepl32r15 --mode peakset --blocksize-list $(seq -s, 256 256 512)
This still includes a blocksize of 256 and still ends in "Error: Init failed", but the output differs:
Code: [Select]
(null)
Processed 1/3
Processed 3/3
tweak(1) saved 20 bytes with 6 tweaks
# ~100 lines like this; then
Error: Init failed
I got this error after 73 seconds. The output FLAC file was almost complete: 50 ms shorter than the input (only one frame missing?), MD5 not computed, and 2 placeholders in the seektable.


You surely guessed it, I wrote this message in the hope that the error can be fixed.
I also have a question: why is it needed to have all the blocksizes (to bruteforce) multiples of the first one of the list ?
It is surely easier to implement and parallelize, but I find it a waste of time when dealing with blocksize steps of 32, 64 or 128 (since blocksizes from 32 to ~1024 are completely useless to bruteforce).


Many thanks for your software and time.


Re: Multithreading

Reply #127
I'm seeing the same thing here. I did some limited testing. In combination with the --lax option, I can reproduce it if the blocksize list has any combination that can equal 256. If the list has values smaller or larger than 256 but nothing that equals 256, there's no error.







Re: Multithreading

Reply #128
You surely guessed it, I wrote this message in the hope that the error can be fixed.
Thank you for making it easy by providing the exact file and settings to reproduce; the latest commit should now work. It looks to be the small (4-sample) partial frame at the end that fails, probably because the strong lax settings provided can't technically create valid output with such a small frame. I've added a catch-all to the init_static_encoder function that detects when initialisation fails and tries again with less aggressive settings (presumably ./flac does something similar, or the different way it handles the last frame means it doesn't encounter the problem). Not the most elegant solution, but it should catch edge cases including this one. I can't guarantee that all edge cases now work, or even that there isn't an underlying problem yet to resolve, as this catch-all may just be masking it.
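The catch-all described above is essentially a retry-with-fallback pattern. Here is a minimal Python sketch of the idea. It is not flaccid's actual C code: `init_encoder`, the three settings names, and the rule that tiny frames reject aggressive settings are all hypothetical stand-ins.

```python
def init_encoder(settings, frame_len):
    """Hypothetical stand-in for encoder initialisation.

    Pretends that "aggressive" settings need at least 32 samples; the real
    constraint inside libFLAC may be quite different.
    """
    if settings == "aggressive" and frame_len < 32:
        raise ValueError("Init failed")
    return {"settings": settings, "frame_len": frame_len}


def init_with_fallback(frame_len, candidates=("aggressive", "moderate", "minimal")):
    """Try each settings level in order and return the first that initialises."""
    last_error = None
    for settings in candidates:
        try:
            return init_encoder(settings, frame_len)
        except ValueError as err:
            last_error = err  # remember the failure, then try weaker settings
    raise RuntimeError("no settings could initialise the encoder") from last_error


# The 4-sample partial frame at the end of the file falls back one level.
enc = init_with_fallback(frame_len=4)
print(enc["settings"])
```

As noted in the post, a fallback like this can hide a genuine bug, so logging each failed attempt would be a sensible addition.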

I also have a question: why is it needed to have all the blocksizes (to bruteforce) multiples of the first one of the list ?
It means that regardless of the chosen blocks in a given chain, they always start and end on a discrete boundary that's a multiple of the smallest blocksize. Unconstrained bruteforce is not feasible, and an intelligent algorithm that competes well has not been found; peakset is a reasonable compromise.
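To illustrate why the multiple-of-smallest constraint keeps the search tractable: every frame then starts on a grid point, so the best chain up to each grid point can be computed once and reused. A rough Python sketch of that idea (a shortest-path pass over the grid) follows. It is not flaccid's actual peakset code, and `toy_cost` is a made-up stand-in for really encoding a frame and measuring its size.

```python
def best_chain(total_samples, blocksizes, frame_cost):
    """Cheapest sequence of blocksizes covering total_samples.

    Assumes every blocksize is a multiple of the smallest one and that
    total_samples is too, so all frame boundaries sit on a grid with
    step min(blocksizes).
    """
    step = min(blocksizes)
    assert all(b % step == 0 for b in blocksizes)
    assert total_samples % step == 0

    # best[pos] = (cost, chain) of the cheapest chain ending exactly at pos
    best = {0: (0, [])}
    for pos in range(step, total_samples + 1, step):
        options = []
        for b in blocksizes:
            start = pos - b
            if start in best:
                cost, chain = best[start]
                options.append((cost + frame_cost(start, b), chain + [b]))
        if options:
            best[pos] = min(options)
    return best[total_samples]


def toy_cost(start, size):
    """Hypothetical cost: per-frame overhead plus a size term, with a penalty
    if the frame straddles a pretend transient at sample 1024."""
    straddles = start < 1024 < start + size
    return 10 + size // 64 + (50 if straddles else 0)


cost, chain = best_chain(2048, [512, 1024, 2048], toy_cost)
print(cost, chain)
```

Without the shared grid, a frame could start at any sample offset and the number of states to track would grow with the sample count rather than with the number of grid points.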

Re: Multithreading

Reply #129
Thanks for the quick update!

It indeed fixes the issue.

I built a static executable once again. Same specs as before: Linux x86-64, generic, static, PGO LTO optimized (O3 flto fprofile-instr-gen), stripped, UPX'd (compressed with -9 --ultra-brute).

Re: Multithreading

Reply #130
Built a static Linux 64-bit binary using libFLAC git-31ccd3df.   Targeted for AVX2 capable CPUs.  Also tried my hand at a static Win64 build.

Dear @Replica9000, if possible, consider building a Windows version without the need for AVX instructions.


• Join our efforts to make Helix MP3 encoder great again
• Opus complexity & qAAC dependence on Apple is an aberration from Vorbis & Musepack breakthroughs
• Let's pray that D. Bryant improve WavPack hybrid, C. Helmrich update FSLAC, M. van Beurden teach FLAC to handle non-audio data

Re: Multithreading

Reply #131
My test setup is Win 7 x64 in a VM, so I'm not sure how well these perform on real hardware. I included 32-bit and 64-bit, with an asm/noasm version of each. On my Linux PC, the noasm builds perform faster for 16-bit files, but the opposite seemed to be the case in the VM. Flaccid only supports 16-bit, but the couple of 24-bit files I tried encoded fine.


Re: Multithreading

Reply #132
Flaccid only supports 16-bit wav/raw input, as the wav formats are many and I never quite got the wav library working. But it should support arbitrary flac input (iirc), so a hack to encode any supportable wav would be to pipe/convert to flac first. Not ideal but workable. Might revisit when flac gets a major update, for now just busy with other things.

Re: Multithreading

Reply #133
@Replica9000, unfortunately, binaries from your archive crash upon launch on my end.
Are you sure they do not need SSE4 or something else that I might not have?


Re: Multithreading

Reply #134
@cid42 The files I used were indeed 24-bit FLAC. I read the GitHub page a while back and remembered something about 16-bit only. I forgot that it applied only to wav/raw PCM.

....some brave soul can convert their entire collection if they wish. I recommend checking the result with flac -t before you delete the originals, this is still alpha you know ;)
A while back I started recompressing my library using more aggressive settings, updating tags and embedded album art. About halfway through I started using flaccid. I've used it on thousands of tracks with no issues. :)



@Kraeved I had used a generic CPU flag.  I recompiled these binaries with the core2 flag.


Re: Multithreading

Reply #135
Thank you for the opportunity to get acquainted with Flaccid.

* hyperfine.exe --warmup 3 --runs 3 "commands"
* flac.exe x64 20240309, flake.exe x86 from CUETools 2.2.5
* in.wav, 44100 Hz 16 bit stereo, 99 423 788 bytes

Code: [Select]
  Command                                                           Mean time [s]        
 ---------------------------------------------------------------- ----------------
  flac -7 -f in.wav -o out.flac                                     8.716 ± 0.047       
  flaccid_w64 --preset 7 --in in.wav --out out.flac                 9.208 ± 0.176       
  flake -7 -f in.wav -o out.flac                                   10.496 ± 0.182      
  flaccid_w64_noasm --preset 7 --in in.wav --out out.flac          11.274 ± 0.041      
  flaccid_w32 --preset 7 --in ind.wav --out out.flac               16.057 ± 0.038      
  flaccid_w32_noasm --preset 7 --in ind.wav --out out.flac         17.986 ± 0.184 

Re: Multithreading

Reply #136
It looks like the binaries without asm optimizations are slower for you too.  Maybe it's something to do with Windows or the CPU flag used. 

On my system:
Code: [Select]
./flac -8p in.wav = 53.615s
./flac_noasm -8p in.wav = 44.648s
./flaccid --preset 8p --in in.wav --out out.wav = 53.606s
./flaccid_noasm --preset 8p --in in.wav --out out.wav = 44.261s

Edit: Binaries compiled with the core2 CPU flag. Maybe disabling asm optimizations is only beneficial with AVX or better.
Code: [Select]
./flac_core2 -8p in.wav = 55.375s
./flac_core2_noasm -8p in.wav = 2m23.837s
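To put the timings above on one scale, here is a small Python sketch that parses the two time formats used in these listings and computes the noasm/asm ratio. The dictionary keys simply mirror the binary names in the post; nothing here is measured anew.

```python
def to_seconds(text):
    """Convert a timing such as '53.615s' or '2m23.837s' to seconds."""
    text = text.rstrip("s")
    minutes, _, seconds = text.rpartition("m")
    return (int(minutes) if minutes else 0) * 60 + float(seconds)


timings = {
    "flac": "53.615s",
    "flac_noasm": "44.648s",
    "flac_core2": "55.375s",
    "flac_core2_noasm": "2m23.837s",
}

# Ratio > 1 means the noasm build was slower than the asm build.
generic_ratio = to_seconds(timings["flac_noasm"]) / to_seconds(timings["flac"])
core2_ratio = to_seconds(timings["flac_core2_noasm"]) / to_seconds(timings["flac_core2"])
print(f"generic build: {generic_ratio:.2f}x, core2 build: {core2_ratio:.2f}x")
```

On these numbers the noasm binary is faster in the first build and markedly slower in the core2 build.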


 

Re: Multithreading

Reply #138
Quote
S:\_Test>"C:\Program Files\flac\flac.exe" -d "02 - Naked Love.flac" -c -s | "C:\Program Files\flac\flaccid_w64.exe"  --preset 8ep --workers 8 --in - --input-format wav --out test.flac
Error: Currently piping not supported for wav input
Can it be implemented, please? I want to use flaccid in foobar converter with piping.

Re: Multithreading

Reply #139
Possibly. Here's the note I left for myself in the code:
Code: [Select]
if(strcmp(path, "-")==0){
	//drwav doesn't seem to have convenient FILE* functions
	//so something might have to be done with raw memory
	_("Currently piping not supported for wav input");//TODO
}
Wav support is flaky in general; even if piping gets added, non-16-bit wav support needs work. I'll make some time in the next few days to give it another go, but no promises. Current me is only slightly smarter than past me, or, let's be honest, probably slightly dumber.

If you're only dealing with CDDA, a workaround that should work with the current binary is to pipe the raw samples: use --input-format cdda in flaccid, --force-raw-format in ./flac, and whatever the equivalent is in foobar.

Alternatively, it's a hack but it works for all input ./flac supports, pipe from foobar to ./flac to ./flaccid, this way ./flac deals with the format mess for free and flaccid gets fed flac files which are fully supported. ./flac can use -0 which uses a rounding error of resources compared to flaccid's brute forcing.

(Could the flac pipe hack be built into flaccid to support wav input without drwav? libflac is already a dependency and it might be the path of least resistance)

Re: Multithreading

Reply #140
Alternatively, it's a hack but it works for all input ./flac supports, pipe from foobar to ./flac to ./flaccid, this way ./flac deals with the format mess for free and flaccid gets fed flac files which are fully supported. ./flac can use -0 which uses a rounding error of resources compared to flaccid's brute forcing.
Thanks a lot! It solved the problem.

Re: Multithreading

Reply #141
This works with Bash on Linux
Code: [Select]
flaccid --preset 8 --input-format cdda --in <(flac -c -s -d Foreward.flac) --out Foreward2.flac
This method doesn't preserve any tags though.


Re: Multithreading

Reply #143
Might revisit when flac gets a major update, for now just busy with other things.
Understandable. You still there?

(... hm, this thread has had a misleading title since reply 12.)
Boo. I'm here right now but don't intend to work on this tool; it was just a hacky proof of concept to scratch an itch that I got a little too enthusiastic about. It might get updated to the latest libflac if I'm around long enough.

Re: Multithreading

Reply #144
As I just posted over at https://hydrogenaudio.org/index.php/topic,106129.msg1062688.html#msg1062688 , I ran flaccid against flacout, and it won big time without even digging into apodization functions more than "-8" already does. Limited corpus, since 76 minutes of audio took flacout 44 hours to process.

I gave a "suggestion" in that post:
flaccid as a re-compressor for flac files, where you frame by frame compare source to flaccid and if flaccid cannot improve the frame, then just copy it? (Sure you lose some bits in the frame header by committing to variable block size, but if we really want to nitpick on that, let's just compare the audio part of the final files.)
Doesn't have to take that much development work ... ?
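The suggestion can be sketched in a few lines of Python, assuming both encodes are already split into frames that cover identical sample ranges (the hard part in practice, since flaccid may choose different boundaries). The byte strings below are dummies, not real FLAC frames, and `pick_smaller_frames` is a hypothetical helper name.

```python
def pick_smaller_frames(source_frames, recompressed_frames):
    """For each aligned pair of frames, keep whichever encoding is shorter.

    Ties go to the source frame, so frames that cannot be improved are
    copied through unchanged.
    """
    if len(source_frames) != len(recompressed_frames):
        raise ValueError("frame lists must be aligned one-to-one")
    return [
        src if len(src) <= len(new) else new
        for src, new in zip(source_frames, recompressed_frames)
    ]


source = [b"\x00" * 100, b"\x00" * 80, b"\x00" * 120]
recompressed = [b"\x01" * 90, b"\x01" * 85, b"\x01" * 120]

result = pick_smaller_frames(source, recompressed)
saved = sum(map(len, source)) - sum(map(len, result))
print(f"saved {saved} bytes")
```

By construction the result is never larger than the source, which is the guarantee the suggestion is after.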

Re: Multithreading

Reply #145
Ran a few settings on my full 38 CDs over the weekend to check for "not-outrageously-expensive" improvements over -7/-8, if nothing else to get an idea of how they compare against the Exact Rice build, and on a different computer than my usual one. Not rigorously timed.

TL;DR: flaccid beat -8p soundly. The following setting was only a little bit slower, and its 0.22 percent improvement over stock -8 is three times what -8p could show for itself.
--mode gasc --blocksize-limit-lower 2304 --tweak 64 --queue 16 --merge 0 --analysis-comp 6 --output-comp 8

(But, hrmph, flaccid chokes on non-ASCII filenames.)


So, here is what I ran, presented as kB saved per extra second of encoding time, to show which fruits hang lowest. Stock 1.5.0 first:
-5 as "baseline": took 6 minutes, 12 032 megabytes.
-7: took 143 more seconds, 11 974 megabytes. That means -7 saved 407 kB per extra second of run time, relative to -5.
-8: relative to -7, it saved 41 kB per extra second of run time. So the returns are already diminishing quickly.

Where to go from -8? The above flaccid in "gasc" mode.
15 kB per extra second over reference -8, against -8p's measly 6.
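For anyone wanting to reproduce the metric, here is the arithmetic as a short Python sketch, using the -5 and -7 figures quoted above. It assumes 1 MB = 1000 kB. With the sizes rounded to whole megabytes it gives about 406 kB per extra second, close to the 407 quoted, which was presumably computed from unrounded sizes.

```python
def kb_saved_per_extra_second(size_before_mb, size_after_mb, extra_seconds):
    """Kilobytes saved for each additional second of encoding time."""
    saved_kb = (size_before_mb - size_after_mb) * 1000  # assumes 1 MB = 1000 kB
    return saved_kb / extra_seconds


# -5 (12 032 MB) versus -7 (11 974 MB, 143 seconds slower)
rate = kb_saved_per_extra_second(12032, 11974, 143)
print(f"{rate:.0f} kB per extra second")
```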


Also tried:
  • @Wombat's "Exact Rice" build. At -8 it saved 0.02 percent over the Xiph build. That did amount to a not-at-all-bad 13 kB per extra second
  • -8 -A "subdivide_tukey(4)", beefing that one up. About the same as the Exact Rice build, both in time and size (loses narrowly to it due to the classical music). I did not do that setting with the Exact Rice.
  • -8r7, -8pr7 in the Xiph build and the "Exact Rice" build: surprisingly little. Especially surprising on the heavier corpus, where that fine partitioning is often used, yet not much was saved by it. Hardly any difference for classical music. Overall only 3 kB/s saved at -8 (Xiph), and much less at -8p and with Exact Rice.
  • Any synergies between Exact Rice and -p? No. Even less than the previous line. Exact Rice's -8p is slow.
  • Heavier flaccid: 40 percent more time than the one above, did improve but very little.  --mode peakset --blocksize-list 2304,4608 --analysis-comp 5 --output-comp 8r7  --queue 32 --tweak 32 --merge 0
  • Lighter flaccid: just a little bit faster than the gasc above, and not far worse, was --mode chunk --blocksize-list 2304,4608 --tweak 0 --merge 0 --analysis-comp 8 --output-comp 8r7



Re: Multithreading

Reply #147
As I just posted over at https://hydrogenaudio.org/index.php/topic,106129.msg1062688.html#msg1062688 , I ran flaccid against flacout, and it won big time without even digging into apodization functions more than "-8" already does. Limited corpus, since 76 minutes of audio took flacout 44 hours to process.

I gave a "suggestion" in that post:
flaccid as a re-compressor for flac files, where you frame by frame compare source to flaccid and if flaccid cannot improve the frame, then just copy it? (Sure you lose some bits in the frame header by committing to variable block size, but if we really want to nitpick on that, let's just compare the audio part of the final files.)
Doesn't have to take that much development work ... ?

You're suggesting a recompressor that works within the boundaries of the frames in the source? So a 4096 fixed frame could be sliced into arbitrarily small frames or combined with adjacent frames, but if none of that is an improvement, just output the (slightly modified) original frame. I mean that could work, but I don't see a whole lot to gain. Unless the source used very strong settings that flaccid isn't, the source frame is unlikely to be smaller. Unless the point is to not recalculate the 4096 frame size at all, which would save a little computation. Flac is a simple enough format that, with libflac doing the grunt work, you could knock out this sort of thing in a day; converting a fixed frame to a variable frame is a mild pain as you need to redo the frame number and CRCs, but it's not hard.
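On the CRC part: per the FLAC format, a frame header ends with a CRC-8 (polynomial x^8 + x^2 + x + 1, initial value 0) and the whole frame ends with a CRC-16 (polynomial x^16 + x^15 + x^2 + 1, initial value 0), so a rewritten header means both have to be recomputed. Below is a bitwise Python sketch of the two checksums, for illustration only. libFLAC has its own implementations, and the helper names here are made up.

```python
def crc_msb_first(data, poly, width):
    """Bitwise MSB-first CRC with initial value 0 and no final XOR."""
    top_bit = 1 << (width - 1)
    mask = (1 << width) - 1
    crc = 0
    for byte in data:
        crc ^= byte << (width - 8)  # feed the next byte into the top bits
        for _ in range(8):
            if crc & top_bit:
                crc = ((crc << 1) ^ poly) & mask
            else:
                crc = (crc << 1) & mask
    return crc


def flac_crc8(header_bytes):
    """CRC-8 over the frame header bytes that precede the CRC field."""
    return crc_msb_first(header_bytes, 0x07, 8)


def flac_crc16(frame_bytes):
    """CRC-16 over the whole frame up to, but excluding, the CRC field."""
    return crc_msb_first(frame_bytes, 0x8005, 16)


# Standard CRC check string, used here instead of a real frame.
check = b"123456789"
print(hex(flac_crc8(check)), hex(flac_crc16(check)))
```

The header has to be finalised first, since the CRC-8 byte is itself part of the data covered by the CRC-16.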

flaccid chokes on non-ascii filenames, by that do you mean some windows utf16 nonsense? That wouldn't surprise me. Can't say I've tried utf8 on Linux, that has a better chance of working but it might not.

Re: Multithreading

Reply #148
Unless the source used very strong settings that flaccid isn't, the source frame is unlikely to be smaller.
OK! But yeah I was thinking that ...


flaccid chokes on non-ascii filenames, by that do you mean some windows utf16 nonsense?
The µ character of ISO8859-1 / -15  is bad enough. But maybe it is the Windows codepage.

Re: Multithreading

Reply #149
It's probably codepage stuff, flaccid just uses fopen so this applies: https://stackoverflow.com/questions/396567/is-there-a-standard-way-to-do-an-fopen-with-a-unicode-string-file-path
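To make the code page point concrete, here is a small Python sketch showing that the µ character has different byte representations depending on the encoding, so a byte-oriented fopen that reads the path under a different code page than the one it was written in will look for a different name. The filename is hypothetical, and which code pages were actually involved in the report above is not known.

```python
name = "5 µm.flac"  # hypothetical filename containing the micro sign

encodings = ["latin-1", "cp1252", "cp437", "utf-8"]
encoded = {enc: name.encode(enc) for enc in encodings}

for enc, raw in encoded.items():
    print(f"{enc:8} {raw!r}")

# Bytes produced under one encoding but interpreted under another
# yield a different filename, so the file is not found.
misread = encoded["utf-8"].decode("cp1252")
print(misread)
```

Latin-1 and cp1252 agree on µ (byte 0xB5), while cp437 and UTF-8 each use something else, which is consistent with a single ISO 8859-1 character being enough to trigger the problem.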

MS trying to give C interoperability the old yeller treatment decades ago really was a dick move, all this _wfopen nonsense and a million other examples. Zero respect MS, zero. The day windows uses the Linux kernel as the main kernel and an NT kernel as the one they bolt on with LSW for compatibility, that will be a good day.