Sep 24, 2022, 23:57 by d...@lynne.ee:

> Sep 24, 2022, 21:40 by mar...@martin.st:
>
>> What about ac3dsp then - that one seems like it's fairly optimized for arm?
>>
> Haven't touched them, they're still being used. Unfortunately, for AC3,
> the full MDCT optimizations in lavc do make a difference and the overall
> decoder becomes 15% slower with this patch on for aarch64 with lavu/tx's
> asm disabled and 7% slower with lavu/tx's asm enabled. I do plan to write
> an aarch64 MDCT NEON SIMD code in a month or so, unless someone is faster,
> which should make the decoder at least 10% faster with lavu/tx.
>
I'd just like to add that this was for the float version of the AC3 decoder.
The fixed-point version is a few percent faster with the patch on an A53, and
quite a bit more accurate. The lavc fixed-point FFT code also has some weird
large spikes in cycle count for some transform sizes, so the figure above is
an average, but the dips went from 117x realtime to 78x realtime, which on a
slower CPU may be the difference between stuttering and realtime playback.

On this CPU, the fixed-point version is 23% slower than the float version,
but on a CPU with slower float ops, it would make more sense to pick the
fixed-point decoder over the float one. The two decoders produce nearly
identical results, minus a few rounding errors, since AC3 is inherently a
fixed-point codec. The only differences are the transforms themselves and the
extra ops needed to convert the 25-bit ints to floats in the float decoder.
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel