Sep 24, 2022, 23:57 by d...@lynne.ee:

> Sep 24, 2022, 21:40 by mar...@martin.st:
>
>> What about ac3dsp then - that one seems like it's fairly optimized for arm?
>>
> Haven't touched them, they're still being used. Unfortunately, for AC3,
> the full MDCT optimizations in lavc do make a difference and the overall
> decoder becomes 15% slower with this patch on for aarch64 with lavu/tx's
> asm disabled and 7% slower with lavu/tx's asm enabled. I do plan to write
> aarch64 MDCT NEON SIMD code in a month or so, unless someone is faster,
> which should make the decoder at least 10% faster with lavu/tx.
>

I'd just like to add that this was for the float version of the AC3 decoder.
The fixed-point version is a few percent faster with the patch on an A53,
and quite a bit more accurate.
The lavc fixed-point FFT code also has some weird large spikes in cycle
count for some transform sizes, so the figure above is an average. The dips
went from 117x realtime to 78x realtime, which on a slower CPU may
be the difference between stuttering and realtime playback.
On this CPU, the fixed-point version is 23% slower than the float version,
but on a CPU with slower float ops, it would make more sense to pick the
fixed-point decoder over the float one.
The two decoders produce nearly identical results, minus a few rounding
errors, since AC3 is inherently a fixed-point codec. The only differences
are the transforms themselves and the extra ops needed to convert
the 25-bit ints to floats in the float decoder.
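
As a rough illustration of that last conversion step, here is a minimal
sketch. It is not the actual lavc code: the helper name and the scaling are
assumptions. It assumes the coefficients are signed 25-bit values held in
int32_t, so in [-2^24, 2^24), and that they should map onto [-1.0, 1.0).

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: signed 25-bit coefficients, i.e. 24 fractional bits. */
#define COEFF_FRAC_BITS 24

/* Hypothetical helper (not an FFmpeg function): scale each fixed-point
 * coefficient by 2^-24 so that the result lies in [-1.0, 1.0). */
static void coeffs_int_to_float(float *dst, const int32_t *src, size_t len)
{
    const float scale = 1.0f / (float)(1 << COEFF_FRAC_BITS);

    for (size_t i = 0; i < len; i++)
        dst[i] = (float)src[i] * scale;
}
```

Under that assumption the conversion itself loses nothing: a 32-bit float
represents every integer of magnitude up to 2^24 exactly, and multiplying by
a power of two only changes the exponent. That would leave the transforms as
the place where the two decoders' rounding can diverge.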
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

