Lynne:
> Sep 25, 2022, 14:34 by andreas.rheinha...@outlook.com:
>
>> Lynne:
>>
>>> Sep 24, 2022, 23:57 by d...@lynne.ee:
>>>
>>>> Sep 24, 2022, 21:40 by mar...@martin.st:
>>>>
>>>>> What about ac3dsp then - that one seems like it's fairly optimized for
>>>>> arm?
>>>>>
>>>> Haven't touched them, they're still being used. Unfortunately, for AC3,
>>>> the full MDCT optimizations in lavc do make a difference and the overall
>>>> decoder becomes 15% slower with this patch on for aarch64 with lavu/tx's
>>>> asm disabled and 7% slower with lavu/tx's asm enabled. I do plan to write
>>>> aarch64 MDCT NEON SIMD code in a month or so, unless someone is faster,
>>>> which should make the decoder at least 10% faster with lavu/tx.
>>>>
>>>
>>> I'd just like to add this was for the float version of the ac3 decoder.
>>> The fixed-point version is a few percent faster with the patch on an A53,
>>> and quite a bit more accurate.
>>> The lavc fixed-point FFT code also has some weird large spikes in #cycles
>>> for some transform sizes, so the figure above is an average, but the dips
>>> went from 117x realtime to 78x realtime, which on a slower CPU may
>>> be the difference between stuttering and realtime playback.
>>> On this CPU, the fixed-point version is 23% slower than the float version,
>>> but on a CPU with slower float ops, it would make more sense to pick that
>>> decoder up than the float version.
>>> The 2 decoders produce nearly identical results, minus a few rounding
>>> errors, since AC3 is inherently a fixed-point codec. The only differences
>>> are the transforms themselves, and the extra ops needed to convert
>>> the 25bit ints to floats in the float decoder.
>>>
>>
>> 1. You forgot to remove mdct15 requirements from configure in this whole
>> patchset.
>> 2. You forgot to update the FATE references for several tests; e.g. when
>> only applying the ac3 patch, then I get this:
>>
>
> I know. durandal pointed it out the day I sent them. I'll send them again
> later.
> I'm planning to just push the Opus patch in a day with the mdct15
> line in configure gone.
>
>
>> As the above shows, the difference between the reference files and the
>> decoded output becomes larger in several tests, i.e. the reference files
>> won't be usable later on. If the new float and fixed-point decoders
>> indeed produce nearly identical output, then one could write
>> tests that decode the same file with both the floating point and the
>> fixed point decoder, check that both are nearly identical and print a
>> checksum of the output of the fixed point decoder.
>>
>
> I have a standalone program I've hacked on as I need to for the fixed-point
> transforms: https://0x0.st/oWxO.c
> The square root of the squared rounding error across the entire range
> (1 to 21 bits) of transforms from 32pt to 1024pt is 6.855655 for lavu and
> 7.141428 for lavc, which is slightly worse. If you extend the range
> to 22bits, the 1024pt transform in lavc explodes, while lavu is still fine,
> thus showing a greater range.
> The rounding errors are a lesser problem than hitting the max range,
> because then you get huge spikes in the output.
> I can further reduce the error in lavu at the cost of speed, but I think
> this is sufficient.
>
>
>> Also note that there is currently no test that directly verifies your
>> claims of greater accuracy. One could write such a test by encoding a
>> file with ac3-fixed and decoding it again (with the fixed point decoder)
>> and printing the psnr of input and output. No encoding test does this
>> at the moment.
>>
>
> I'm not writing that, but I like the idea. The point of fixed-point decoders
> isn't bitexactness, but speed on slow hardware, so we shouldn't be testing
> an MD5.

Are your fixed-point transforms bitexact across all arches/cpuflags?

- Andreas