This function is tested by fate-filter-fps-r. I have also added a checkasm test and bench.

I have done a lot more testing and benchmarking of this code, and I am now
happy to activate the AVX2 version because the performance is so good. On my
machine I get the following results for filter size 4 and offset 0; the
results for all other sizes and offsets are similar:

yuv2yuvX_4_0_mmx:     1567.2   1563.1
yuv2yuvX_4_0_mmxext:  1560.7   1560.1
yuv2yuvX_4_0_sse3:     780.7    572.1   -26.7%
yuv2yuvX_4_0_avx2:       n/a    341.1   -56.3%

Interestingly, I discovered that the non-temporal store movntdq results in
very large variability in the test results; in many cases it significantly
increases the execution time. I have replaced these stores with aligned
stores, which stabilised the runtimes. However, I am aware that benchmarks
often don't represent reality and that these non-temporal stores were
probably used for a good reason. If you think it better to use NT stores, I
will switch back to them.

On Fri, Dec 4, 2020 at 2:00 PM Anton Khirnov <an...@khirnov.net> wrote:

> Quoting Alan Kelly (2020-11-19 09:41:56)
> > ---
> >  All of Henrik's suggestions have been implemented. Additionally,
> >  m3 and m6 are permuted in avx2 before storing to ensure bit by bit
> >  identical results in avx2.
> >  libswscale/x86/Makefile     |   1 +
> >  libswscale/x86/swscale.c    |  75 +++--------------------
> >  libswscale/x86/yuv2yuvX.asm | 118 ++++++++++++++++++++++++++++++++++++
> >  3 files changed, 129 insertions(+), 65 deletions(-)
> >  create mode 100644 libswscale/x86/yuv2yuvX.asm
>
> Is this function tested by FATE?
> I did some brief testing and apparently it gets called during
> fate-filter-shuffleplanes-dup-luma, but the results do not change even
> if I comment out the whole function.
>
> Also, it seems like you are adding an AVX2 version of the function, but
> I don't see it being used.
>
> --
> Anton Khirnov
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
>
> To unsubscribe, visit link above, or email
> ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".