Hello,

After taking a look at blockdsp with

./tests/checkasm/checkasm --test=blockdsp --bench

I found that on my computer the MMX and SSE versions of clear_blocks are slower than the C version. Only an AVX version is faster.

The attached patch adds an AVX version of clear_block and clear_blocks.

Results (Kaby Lake, macOS 10.12):

checkasm: all 6 tests passed
blockdsp.clear_block_c: 15.9
blockdsp.clear_block_mmx: 16.4
blockdsp.clear_block_sse: 7.4
blockdsp.clear_block_avx: 3.9
blockdsp.clear_blocks_c: 29.6
blockdsp.clear_blocks_mmx: 99.1
blockdsp.clear_blocks_sse: 48.4
blockdsp.clear_blocks_avx: 24.4

I also modified several decoders/encoders to change the DECLARE_ALIGNED alignment from 16 to 32.

I ran make fate SAMPLES=fate-suite/ and got several errors, but after checking, these errors do not seem to be related to this patch.

Martin
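For readers unfamiliar with these functions, here is a minimal C sketch of what they do and why the alignment matters. It is not taken from the patch. It assumes that a block is 64 int16_t coefficients, that clear_blocks covers 6 such blocks, and that the AVX version uses aligned 32-byte stores (which fault on addresses that are not 32-byte aligned). The names clear_block_ref, clear_blocks_ref and is_aligned are hypothetical, and C11 alignas stands in for DECLARE_ALIGNED.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed sizes: one 8x8 block of int16_t coefficients, 6 blocks per call. */
#define BLOCK_COEFFS 64
#define NUM_BLOCKS   6

/* Hypothetical reference for clear_block: zero one block (128 bytes). */
static void clear_block_ref(int16_t *block)
{
    memset(block, 0, sizeof(int16_t) * BLOCK_COEFFS);
}

/* Hypothetical reference for clear_blocks: zero 6 consecutive blocks (768 bytes). */
static void clear_blocks_ref(int16_t *blocks)
{
    memset(blocks, 0, sizeof(int16_t) * NUM_BLOCKS * BLOCK_COEFFS);
}

/* Returns 1 if p is aligned to n bytes, 0 otherwise. */
static int is_aligned(const void *p, size_t n)
{
    return ((uintptr_t)p % n) == 0;
}
```

A caller would declare its buffer as `alignas(32) int16_t blocks[NUM_BLOCKS * BLOCK_COEFFS];` and can check `is_aligned(blocks, 32)` before handing it to a routine that relies on 32-byte alignment. A buffer declared with only 16-byte alignment does not guarantee this, which is presumably why the declarations had to change.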
0001-libavcodec-blockdsp-add-AVX-version.patch