Hello,

New version attached for the SIMD optimization of reorder_pixels (used by the RLE and ZIP decompression).
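For readers who do not have the decoder in front of them, here is a minimal scalar sketch of the kind of reordering being optimized. It assumes the step interleaves the first and second halves of the source buffer byte by byte. The name reorder_pixels_scalar is hypothetical and the code is an illustration, not an excerpt from the attached patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar reference, not code from the attached patch.
 * Assumes the reorder step interleaves the two halves of src:
 *   dst = src[0], src[half], src[1], src[half + 1], ...
 * If size is odd, the last source byte is left out. */
static void reorder_pixels_scalar(uint8_t *dst, const uint8_t *src,
                                  ptrdiff_t size)
{
    const ptrdiff_t half_size = size / 2;
    const uint8_t *t1 = src;             /* first half of the input  */
    const uint8_t *t2 = src + half_size; /* second half of the input */

    for (ptrdiff_t i = 0; i < half_size; i++) {
        dst[2 * i]     = t1[i]; /* even output bytes come from the first half */
        dst[2 * i + 1] = t2[i]; /* odd output bytes come from the second half */
    }
}
```

A SIMD version of such a loop would presumably load one register from each half and store two registers of interleaved bytes per iteration, which is why the tail of the buffers needs padding on both the read and the write side.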
It passes the FATE tests for me (on Mac OS X).

Tested by decoding a sequence of 150 HD EXR images (CGI renders with 17 layers per file, float pixels, ZIP16 compression).

AVX2 seems to provide only a small speed improvement over SSE (if someone has an idea about how to improve it, please tell me).

The results:

Scalar:
2734448 decicycles in reorder_pixels_zip, 130476 runs, 596 skips
bench: utime=121.045s
bench: maxrss=608714752kB

SSE:
282900 decicycles in reorder_pixels_zip, 130935 runs, 137 skips
bench: utime=107.310s
bench: maxrss=615378944kB

AVX2:
247404 decicycles in reorder_pixels_zip, 130894 runs, 178 skips
bench: utime=107.182s
bench: maxrss=615391232kB

The overread is 1x mmsize (16 bytes in SSE, 32 in AVX2).
The overwrite is 2x mmsize (32 bytes in SSE, 64 in AVX2).

Comments welcome.

Martin
0001-libavcodec-exr-add-SIMD-reorder_pixels-for-SSE2-and-.patch
_______________________________________________ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel