I posted an unrolled version in a new thread, alongside a few patches by
Christophe.

On 30/01/15 3:50 PM, James Almer wrote:
> Original x86 intrinsics code and initial 8bit yasm port by Pierre-Edouard 
> Lepere.
> 10/12bit yasm ports, refactoring and optimizations by James Almer
> 
> Benchmarks of BQTerrace_1920x1080_60_qp22.bin with an Intel Core i5-4200U
> 
> width 32
> 40338 decicycles in sao_band_filter_0_8, 2048 runs, 0 skips
> 8585 decicycles in ff_hevc_sao_band_filter_8_sse2, 2048 runs, 0 skips
> 4543 decicycles in ff_hevc_sao_band_filter_8_avx2, 2048 runs, 0 skips
> 
> width 64
> 136046 decicycles in sao_band_filter_0_8, 16384 runs, 0 skips
> 29366 decicycles in ff_hevc_sao_band_filter_8_sse2, 16384 runs, 0 skips
> 15357 decicycles in ff_hevc_sao_band_filter_8_avx2, 16383 runs, 1 skips
> 
> Signed-off-by: James Almer <jamr...@gmail.com>
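
For anyone who wants the ratios rather than raw decicycle counts, here is a
small sketch that derives the speedups from the numbers quoted above. The
dictionary layout and the `speedup` helper are just illustrative names, and it
assumes sao_band_filter_0_8 is the C reference being compared against.

```python
# Decicycle counts copied from the quoted benchmark (1 decicycle = 0.1 cycle).
# Assumption: "c" stands for sao_band_filter_0_8, taken to be the C reference.
BENCH = {
    32: {"c": 40338, "sse2": 8585, "avx2": 4543},
    64: {"c": 136046, "sse2": 29366, "avx2": 15357},
}


def speedup(baseline, optimized):
    """Return how many times faster `optimized` is than `baseline`."""
    return baseline / optimized


for width, t in BENCH.items():
    print(
        f"width {width}: "
        f"sse2 {speedup(t['c'], t['sse2']):.2f}x, "
        f"avx2 {speedup(t['c'], t['avx2']):.2f}x vs C; "
        f"avx2 {speedup(t['sse2'], t['avx2']):.2f}x vs sse2"
    )
```

On these figures the SSE2 version comes out at roughly 4.6x to 4.7x over C
and the AVX2 version at roughly 8.9x, with AVX2 about 1.9x faster than SSE2
at both widths.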
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
