On 2017-04-06 18:06, James Almer wrote:
> Your numbers are really confusing. Could you post the actual numbers for
> each function instead of doing comparisons?

These figures are the actual numbers!

Using the figures from Haswell above:
> ff_h264_idct_add_8_mmx  = 52 cycles
> ff_h264_idct_add_8_sse2 = 49 cycles
> ff_h264_idct_add_8_avx  = 46 cycles
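
For anyone who would rather read these as ratios, the cycle counts above can be turned into relative speedups by dividing the baseline by the candidate. This is only a rough sketch in Python. The `speedup` helper and the `cycles` table are made up for illustration and are not part of checkasm or any FFmpeg tooling.

```python
# Cycle counts quoted above for Haswell (lower is better).
cycles = {
    "ff_h264_idct_add_8_mmx": 52,
    "ff_h264_idct_add_8_sse2": 49,
    "ff_h264_idct_add_8_avx": 46,
}

def speedup(baseline: str, candidate: str) -> float:
    """Return how many times faster `candidate` is than `baseline`.

    Hypothetical helper: a plain ratio of cycle counts, nothing more.
    """
    return cycles[baseline] / cycles[candidate]

if __name__ == "__main__":
    base = "ff_h264_idct_add_8_mmx"
    for name in cycles:
        # A value above 1.0 means `name` needs fewer cycles than the baseline.
        print(f"{name}: ~{speedup(base, name):.2f}x vs {base}")
```

With these three figures, sse2 comes out at roughly 1.06x and avx at roughly 1.13x relative to mmx.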

Coming back to this draft I had saved, I removed a fair bit of ranting
and cut it down to the essential point.

Also, I forgot about the Pentium I tested the previous patches on, the
ones where I added SSE2.  From that commit message:
> Kaby Lake Pentium:
>  - ff_h264_idct_add_8_sse2:    ~1.18x faster than mmxext
>  - ff_h264_idct_dc_add_8_sse2: ~1.07x faster than mmxext
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
