[FFmpeg-devel] [PATCH 0/6] More H.264 assembly (the sequel)

James Darnley Thu, 01 Dec 2016 09:00:22 -0800

Some more assembly for review.  This time we have 10-bit h chroma functions.


The intra ones have some strange benchmark results.  Overall the improvement
isn't that large, particularly for the 4:2:0 intra.  The avx version of that
function is also slower than the sse2 version, by quite a margin.  I will
definitely try benchmarking it on my Nehalem after sending these emails.

Suggestions greatly appreciated.

James Darnley (6):
  avcodec/h264: mmx2, sse2, avx 10-bit h chroma deblock/loop filter
  avcodec/h264: clean up and expand x86 function definitions
  whitespace changes after last commit
  avcodec/h264: mmx2, sse2, avx 10-bit 4:2:2 h chroma deblock/loop
    filter
  avcodec/h264: mmx2, sse2, avx 10-bit h chroma intra deblock/loop
    filter
  avcodec/h264: mmx2, sse2, avx 10-bit 4:2:2 h chroma intra deblock/loop
    filter

 libavcodec/x86/h264_deblock_10bit.asm | 213 ++++++++++++++++++++++++++++++++++
 libavcodec/x86/h264dsp_init.c         |  74 ++++++++----
 2 files changed, 262 insertions(+), 25 deletions(-)

-- 
2.10.2

_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
