Re: [FFmpeg-devel] [PATCH] x86/hevc_mc: optimize AVX2 mc functions

2015-02-12 Thread James Almer
On 12/02/15 7:47 AM, Christophe Gisquet wrote:
> Hi,
>
> 2015-02-12 7:29 GMT+01:00 James Almer:
>> Before
>> 40766 decicycles in ff_hevc_put_hevc_qpel_h64_8_avx2, 8192 runs, 0 skips
>>
>> After
>> 37975 decicycles in ff_hevc_put_hevc_qpel_h64_8_avx2, 8192 runs, 0 skips
>
> Looks straightforward.

Re: [FFmpeg-devel] [PATCH] x86/hevc_mc: optimize AVX2 mc functions

2015-02-12 Thread Christophe Gisquet
2015-02-12 11:47 GMT+01:00 Christophe Gisquet:
> Looks straightforward. But now I understand why we declare using 11
> xmm regs in some places, which impacts a patch that has been reviewed
> and needs updating.

A patch of mine for x86_32. Just ignore me, I'm speaking to myself

--
Christophe

Re: [FFmpeg-devel] [PATCH] x86/hevc_mc: optimize AVX2 mc functions

2015-02-12 Thread Christophe Gisquet
Hi,

2015-02-12 7:29 GMT+01:00 James Almer:
> Before
> 40766 decicycles in ff_hevc_put_hevc_qpel_h64_8_avx2, 8192 runs, 0 skips
>
> After
> 37975 decicycles in ff_hevc_put_hevc_qpel_h64_8_avx2, 8192 runs, 0 skips

Looks straightforward. But now I understand why we declare using 11 xmm regs in some

[FFmpeg-devel] [PATCH] x86/hevc_mc: optimize AVX2 mc functions

2015-02-11 Thread James Almer
Before
40766 decicycles in ff_hevc_put_hevc_qpel_h64_8_avx2, 8192 runs, 0 skips

After
37975 decicycles in ff_hevc_put_hevc_qpel_h64_8_avx2, 8192 runs, 0 skips

Signed-off-by: James Almer
---
 libavcodec/x86/hevc_mc.asm | 32
 1 file changed, 12 insertions(+), 20 de
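
For readers comparing the two timer readings quoted in this thread, here is a minimal sketch of the arithmetic, assuming the figures are average costs per call in decicycles (tenths of a CPU cycle). The function names are hypothetical and are not part of FFmpeg. The reduction works out to roughly 6.8%.

```python
# Sketch only: helper names are illustrative, not FFmpeg API.
# Inputs are the two decicycle readings quoted in the thread.

def cycles_saved(before_decicycles, after_decicycles):
    """Average cycles saved per call (1 decicycle = 0.1 cycle)."""
    return (before_decicycles - after_decicycles) / 10.0


def percent_saved(before_decicycles, after_decicycles):
    """Reduction relative to the 'before' measurement, in percent."""
    return 100.0 * (before_decicycles - after_decicycles) / before_decicycles


if __name__ == "__main__":
    before, after = 40766, 37975  # ff_hevc_put_hevc_qpel_h64_8_avx2
    print(f"saved {cycles_saved(before, after):.1f} cycles per call")
    print(f"reduction {percent_saved(before, after):.2f}%")
```

Both measurements were averaged over the same number of runs (8192) with 0 skips, so the two readings can be compared directly.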