On 08/02/15 8:21 AM, Christophe Gisquet wrote: > Hi, > > 2015-02-07 23:06 GMT+01:00 James Almer <jamr...@gmail.com>: >> Signed-off-by: James Almer <jamr...@gmail.com> >> --- >> libavcodec/x86/hevc_sao.asm | 40 ++++++++++++++++++++++++++++++++++++---- >> libavcodec/x86/hevcdsp_init.c | 24 ++++++++++++------------ >> 2 files changed, 48 insertions(+), 16 deletions(-) > > Passes fate here for Win32/64, but that's probably not the platforms > you interested in hearing feedback from. > Looks OK to commit. > > A few questions though: > >> + %assign MMSIZE mmsize > > Why do that? Not a big deal: it's only for my education, if there's > something I'm missing.
For width 48, the COMPUTE macro is last run after an INIT_XMM cpuname, so mmsize becomes 16 and in the avx2 version the instructions would access the wrong data in stack. Doing %assign MMSIZE mmsize at the beginning of the function and using it here makes sure it's always 32 in avx2. sse2 is unaffected by this, of course. And the reason I'm using INIT_XMM in the middle of the function for the avx2 width 48 case is because i couldn't find a nice and clean way to use the xm* reg aliases with the COMPUTE macros. > >> + pcmpeqw m4, %2, [rsp+MMSIZE*0] >> + pcmpeqw m5, %2, [rsp+MMSIZE*1] >> + pcmpeqw m6, %2, [rsp+MMSIZE*2] >> + pcmpeqw %2, [rsp+MMSIZE*3] >> + pand m4, [rsp+MMSIZE*4] >> + pand m5, [rsp+MMSIZE*5] >> + pand m6, [rsp+MMSIZE*6] > [...] >> -cglobal hevc_sao_band_filter_%1_8, 6, 6, 15, dst, src, dststride, >> srcstride, offset, left >> +cglobal hevc_sao_band_filter_%1_8, 6, 6, 15, 8*mmsize*ARCH_X86_32, dst, >> src, dststride, srcstride, offset, left >> HEVC_SAO_BAND_FILTER_INIT 8 > > Why do you need room for 8 regs, and not 7? Remnant from before i realized i could keep m7 untouched. I'll change it. > > Setting this to 7 and doing the required changes to remove MMSIZE pass > here on Win32. Again, MMSIZE is for avx2 and you're probably running the sse2 version, where it's always the same as mmsize. > >> - SAO_BAND_INIT(10, sse2); >> SAO_EDGE_INIT(10, sse2); >> } >> + SAO_BAND_INIT(10, sse2); > > On a side note, you haven't ported >8 bits edge filter to x86_32. I > guess because it wouldn't play real time anyway? I was going to give edge >8bit a try next. _______________________________________________ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel