On 08/02/15 3:41 PM, Christophe Gisquet wrote:
> Hi,
>
> 2015-02-08 18:48 GMT+01:00 James Almer <jamr...@gmail.com>:
>>>> + %assign MMSIZE mmsize
>>>
>>> Why do that? Not a big deal: it's only for my education, if there's
>>> something I'm missing.
>>
>> For width 48, the COMPUTE macro is last run after an INIT_XMM cpuname, so
>> mmsize becomes 16 and in the avx2 version the instructions would access
>> the wrong data in stack.
>> Doing %assign MMSIZE mmsize at the beginning of the function and using it
>> here makes sure it's always 32 in avx2.
>> sse2 is unaffected by this, of course.
>>
>> And the reason I'm using INIT_XMM in the middle of the function for the
>> avx2 width 48 case is because i couldn't find a nice and clean way to use
>> the xm* reg aliases with the COMPUTE macros.
>
> OK. Strange that it still compiled fine on Win32 here, but I haven't
> looked at the code generated, and I can't run the avx2 version.
>
>>>> +cglobal hevc_sao_band_filter_%1_8, 6, 6, 15, 8*mmsize*ARCH_X86_32, dst, src, dststride, srcstride, offset, left
>>>> HEVC_SAO_BAND_FILTER_INIT 8
>>>
>>> Why do you need room for 8 regs, and not 7?
>>
>> Remnant from before i realized i could keep m7 untouched. I'll change it.
>
> If that's the only change, then, unless someone complains, just push
> that version.

Pushed, thanks.

_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
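
For readers following the mmsize point quoted above, here is a minimal sketch of why a value captured with %assign survives a later INIT_XMM. It is not taken from the patch. It is a stand-alone NASM preprocessor fragment that does not include x86inc.asm, and the two %define lines are only assumed stand-ins for what INIT_YMM and INIT_XMM do to mmsize.

```nasm
; Stand-alone NASM preprocessor sketch; x86inc.asm is NOT included.
; The two %define lines below are assumed stand-ins for the effect
; of INIT_YMM and INIT_XMM on mmsize.

%define mmsize 32           ; stand-in for INIT_YMM avx2 at function start
%assign MMSIZE mmsize       ; evaluated right here: MMSIZE is fixed at 32

%define mmsize 16           ; stand-in for an INIT_XMM in mid-function

; Build-time checks: assembly fails if either assumption is wrong.
%if mmsize != 16
    %error "mmsize should follow the most recent redefinition"
%endif
%if MMSIZE != 32
    %error "MMSIZE should keep the value captured by %assign"
%endif

section .text
    mov eax, 2*mmsize       ; offset computed with the current width (16)
    mov ebx, 2*MMSIZE       ; offset computed with the captured width (32)
```

NASM's %define names are case-sensitive, so mmsize and MMSIZE are separate symbols. Stack offsets written in terms of MMSIZE therefore keep using the 32-byte stride after mmsize has dropped to 16.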