Hi,

2015-01-30 19:50 GMT+01:00 James Almer <jamr...@gmail.com>:
> +%macro HEVC_SAO_BAND_FILTER_COMPUTE 3
> +    psraw   %2, %3, %1-5
> +    pcmpeqw m10, %2, m0
> +    pcmpeqw m11, %2, m1
> +    pcmpeqw m12, %2, m2
> +    pcmpeqw %2, m3
> +    pand    m10, m4
> +    pand    m11, m5
> +    pand    m12, m6
> +    pand    %2, m7
> +    por     m10, m11
> +    por     m12, %2
> +    por     m10, m12
> +    paddw   %3, m10
> +%endmacro

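
For reference, here is one way to read a single 16-bit lane of the macro, as a rough scalar C sketch. It assumes m0-m3 hold four distinct band indices and m4-m7 the matching offsets, which the quoted hunk does not show; sao_band_lane, band and offset are names made up for illustration, and no clipping is modelled.

```c
#include <stdint.h>

/* Scalar model of one 16-bit lane of HEVC_SAO_BAND_FILTER_COMPUTE.
 * Assumption (not visible in the quoted hunk): band[] stands in for
 * m0-m3 and offset[] for m4-m7. Samples are expected to be
 * non-negative and below 1 << bit_depth. */
static int16_t sao_band_lane(int16_t sample, int bit_depth,
                             const int16_t band[4],
                             const int16_t offset[4])
{
    /* psraw: keep the top 5 bits of the sample, giving an index in 0..31 */
    int16_t idx = (int16_t)(sample >> (bit_depth - 5));
    int16_t sum = 0;

    for (int i = 0; i < 4; i++) {
        /* pcmpeqw: all ones when the index matches this band, else zero */
        int16_t mask = (int16_t)-(idx == band[i]);
        /* pand + por: at most one mask is set, so OR picks that offset */
        sum |= (int16_t)(mask & offset[i]);
    }

    /* paddw: add the selected (possibly negative) offset to the sample */
    return (int16_t)(sample + sum);
}
```

With bit_depth 8 the index is sample >> 3, so a sample of 40 falls in band 5 and gets whichever offset is paired with that band; samples outside the four bands pass through unchanged.
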
The shift really does force us to work on bytes, too bad. A pshufb on
the result might still be possible, but it would be cumbersome because
the psraw result is in [0-31] and the offset might be signed.

> +.loop:
> +    movu m13, [srcq+widthq]
[...]
> +    movu [dstq+widthq], m8

Some of those moves could be aligned, but that needs some work at the
buffer level first, so it isn't really part of this patch.

Looks good; any further improvement can go in an additional patch.

-- 
Christophe
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel