On 02/08/14 3:20 PM, Clément Bœsch wrote:
> +    psrlq     m0, m6, 32
> +    paddw     m6, m0
> +    psrlq     m0, m6, 16
> +    paddw     m6, m0
> +    movd      eax, m6
> +    movzx     eax, ax

You could use the HADDW macro here.

> +;-------------------------------------------------------------------------------
> +; int ff_pixelutils_sad_8x8_mmxext(const uint8_t *src1, ptrdiff_t stride1,
> +;                                  const uint8_t *src2, ptrdiff_t stride2);
> +;-------------------------------------------------------------------------------
> +INIT_MMX mmxext
> +cglobal pixelutils_sad_8x8, 4,4,0, src1, stride1, src2, stride2
> +    pxor        m2, m2
> +%rep 4
> +    mova        m0, [src1q]
> +    mova        m1, [src1q + stride1q]
> +    psadbw      m0, [src2q]
> +    psadbw      m1, [src2q + stride2q]
> +    paddw       m2, m0
> +    paddw       m2, m1
> +    lea         src1q, [src1q + 2*stride1q]
> +    lea         src2q, [src2q + 2*stride2q]
> +%endrep
> +    movd        eax, m2
> +    RET

Adding sad16x16 mmxext should be a matter of using add instead of lea,
changing the %rep amount, and using 8 instead of stride[12]q for the mova
and psadbw.

> --- /dev/null
> +++ b/libavutil/x86/pixelutils.h
> @@ -0,0 +1,26 @@
> +/*
> + * This file is part of FFmpeg.
> + *
> + * FFmpeg is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU Lesser General Public
> + * License as published by the Free Software Foundation; either
> + * version 2.1 of the License, or (at your option) any later version.
> + *
> + * FFmpeg is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> + * Lesser General Public License for more details.
> + *
> + * You should have received a copy of the GNU Lesser General Public
> + * License along with FFmpeg; if not, write to the Free Software
> + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
> + */
> +
> +#ifndef AVUTIL_X86_PIXELUTILS_H
> +#define AVUTIL_X86_PIXELUTILS_H
> +
> +#include "libavutil/pixelutils.h"
> +
> +void ff_pixelutils_init_x86(AVPixelUtils *s);

This prototype should be in libavutil/pixelutils.h. No need to make a
whole new header just for it.

Maybe you could add a quick test for these functions? Look at
lavc/motion-test.c and lavu/float-dsp.c
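
For the quick test suggested above, one possible shape is to compare each
optimized function against a plain C reference on pseudo-random input. What
follows is only a rough sketch under a few assumptions: sad_fn, sad_8x8_ref
and check_sad_8x8 are hypothetical names (not existing FFmpeg symbols), the
function pointer type simply mirrors the prototype quoted in the patch, and
the strides and iteration count are arbitrary.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical pointer type mirroring the SAD prototype quoted in the patch. */
typedef int (*sad_fn)(const uint8_t *src1, ptrdiff_t stride1,
                      const uint8_t *src2, ptrdiff_t stride2);

/* Plain C reference: sum of absolute differences over an 8x8 block. */
static int sad_8x8_ref(const uint8_t *src1, ptrdiff_t stride1,
                       const uint8_t *src2, ptrdiff_t stride2)
{
    int sum = 0;
    for (int y = 0; y < 8; y++) {
        for (int x = 0; x < 8; x++)
            sum += abs(src1[x] - src2[x]);
        src1 += stride1;
        src2 += stride2;
    }
    return sum;
}

/* Run a candidate against the reference on pseudo-random data.
 * The two buffers use different strides on purpose, so a function that
 * mixes up stride1 and stride2 gets caught.
 * Returns the number of mismatching runs (0 means all runs agreed). */
static int check_sad_8x8(sad_fn candidate)
{
    enum { STRIDE1 = 32, STRIDE2 = 16, ROWS = 8, RUNS = 100 };
    uint8_t buf1[STRIDE1 * ROWS];
    uint8_t buf2[STRIDE2 * ROWS];
    uint32_t state = 1; /* tiny LCG, only to get repeatable input */
    int mismatches = 0;

    for (int run = 0; run < RUNS; run++) {
        for (int i = 0; i < STRIDE1 * ROWS; i++) {
            state = state * 1664525u + 1013904223u;
            buf1[i] = (uint8_t)(state >> 24);
        }
        for (int i = 0; i < STRIDE2 * ROWS; i++) {
            state = state * 1664525u + 1013904223u;
            buf2[i] = (uint8_t)(state >> 24);
        }
        if (candidate(buf1, STRIDE1, buf2, STRIDE2) !=
            sad_8x8_ref(buf1, STRIDE1, buf2, STRIDE2))
            mismatches++;
    }
    return mismatches;
}
```

In a real test one would pass the function pointer obtained from the
pixelutils init code to check_sad_8x8() and fail if it returns non-zero.
It is probably also worth adding the extreme case (all 255 against all 0),
since that is where a too-narrow horizontal add would show up first.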