On Sun, Sep 14, 2014 at 07:35:26PM -0300, James Almer wrote:
> On 14/09/14 7:12 PM, Michael Niedermayer wrote:
> > On Sat, Sep 13, 2014 at 10:12:12PM -0300, James Almer wrote:
> >> Also add a missing c->pix_abs[0][0] initialization, and sse2 versions of
> >> sad16_x2, sad16_y2 and sad16_xy2.
> >> Since the _xy2 versions are not bitexact, they are accordingly marked as
> >> approximate.
> >>
> >> Signed-off-by: James Almer <jamr...@gmail.com>
> >> ---
> >
> >> Not benched.
> >
> > if the author of some code doesnt benchmark his code, how can he know
> > which way it is faster ?
> > what effect each difference has ? ...
>
> I didn't bench because i didn't have the time and assumed it wasn't necessary
> considering this is a port from inline to yasm with little to no changes to
> the asm.
> I'll try to do some quick benchmarks later.
[...]
> >
> >
> >> +%if mmsize == 16
> >> +    movhlps   m0, m2
> >> +    paddw     m2, m0
> >> +%endif
> >> +    movd      eax, m2
> >> +    RET
> >> +%endmacro
> >> +
> >> +INIT_MMX mmxext
> >> +SAD 8
> >> +SAD 16
> >> +INIT_XMM sse2
> >> +SAD 16
> >> +
> >> +;------------------------------------------------------------------------------------------
> >> +;int ff_sad_x2_<opt>(MpegEncContext *v, uint8_t *pix1, uint8_t *pix2, int
> >> stride, int h);
> >> +;------------------------------------------------------------------------------------------
> >> +%macro SAD_X2 1
> >> +cglobal sad%1_x2, 5, 5, 5, v, pix1, pix2, stride, h
> >> +%if %1 == mmsize
> >> +    shr       hd, 1
> >> +%define STRIDE strideq
> >> +%else
> >> +%define STRIDE 8
> >> +%endif
> >> +    pxor      m0, m0
> >> +
> >
> >> +align 16
> >
> > do these improve or reduce the speed ?
>
> No idea. I copied them from the inline version (where they were ".p2align 4")
> to keep the resulting asm as similar as possible.

Ahh ok, I hadn't realized that.
If it's just the same as before then it's ok.

[...]

--
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

Complexity theory is the science of finding the exact solution to an
approximation. Benchmarking OTOH is finding an approximation of the exact
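
For readers following the patch without the C reference at hand, here is a minimal scalar sketch of what the sad16 and sad16_x2 functions compute, plus one plausible reason an _xy2 SIMD version would not be bitexact. The function names (sad16_ref, sad16_x2_ref, avg4_chained) are hypothetical and not taken from the patch, and the chained-average approximation is an assumption about how a pavgb-based version might behave, not a description of the actual asm.

```c
#include <stdint.h>
#include <stdlib.h>

/* Rounded average of two bytes; pavgb rounds the same way. */
static inline int avg2(int a, int b) { return (a + b + 1) >> 1; }

/* Exactly rounded average of four bytes (single rounding step). */
static inline int avg4(int a, int b, int c, int d)
{
    return (a + b + c + d + 2) >> 2;
}

/* Hypothetical approximation: two levels of two-input averages.
 * Each level rounds up, so the result can exceed avg4() by one. */
static inline int avg4_chained(int a, int b, int c, int d)
{
    return avg2(avg2(a, b), avg2(c, d));
}

/* Sum of absolute differences over a 16-pixel-wide block of h rows. */
static int sad16_ref(const uint8_t *pix1, const uint8_t *pix2,
                     int stride, int h)
{
    int sum = 0;
    for (int y = 0; y < h; y++) {
        for (int x = 0; x < 16; x++)
            sum += abs(pix1[x] - pix2[x]);
        pix1 += stride;
        pix2 += stride;
    }
    return sum;
}

/* Same, but pix2 is interpolated half a pixel to the right first.
 * Note that this reads pix2[16] on every row. */
static int sad16_x2_ref(const uint8_t *pix1, const uint8_t *pix2,
                        int stride, int h)
{
    int sum = 0;
    for (int y = 0; y < h; y++) {
        for (int x = 0; x < 16; x++)
            sum += abs(pix1[x] - avg2(pix2[x], pix2[x + 1]));
        pix1 += stride;
        pix2 += stride;
    }
    return sum;
}
```

With inputs 0, 1, 0, 0 the single-rounding avg4() and the chained version disagree by one, which is the kind of difference that would justify marking a function as approximate.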
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel