On 2016-01-14 20:21, Henrik Gramner wrote:
> On Wed, Jan 13, 2016 at 4:55 PM, James Darnley <james.darn...@gmail.com> wrote:
>> diff --git a/libavcodec/x86/v210enc.asm b/libavcodec/x86/v210enc.asm
>> index 859e2d9..a8f3d3c 100644
>> --- a/libavcodec/x86/v210enc.asm
>> +++ b/libavcodec/x86/v210enc.asm
>> -cextern pb_FE
>> -%define v210_enc_max_8 pb_FE
>> +;cextern pb_FE
>> +local_pb_FE: times 32 db 0xfe
>> +%define v210_enc_max_8 local_pb_FE
>
> You could change ff_pb_FE to be 32-byte instead of duplicating it.

I can change that.

>> +%if cpuflag(avx2)
>> +    movu xm1, [yq+widthq*2]
>> +    vinserti128 m1, m1, [yq+widthq*2+12], 1
>> +%else
>>      movu m1, [yq+2*widthq]
>> +%endif
>
> xmN can be used unconditionally which gets rid of the %else. E.g.
>
>     movu xm1, [yq+widthq*2]
> %if cpuflag(avx2)
>     vinserti128 m1, m1, [yq+widthq*2+12], 1
> %endif

I can change that. I slightly prefer to not mix register sizes like that
but it seems unavoidable with avx2.

>> +%if cpuflag(avx2)
>> +    movq xm3, [uq+widthq]
>> +    movhps xm3, [vq+widthq]
>> +    movq xm7, [uq+widthq+6]
>> +    movhps xm7, [vq+widthq+6]
>> +    vinserti128 m3, m3, xm7, 1
>> +%else
>>      movq m3, [uq+widthq]
>>      movhps m3, [vq+widthq]
>> +%endif
>
> Ditto. Also use xm2 instead of xm7 since it's unused at this point and
> it avoids having to use an extra vector register in the AVX2 version.

Thanks.

>> +%if cpuflag(avx2)
>> +    movu [dstq], xm0
>> +    movu [dstq+16], xm1
>> +    vextracti128 [dstq+32], m0, 1
>> +    vextracti128 [dstq+48], m1, 1
>> +%else
>>      movu [dstq], m0
>>      movu [dstq+mmsize], m1
>> +%endif
>
> Ditto.
>
> Otherwise LGTM.

Noted.
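
For anyone following the lane layout in the first load hunk, here is a minimal scalar C sketch of what the restructured version does. It is only a model, not the assembly itself: the function name load_y_lanes, the reg buffer standing in for m1, and the has_avx2 flag standing in for cpuflag(avx2) are all hypothetical. Only the 16-byte lane size and the +12 offset are taken from the quoted patch.

```c
#include <stdint.h>
#include <string.h>

/* Scalar model of the restructured load; all names here are hypothetical.
 * reg stands in for m1: bytes 0..15 are the low lane (xm1),
 * bytes 16..31 are the high lane, which is only written with AVX2. */
void load_y_lanes(uint8_t reg[32], const uint8_t *src, int has_avx2)
{
    /* movu xm1, [src]: done unconditionally, no %else branch needed */
    memcpy(reg, src, 16);

    if (has_avx2) {
        /* vinserti128 m1, m1, [src+12], 1: fills the high lane from an
         * address 12 bytes further on, so the two lanes overlap by
         * 4 source bytes (src[12..15] appear in both) */
        memcpy(reg + 16, src + 12, 16);
    }
}
```

With has_avx2 set, the caller must make at least 28 bytes readable at src. The low-lane load is shared by both paths, which is the point of using xmN unconditionally.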
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel