On 05/02/15 4:20 PM, Christophe Gisquet wrote:
> From: plepere <pierre-edouard.lep...@insa-rennes.fr>

This should probably be changed to Pierre Edouard Lepere.

> +%if cpuflag(avx2) && (%0 == 3)
> +
> +    vextracti128 xm10, m0, 1
> +    vinserti128 m10, m1, xm10, 0
> +    vinserti128 m0, m0, xm1, 1
> +    mova m1, m10
> +
> +    vextracti128 xm10, m2, 1
> +    vinserti128 m10, m3, xm10, 0
> +    vinserti128 m2, m2, xm3, 1
> +    mova m3, m10
> +
> +
> +    vextracti128 xm10, m4, 1
> +    vinserti128 m10, m5, xm10, 0
> +    vinserti128 m4, m4, xm5, 1
> +    mova m5, m10
> +
> +    vextracti128 xm10, m6, 1
> +    vinserti128 m10, m7, xm10, 0
> +    vinserti128 m6, m6, xm7, 1
> +    mova m7, m10
> +%endif

I didn't check, but I think these can be simplified using vperm2i128.
It can be done in a separate patch anyway.

> @@ -619,6 +761,89 @@ void ff_hevc_dsp_init_x86(HEVCDSPContext *c, const int bit_depth)
>              c->idct_dc[3] = ff_hevc_idct32x32_dc_8_avx2;
>          if (ARCH_X86_64) {
>              SAO_BAND_INIT(8, avx2);
> +            c->put_hevc_epel[7][0][0] = ff_hevc_put_hevc_pel_pixels32_8_avx2;
> +            c->put_hevc_epel[8][0][0] = ff_hevc_put_hevc_pel_pixels48_8_avx2;
> +            c->put_hevc_epel[9][0][0] = ff_hevc_put_hevc_pel_pixels64_8_avx2;

[...]

It would be nice if all this was compressed into a couple of macros, like
with SSE4. But that's cosmetic and not a blocker.

>          }
>
>          c->transform_add[2] = ff_hevc_transform_add16_10_avx2;
>

Should be ok if it passes fate and compiles with yasm <= 1.1.0 (there are
C wrappers, and those usually need stricter checks for HAVE_AVX2_EXTERNAL
because dead code elimination doesn't seem to trigger until after
pre-processing is done).
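
For reference, a rough sketch of the vperm2i128 idea mentioned above. This is a plain C model of the 128-bit lane moves, not tested assembly, and I have not benchmarked anything. The names ymm_t, insert128, perm2i128, swap_patch, swap_perm and lane_swap_matches are made up for this illustration. The immediates 0x20 and 0x31 follow the documented vperm2i128 encoding (bits 1:0 select the low lane, bits 5:4 the high lane, bits 3 and 7 zero a lane), but whether the shorter sequence is actually a win on real hardware is an assumption that would need measuring.

```c
#include <stdint.h>
#include <string.h>

/* A 256-bit ymm register modelled as two 128-bit lanes (0 = low, 1 = high). */
typedef struct { uint8_t lane[2][16]; } ymm_t;

/* Model of vinserti128 dst, src, x, idx: copy src, then overwrite lane idx. */
static ymm_t insert128(ymm_t src, const uint8_t x[16], int idx)
{
    memcpy(src.lane[idx], x, 16);
    return src;
}

/* Model of vperm2i128 dst, a, b, imm.
 * Per output lane, a 4-bit control: 0 = a.low, 1 = a.high,
 * 2 = b.low, 3 = b.high; bit 3 zeroes the lane instead. */
static ymm_t perm2i128(ymm_t a, ymm_t b, int imm)
{
    ymm_t r;
    for (int half = 0; half < 2; half++) {
        int ctl = (imm >> (4 * half)) & 0xF;
        const ymm_t *src = (ctl & 2) ? &b : &a;
        if (ctl & 8)
            memset(r.lane[half], 0, 16);
        else
            memcpy(r.lane[half], src->lane[ctl & 1], 16);
    }
    return r;
}

/* The four-instruction sequence from the patch, for one register pair. */
static void swap_patch(ymm_t *m0, ymm_t *m1)
{
    uint8_t xm10[16];
    ymm_t m10;
    memcpy(xm10, m0->lane[1], 16);          /* vextracti128 xm10, m0, 1     */
    m10 = insert128(*m1, xm10, 0);          /* vinserti128 m10, m1, xm10, 0 */
    *m0 = insert128(*m0, m1->lane[0], 1);   /* vinserti128 m0, m0, xm1, 1   */
    *m1 = m10;                              /* mova m1, m10                 */
}

/* Candidate replacement: two permutes plus one move. */
static void swap_perm(ymm_t *m0, ymm_t *m1)
{
    ymm_t m10 = perm2i128(*m0, *m1, 0x31);  /* vperm2i128 m10, m0, m1, 0x31 */
    *m0 = perm2i128(*m0, *m1, 0x20);        /* vperm2i128 m0, m0, m1, 0x20  */
    *m1 = m10;                              /* mova m1, m10                 */
}

/* Returns 1 if both sequences leave the same bytes in m0 and m1. */
static int lane_swap_matches(void)
{
    ymm_t a0, a1, b0, b1;
    for (int i = 0; i < 16; i++) {
        a0.lane[0][i] = (uint8_t)i;         /* m0 low  */
        a0.lane[1][i] = (uint8_t)(16 + i);  /* m0 high */
        a1.lane[0][i] = (uint8_t)(32 + i);  /* m1 low  */
        a1.lane[1][i] = (uint8_t)(48 + i);  /* m1 high */
    }
    b0 = a0;
    b1 = a1;
    swap_patch(&a0, &a1);
    swap_perm(&b0, &b1);
    return !memcmp(&a0, &b0, sizeof(a0)) && !memcmp(&a1, &b1, sizeof(a1));
}
```

In both versions m0 ends up holding the two low lanes and m1 the two high lanes, so each group of four instructions in the quoted hunk could probably become three. The same pattern would apply to the m2/m3, m4/m5 and m6/m7 pairs.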