Min Chen wrote:
> At 2021-09-30 15:23:08, "Wu, Jianhua" <jianhua...@intel.com> wrote:
> >Min Chen wrote:
> >> Sent: Thursday, September 30, 2021 10:29 AM
> >> To: FFmpeg development discussions and patches <ffmpeg-de...@ffmpeg.org>
> >> Subject: Re: [FFmpeg-devel] [PATCH v2 3/4] libswscale/x86/rgb2rgb: add
> >> uyvytoyuv422 avx2
> >>
> >> Hello,
> >>
> >> >+pb_shuffle_low: times 4 db 1, 3, 5, 7, 9, 11, 13, 15, -1, -1, -1, -1, -1, -1, -1, -1
> >> Why we times 4?
> >> AVX2 provided instruction VPBROADCASTQ to load these constant into
> >> SIMD register.
> >>
> >> Moreover, the plane U/V also apply same algorithm to get improve.
> >>
> >> Regards,
> >> Min Chen
> >>
> >Hi Min Chen,
> >
> >Much appreciated your helpful suggestions.
> >
> >Correct! It's not necessary to use time 4 here. It's funny that I did
> >try to avoid using it here when writing the codes and get no way because I
> >ignored the VBROADCASTI128 instruction.
> >
> >About the UV extracting, I have estimated the new method before making
> >a decision to keep using the masterpiece of the previous author. The
> >former is better, and pand instruction has a better reciprocal throughput, or
> >issue latency.
> >
> >Best regards,
> >Jianhua
> >
>
> For VBROADCASTI128, we don't care high part of result, so we just need
> lowest 64-bits constant table. VPBROADCASTQ enough.
>
Definitely makes sense. Thanks.
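
For anyone following the thread later, here is a rough scalar model of the point above. It is only a sketch in Python, not code from the patch: the helper names (pshufb_lane, vpshufb_ymm) are made up for illustration, and it assumes, as Min Chen says, that the code consuming the shuffle result only reads the low 64 bits of each 128-bit lane. It compares the full 16-byte mask replicated into both lanes (what VBROADCASTI128 would load) against the low 8 bytes repeated in every qword (what VPBROADCASTQ would load).

```python
# Scalar model of the byte shuffle discussed in the thread.
# Hypothetical helper names; this only mimics VPSHUFB semantics.

def pshufb_lane(src, mask):
    """Shuffle one 128-bit lane (16 bytes) the way PSHUFB does."""
    out = []
    for m in mask:
        if m & 0x80:
            out.append(0)              # high bit set: result byte is zero
        else:
            out.append(src[m & 0x0F])  # low 4 bits select a byte in the lane
    return out

def vpshufb_ymm(src, mask):
    """AVX2 VPSHUFB: each 128-bit lane is shuffled independently."""
    return pshufb_lane(src[:16], mask[:16]) + pshufb_lane(src[16:], mask[16:])

LOW8 = [1, 3, 5, 7, 9, 11, 13, 15]     # odd bytes = luma in UYVY

# 16-byte pattern from the patch (-1 is 0xFF as a byte), same in both lanes,
# as a VBROADCASTI128 load of a 16-byte table would give.
mask_full = (LOW8 + [0xFF] * 8) * 2

# Only the low 8 bytes stored, repeated in all four qwords,
# as a VPBROADCASTQ load of an 8-byte table would give.
mask_q = LOW8 * 4

src = list(range(32))                  # stand-in for 32 bytes of UYVY input

full = vpshufb_ymm(src, mask_full)
q = vpshufb_ymm(src, mask_q)

# The low 8 bytes of each lane are identical for both masks.
print(full[0:8] == q[0:8], full[16:24] == q[16:24])
# The high 8 bytes differ: zeros with the full mask, a repeat with the qword mask.
print(full[8:16], q[8:16])
```

The two masks only differ in the high half of each lane, so the 8-byte table is sufficient whenever that half is discarded afterwards. If some later step relied on those bytes being zero, the 16-byte table would still be needed.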