Min Chen wrote:
> Sent: Thursday, September 30, 2021 10:29 AM
> To: FFmpeg development discussions and patches <ffmpeg-
> de...@ffmpeg.org>
> Subject: Re: [FFmpeg-devel] [PATCH v2 3/4] libswscale/x86/rgb2rgb: add
> uyvytoyuv422 avx2
> 
> Hello,
> 
> >+pb_shuffle_low: times 4 db 1, 3, 5, 7, 9, 11, 13, 15, -1, -1, -1, -1, -1, -1, -1, -1
> Why we times 4?
> AVX2 provided instruction VPBROADCASTQ to load these constant into SIMD
> register.
> 
> Moreover, the plane U/V also apply same algorithm to get improve.
> 
> Regards,
> Min Chen
> 
Hi Min Chen,

Thank you very much for your helpful suggestions.

Correct! It is not necessary to use "times 4" here. Funnily enough, I did
try to avoid it when writing the code, but found no way to do so because
I had overlooked the VBROADCASTI128 instruction.
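
To illustrate the point, here is a minimal scalar sketch in C of why a
single 16-byte constant is enough. It is not code from the patch: the
function names vpshufb_model and vbroadcasti128_model and the array name
shuffle_low are hypothetical, and the functions only model the documented
per-lane behaviour of the two instructions.

```c
#include <stdint.h>
#include <string.h>

/* Scalar model of AVX2 vpshufb: each 128-bit lane is shuffled on its own.
 * A mask byte with the top bit set writes zero; otherwise its low 4 bits
 * select a byte from the same lane of the source. */
static void vpshufb_model(uint8_t dst[32], const uint8_t src[32],
                          const uint8_t mask[32])
{
    for (int lane = 0; lane < 2; lane++) {
        for (int i = 0; i < 16; i++) {
            uint8_t m = mask[lane * 16 + i];
            dst[lane * 16 + i] = (m & 0x80) ? 0 : src[lane * 16 + (m & 0x0F)];
        }
    }
}

/* Scalar model of vbroadcasti128: copy one 16-byte value into both lanes. */
static void vbroadcasti128_model(uint8_t dst[32], const uint8_t src[16])
{
    memcpy(dst, src, 16);
    memcpy(dst + 16, src, 16);
}

/* The shuffle constant stored once, without "times" (hypothetical name).
 * 0xFF is the byte value of -1. */
static const uint8_t shuffle_low[16] = {
    1, 3, 5, 7, 9, 11, 13, 15,
    0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF
};
```

Broadcasting shuffle_low into a 32-byte mask and passing it to
vpshufb_model moves the odd bytes of each lane (the Y samples in UYVY
order) into the low 8 bytes of that lane and zeroes the high 8 bytes,
which is what a constant repeated in memory would also give.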

Regarding the U/V extraction, I evaluated the new method before deciding
to keep the previous author's excellent code. The existing code is
better: the pand instruction has a better reciprocal throughput, or
issue latency.
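
For readers comparing the two approaches, the following scalar C sketch
shows that masking each 16-bit word and packing selects the same chroma
bytes as a byte shuffle. It is an assumption about the general
mask-and-pack technique (pand followed by a pack such as packuswb), not a
copy of the existing routine, and both function names are hypothetical.
It says nothing about speed, which depends on the instruction throughput
of the target CPU.

```c
#include <stddef.h>
#include <stdint.h>

/* Mask-and-pack model: read the UYVY stream as little-endian 16-bit words
 * (low byte = U or V, high byte = Y), clear the high byte as pand with
 * 0x00FF would, then narrow each word to a byte as an unsigned-saturating
 * pack would. No saturation occurs because every word is <= 255. */
static void chroma_mask_pack(uint8_t *uv, const uint8_t *uyvy, size_t n_words)
{
    for (size_t i = 0; i < n_words; i++) {
        uint16_t w = (uint16_t)(uyvy[2 * i] | (uyvy[2 * i + 1] << 8));
        w &= 0x00FF;
        uv[i] = (uint8_t)w;
    }
}

/* Shuffle model: pick the even bytes (U and V in UYVY order) directly. */
static void chroma_shuffle(uint8_t *uv, const uint8_t *uyvy, size_t n_words)
{
    for (size_t i = 0; i < n_words; i++)
        uv[i] = uyvy[2 * i];
}
```

Both functions write the interleaved sequence U0 V0 U1 V1 ... for the same
input, so the choice between them is purely a performance question.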

Best regards,
Jianhua
