Re: [FFmpeg-devel] [PATCH v3] swscale/x86/rgb2rgb: add AVX512ICL versions of shuffle_bytes

2025-01-29 Thread Shreesh Adiga
_avx2:16.2 ( 5.49x)
shuffle_bytes_3210_avx512icl:9.2 ( 9.65x)

I can add the details to commit message if you can confirm if it is needed.

Thanks,
Shreesh

On Wed, Jan 29, 2025 at 5:46 PM Andreas Rheinhardt <andreas.rheinha...@outlook.com> wrote:
> Shreesh Adiga:

[FFmpeg-devel] [PATCH v3] swscale/x86/rgb2rgb: add AVX512ICL versions of shuffle_bytes

2025-01-28 Thread Shreesh Adiga
Signed-off-by: Shreesh Adiga <16567adigashre...@gmail.com>
---
v3: Fix build failure on older nasm by replacing "kmovw k, tmpw" with "kmov k, tmpd" which matches "kmovw k, r32" syntax.
v2: Tried to align operands and improve indentation for ASM routine

[FFmpeg-devel] [PATCH v2] swscale/x86/rgb2rgb: add AVX512ICL versions of shuffle_bytes

2025-01-25 Thread Shreesh Adiga
Signed-off-by: Shreesh Adiga <16567adigashre...@gmail.com>
---
v2: Tried to align operands and improve indentation for ASM routine.

 libswscale/x86/rgb2rgb.c     | 21 +
 libswscale/x86/rgb_2_rgb.asm | 90 +++-
 2 files changed, 80 insertions(+), 31 del

Re: [FFmpeg-devel] [PATCH] swscale/x86/rgb2rgb: add AVX512ICL versions of shuffle_bytes

2025-01-25 Thread Shreesh Adiga
> Try running it several times using the same seed, so
> "tests/checkasm/checkasm --test=sw_rgb --bench 17575157", and make sure
> no power saving feature is enabled (so the CPU frequency doesn't change
> based on load). That may help getting consistent results.

After running "echo performance | t

Re: [FFmpeg-devel] [PATCH] swscale/x86/rgb2rgb: add AVX512ICL versions of shuffle_bytes

2025-01-25 Thread Shreesh Adiga
> Thanks for the patch. Could you please compile and run
> tests/checkasm/checkasm with "--test=sw_rgb --bench" and paste the
> results for the shuffle_bytes functions, to see if there's a speed up
> compared to the AVX2 implementation?

I ran the command "tests/checkasm/checkasm --test=sw_rgb --be

[FFmpeg-devel] [PATCH] swscale/x86/rgb2rgb: add AVX512ICL versions of shuffle_bytes

2025-01-25 Thread Shreesh Adiga
Signed-off-by: Shreesh Adiga <16567adigashre...@gmail.com>
---
 libswscale/x86/rgb2rgb.c     | 21 +
 libswscale/x86/rgb_2_rgb.asm | 28 
 2 files changed, 49 insertions(+)

diff --git a/libswscale/x86/rgb2rgb.c b/libswscale/x86/rgb2rgb.c
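
For readers unfamiliar with the routines named in this thread, here is a minimal scalar sketch of the operation that a function such as shuffle_bytes_3210 performs. It assumes that the digits in the name give, for each output byte of a 4-byte group, the index of the source byte to copy (so "3210" reverses each group). The function name shuffle_bytes_3210_ref and its signature are hypothetical and are not taken from the patch. The SIMD versions discussed above would process many groups per instruction, whereas this sketch handles one group per loop iteration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar reference, not the FFmpeg symbol.
 * Assumption: "3210" means dst byte 0 comes from src byte 3,
 * dst byte 1 from src byte 2, and so on, within each 4-byte group. */
static void shuffle_bytes_3210_ref(const uint8_t *src, uint8_t *dst,
                                   size_t size)
{
    assert(src != NULL && dst != NULL);

    /* Process only complete 4-byte groups; a trailing partial
     * group (size not a multiple of 4) is left untouched. */
    for (size_t i = 0; i + 4 <= size; i += 4) {
        dst[i + 0] = src[i + 3];
        dst[i + 1] = src[i + 2];
        dst[i + 2] = src[i + 1];
        dst[i + 3] = src[i + 0];
    }
}
```

Under this assumption, calling it on the 8 bytes 1 2 3 4 5 6 7 8 would yield 4 3 2 1 8 7 6 5, which is the kind of per-pixel channel reordering used when converting between 32-bit packed pixel layouts.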