On 11/14/2022 2:58 AM, Wang, Bin wrote:
-----Original Message-----
From: ffmpeg-devel <ffmpeg-devel-boun...@ffmpeg.org> On Behalf Of James Almer
Sent: Monday, November 14, 2022 10:43 AM
To: ffmpeg-devel@ffmpeg.org
Subject: Re: [FFmpeg-devel] [PATCH v7] libavfilter/x86/vf_convolution: add 
sobel filter optimization and unit test with intel AVX512 VNNI

On 11/4/2022 5:29 AM, bin.wang-at-intel....@ffmpeg.org wrote:
+%macro FILTER_SOBEL 0
+%if UNIX64
+cglobal filter_sobel, 4, 15, 7, dst, width, matrix, ptr, c0, c1, c2, c3, c4, c5, c6, c7, c8, r, x
+%else
+cglobal filter_sobel, 4, 15, 7, dst, width, rdiv, bias, matrix, ptr, c0, c1, c2, c3, c4, c5, c6, c7, c8, r, x
+%endif
+%if WIN64
+    SWAP xmm0, xmm2
+    SWAP xmm1, xmm3
+    mov  r2q, matrixmp
+    mov  r3q, ptrmp
+    DEFINE_ARGS dst, width, matrix, ptr, c0, c1, c2, c3, c4, c5, c6, c7, c8, r, x
+%endif
+    movsxdifnidn widthq, widthd
+    VBROADCASTSS m0, xmm0
+    VBROADCASTSS m1, xmm1

This and every other xmm# case should instead be xm#, to ensure the swapping
is taken into account.

Sorry, I don't follow. Could you please explain why I have to use xm# for the
swap to take effect (does SWAP on xmm# not work in WIN64 asm)?
And how should I do it?

SWAP only affects the x86inc-defined macros m#, xm#, ym#, and zm#, not literal
register names, so on WIN64 the instructions above end up encoded as
vbroadcastss zmm2, xmm0 and vbroadcastss zmm3, xmm1.
In fact, now that I check, they end up as vbroadcastss zmm18, xmm0 and
vbroadcastss zmm19, xmm1, because x86inc purposely uses the higher 16 regs
with these macros on all targets to avoid having to call vzeroupper at the
end. This works on unix64 by pure chance, because the floats are effectively
in xmm0 and xmm1 and all calculations then happen on m#, xm# and ym#.

So you'll have to duplicate the VBROADCASTSS lines to broadcast xmm2 and xmm3 to m0 and m1 on WIN64 instead of using SWAP.
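
Something along these lines, as an untested sketch rather than a finished
patch. It is only a fragment of the prologue and assumes the rest of the macro
stays as in the quoted patch, with the two SWAP lines dropped from the WIN64
block. The comments about which registers hold rdiv and bias reflect the usual
Win64 and SysV x86-64 calling conventions for this argument order:

```nasm
    movsxdifnidn widthq, widthd
%if WIN64
    ; Win64: rdiv and bias are the 3rd and 4th arguments, so they arrive
    ; in xmm2 and xmm3. Name the physical registers directly; no SWAP.
    VBROADCASTSS m0, xmm2
    VBROADCASTSS m1, xmm3
%else
    ; SysV x86-64: the first two float arguments arrive in xmm0 and xmm1.
    VBROADCASTSS m0, xmm0
    VBROADCASTSS m1, xmm1
%endif
```

The sources are written as literal xmm# names on purpose: they refer to the
registers the ABI put the floats in, while m0 and m1 remain x86inc macros that
the assembler is free to map to whatever registers it picked.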