On 11/14/2022 2:58 AM, Wang, Bin wrote:
-----Original Message-----
From: ffmpeg-devel <ffmpeg-devel-boun...@ffmpeg.org> On Behalf Of James Almer
Sent: Monday, November 14, 2022 10:43 AM
To: ffmpeg-devel@ffmpeg.org
Subject: Re: [FFmpeg-devel] [PATCH v7] libavfilter/x86/vf_convolution: add sobel filter optimization and unit test with intel AVX512 VNNI

On 11/4/2022 5:29 AM, bin.wang-at-intel....@ffmpeg.org wrote:

+%macro FILTER_SOBEL 0
+%if UNIX64
+cglobal filter_sobel, 4, 15, 7, dst, width, matrix, ptr, c0, c1, c2, c3, c4, c5, c6, c7, c8, r, x
+%else
+cglobal filter_sobel, 4, 15, 7, dst, width, rdiv, bias, matrix, ptr, c0, c1, c2, c3, c4, c5, c6, c7, c8, r, x
+%endif
+%if WIN64
+    SWAP xmm0, xmm2
+    SWAP xmm1, xmm3
+    mov r2q, matrixmp
+    mov r3q, ptrmp
+    DEFINE_ARGS dst, width, matrix, ptr, c0, c1, c2, c3, c4, c5, c6, c7, c8, r, x
+%endif
+    movsxdifnidn widthq, widthd
+    VBROADCASTSS m0, xmm0
+    VBROADCASTSS m1, xmm1
+

This and every other xmm# case should instead be xm#, to ensure the swapping is taken into account.

Sorry, I can't get your point. Could you please help to explain why I have to use xm# to ensure the swapping operation (swap xmm# can't work in WIN64 asm)? And how to do it?

SWAP only affects the x86inc defined macros m#, xm#, ym#, and zm#, so those instructions above end up encoded as vbroadcastss zmm2, xmm0 and vbroadcastss zmm3, xmm1 on WIN64.

In fact, now that I check it, they end up as vbroadcastss zmm18, xmm0 and vbroadcastss zmm19, xmm1 because x86inc is purposely using the higher 16 regs with these macros on all targets to avoid having to call vzeroupper at the end. This works on unix64 by pure chance because the floats were effectively in xmm0 and xmm1 and all calculations then happen on m#, xm# and ym#.

So you'll have to duplicate the VBROADCASTSS lines to broadcast xmm2 and xmm3 to m0 and m1 on WIN64 instead of using SWAP.
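
A minimal, untested sketch of that change, assuming the rest of the macro stays as posted. The SWAP lines are dropped, and the only new part is the %if WIN64 / %else split around the broadcasts. The literal xmm2/xmm3 names are real register names rather than x86inc macros, so they are not remapped:

```asm
%if WIN64
    mov r2q, matrixmp
    mov r3q, ptrmp
    DEFINE_ARGS dst, width, matrix, ptr, c0, c1, c2, c3, c4, c5, c6, c7, c8, r, x
%endif
    movsxdifnidn widthq, widthd
%if WIN64
    ; rdiv and bias are the 3rd and 4th arguments in the WIN64 signature,
    ; so they arrive in xmm2 and xmm3
    VBROADCASTSS m0, xmm2
    VBROADCASTSS m1, xmm3
%else
    ; on UNIX64 the two floats arrive in xmm0 and xmm1
    VBROADCASTSS m0, xmm0
    VBROADCASTSS m1, xmm1
%endif
```

After this point the code should only touch m#, xm# and ym#, so the two targets see the same register layout.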