This is an automated email from the git hooks/post-receive script.
Git pushed a commit to branch master
in repository ffmpeg.
The following commit(s) were added to refs/heads/master by this push:
new f43f609bb8 avutil/x86/tx_float: add missing vzeroupper to 15xM PFA FFT
f43f609bb8 is described below
commit f43f609bb8a288e62a954dbde497ec6d0e16cae6
Author: Kacper Michajłow <[email protected]>
AuthorDate: Fri Jun 5 20:11:46 2026 +0200
Commit: Kacper Michajłow <[email protected]>
CommitDate: Tue Jun 9 17:54:21 2026 +0000
avutil/x86/tx_float: add missing vzeroupper to 15xM PFA FFT
The AVX2 15xM PFA FFT calls its second-dimension subtransform with dirty
upper YMM halves. That subtransform may be a legacy-SSE codelet (fft4 is
SSE2 only), causing AVX<->SSE transition penalties. Clear the upper halves
with vzeroupper after the first dimension, before the calls.
Detected by the `sde64 -ast` FATE job.
Fixes: ace42cf581f8c06872bfb58cf575d9e8bd398c0a
---
libavutil/x86/tx_float.asm | 2 ++
1 file changed, 2 insertions(+)
diff --git a/libavutil/x86/tx_float.asm b/libavutil/x86/tx_float.asm
index 87be21c2d6..7dedf54312 100644
--- a/libavutil/x86/tx_float.asm
+++ b/libavutil/x86/tx_float.asm
@@ -1874,6 +1874,8 @@ cglobal fft_pfa_15xM_float, 4, 14, 16, 320, ctx, out, in, stride, len, lut, buf,
mov lutq, [ctxq + AVTXContext.map] ; load subtransform's map
movsxd lenq, dword [ctxq + AVTXContext.len] ; load subtransform's length
+ vzeroupper
+
.dim2:
call tgt5q ; call the FFT
lea inq, [inq + lenq*8]
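
The reasoning in the commit message can be illustrated with a toy model of the dirty-upper-state rule. This is only a sketch: the instruction labels and the `count_flagged_sse_ops` helper are hypothetical, it is not how `sde64 -ast` is implemented, and real penalties differ between microarchitectures. It assumes that a 256-bit AVX write marks the upper YMM halves dirty, that `vzeroupper` clears that state, and that every legacy-SSE instruction executed while the state is dirty is worth flagging.

```python
# Toy model of AVX<->SSE transition tracking (illustrative assumption,
# not the actual behaviour of any checker or CPU).

AVX256 = "avx256"          # VEX-encoded op writing a full YMM register
LEGACY_SSE = "legacy_sse"  # non-VEX SSE op, e.g. inside an SSE2-only codelet
VZEROUPPER = "vzeroupper"  # clears the upper halves of all YMM registers


def count_flagged_sse_ops(stream):
    """Count legacy-SSE ops that run while the upper YMM halves are dirty."""
    upper_dirty = False
    flagged = 0
    for op in stream:
        if op == AVX256:
            upper_dirty = True
        elif op == VZEROUPPER:
            upper_dirty = False
        elif op == LEGACY_SSE and upper_dirty:
            flagged += 1
    return flagged


# First dimension uses AVX2, then the second-dimension codelet is legacy SSE.
before_patch = [AVX256, AVX256, LEGACY_SSE, LEGACY_SSE]
# Same sequence with vzeroupper placed before the subtransform calls.
after_patch = [AVX256, AVX256, VZEROUPPER, LEGACY_SSE, LEGACY_SSE]

print(count_flagged_sse_ops(before_patch))
print(count_flagged_sse_ops(after_patch))
```

In this model a single `vzeroupper` before the `.dim2` loop is enough, because the legacy-SSE subtransform never dirties the upper halves again between calls.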