This is an automated email from the git hooks/post-receive script.

Git pushed a commit to branch master
in repository ffmpeg.

The following commit(s) were added to refs/heads/master by this push:
     new f43f609bb8 avutil/x86/tx_float: add missing vzeroupper to 15xM PFA FFT
f43f609bb8 is described below

commit f43f609bb8a288e62a954dbde497ec6d0e16cae6
Author:     Kacper Michajłow <[email protected]>
AuthorDate: Fri Jun 5 20:11:46 2026 +0200
Commit:     Kacper Michajłow <[email protected]>
CommitDate: Tue Jun 9 17:54:21 2026 +0000

    avutil/x86/tx_float: add missing vzeroupper to 15xM PFA FFT
    
    The AVX2 15xM PFA FFT calls its second-dimension subtransform with dirty
    upper YMM halves. That subtransform may be a legacy-SSE codelet (fft4 is
    SSE2 only), causing AVX<->SSE transition penalties. Clear the upper halves
    with vzeroupper after the first dimension, before the calls.
    
    Detected with the `sde64 -ast` FATE job.
    
    Fixes: ace42cf581f8c06872bfb58cf575d9e8bd398c0a
---
 libavutil/x86/tx_float.asm | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/libavutil/x86/tx_float.asm b/libavutil/x86/tx_float.asm
index 87be21c2d6..7dedf54312 100644
--- a/libavutil/x86/tx_float.asm
+++ b/libavutil/x86/tx_float.asm
@@ -1874,6 +1874,8 @@ cglobal fft_pfa_15xM_float, 4, 14, 16, 320, ctx, out, in, stride, len, lut, buf,
     mov lutq, [ctxq + AVTXContext.map]              ; load subtransform's map
     movsxd lenq, dword [ctxq + AVTXContext.len]     ; load subtransform's length
 
+    vzeroupper
+
 .dim2:
     call tgt5q                                      ; call the FFT
     lea inq,  [inq  + lenq*8]

_______________________________________________
ffmpeg-cvslog mailing list -- [email protected]
To unsubscribe send an email to [email protected]
