https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104441
Bug ID: 104441
Summary: [12 Regression] vzeroupper is placed at the wrong place
Product: gcc
Version: 12.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: hjl.tools at gmail dot com
CC: crazylht at gmail dot com, lili.cui at intel dot com
Target Milestone: ---
Target: x86-64
Created attachment 52375
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52375&action=edit
A testcase
When compiled with -march=skylake -Wno-attributes, GCC 12 generated:
.L3:
vmovd (%rdi), %xmm0
vmovd (%rdi,%r13), %xmm1
vpinsrd $1, (%rdi,%r12), %xmm1, %xmm1
vpinsrd $1, (%rdi,%rsi), %xmm0, %xmm0
vmovd (%rax,%rbx), %xmm2
vinserti128 $0x1, %xmm1, %ymm0, %ymm0
vmovd (%rax), %xmm1
vpinsrd $1, (%rax,%rcx), %xmm1, %xmm1
vpinsrd $1, (%rax,%r11), %xmm2, %xmm2
addl $4, %edx
vinserti128 $0x1, %xmm2, %ymm1, %ymm1
vpsadbw %ymm1, %ymm0, %ymm0
vpaddd %ymm0, %ymm3, %ymm0
vmovdqa %ymm0, %ymm3
addq %r10, %rdi
addq %r9, %rax
cmpl %r8d, %edx
jb .L3
vzeroupper <<< Clears the upper 128 bits.
popq %rbx
popq %r12
vextracti128 $0x1, %ymm3, %xmm3 <<< The upper 128 bits of YMM3 are used.
vpaddd %xmm3, %xmm0, %xmm0
popq %r13
vmovd %xmm0, %eax
popq %rbp
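To make the ordering problem concrete, here is a minimal sketch in plain C. It is not the attached testcase and uses no AVX instructions. It models a YMM register as eight 32-bit lanes and compares folding the upper half into the lower half before and after the upper half is cleared. The names ymm_model, model_vzeroupper, fold_then_clear and clear_then_fold are hypothetical and exist only for this illustration.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical model of a 256-bit YMM register as eight 32-bit lanes. */
typedef struct { uint32_t lane[8]; } ymm_model;

/* Models the effect of vzeroupper on one register:
   lanes 4..7 (bits 255:128) become zero. */
void model_vzeroupper(ymm_model *r)
{
    memset(&r->lane[4], 0, 4 * sizeof(uint32_t));
}

/* Models vextracti128 $1 followed by vpaddd on the XMM halves:
   each upper lane is added to the matching lower lane. */
static void fold_upper_into_lower(ymm_model *r)
{
    for (int i = 0; i < 4; i++)
        r->lane[i] += r->lane[i + 4];
}

/* Expected order: reduce first, clear the upper half afterwards.
   Returns lane 0, as vmovd %xmm0, %eax would. */
uint32_t fold_then_clear(ymm_model acc)
{
    fold_upper_into_lower(&acc);
    model_vzeroupper(&acc);
    return acc.lane[0];
}

/* Order seen in the listing: the upper half is cleared before it is
   read, so its contribution to the sum is lost. */
uint32_t clear_then_fold(ymm_model acc)
{
    model_vzeroupper(&acc);
    fold_upper_into_lower(&acc);
    return acc.lane[0];
}
```

With an accumulator whose lanes are {10, 0, 20, 0, 30, 0, 40, 0}, fold_then_clear returns 40, while clear_then_fold returns 10 because lane 4 has already been zeroed.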
This is triggered by
commit 9775e465c1fbfc32656de77c618c61acf5bd905d
Author: H.J. Lu <hjl.tools at gmail dot com>
Date: Tue Jul 27 07:46:04 2021 -0700
x86: Don't set AVX_U128_DIRTY when zeroing YMM/ZMM register