On Thu, Feb 17, 2022 at 10:49:48AM +0100, Richard Biener via Gcc-patches wrote:
> On Thu, Feb 17, 2022 at 8:52 AM Uros Bizjak via Gcc-patches
> <gcc-patches@gcc.gnu.org> wrote:
> >
> > On Thu, Feb 17, 2022 at 6:25 AM Hongtao Liu via Gcc-patches
> > <gcc-patches@gcc.gnu.org> wrote:
> > >
> > > On Thu, Feb 17, 2022 at 12:26 PM H.J. Lu via Gcc-patches
> > > <gcc-patches@gcc.gnu.org> wrote:
> > > >
> > > > Reading YMM registers with all zero bits needs VZEROUPPER on Sandy
> > > > Bridge, Ivy Bridge, Haswell, Broadwell and Alder Lake to avoid SSE <-> AVX
> > > > transition penalty.  Add TARGET_READ_ZERO_YMM_ZMM_NEED_VZEROUPPER to
> > > > generate vzeroupper instruction after loading all-zero YMM/ZMM registers
> > > > and enable it by default.
> > > Shouldn't TARGET_READ_ZERO_YMM_ZMM_NONEED_VZEROUPPER sound a bit
> > > smoother?
> > > Because originally we needed to add vzeroupper to all avx<->sse cases,
> > > now it's a tune to indicate that we don't need to add it in some
> >
> > Perhaps we should go from the other side and use
> > X86_TUNE_OPTIMIZE_AVX_READ for new processors?
>
> Btw, do you have a micro-benchmark to test this on AMD archs?
>

I don't believe AMD CPUs need vzeroupper.

H.J.