[Bug target/95791] New: Unnecessary vzeroupper when only using zmm16 through zmm31

josephcsible at gmail dot com Sat, 20 Jun 2020 13:01:48 -0700

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95791


            Bug ID: 95791
           Summary: Unnecessary vzeroupper when only using zmm16 through
                    zmm31
           Product: gcc
           Version: 10.1.0
            Status: UNCONFIRMED
          Keywords: missed-optimization, ssemmx
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: josephcsible at gmail dot com
  Target Milestone: ---
            Target: x86_64-linux-gnu

Consider this C code:

void f(void) {
    __asm__ __volatile__("" ::: "zmm16");
}

When compiled with "-O2 -mavx512f", GCC emits a vzeroupper instruction. This is
unnecessary: using zmm16 through zmm31 doesn't incur the AVX-to-SSE transition
penalty, and vzeroupper doesn't even affect those registers.
