https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117232

            Bug ID: 117232
           Summary: EQ/NE comparison between avx512 kmask and -1 can be
                    optimized with kortest with checking CF when using
                    cmov
           Product: gcc
           Version: 15.0
            Status: UNCONFIRMED
          Keywords: missed-optimization
          Severity: enhancement
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: pinskia at gcc dot gnu.org
  Target Milestone: ---
            Target: x86_64-linux-gnu

This is an expansion of PR 113609, which showed up when I improved phiopt's
operation factoring to handle more than just single-operand operations.

New reduced testcase that fails to use kortestw (even without my phiopt
improvements):
```
#include <immintrin.h>

int
cmp_vector_je_mask64_t(__m512i a, __m512i b, int c, int d) {
    __mmask64 k = _mm512_cmpeq_epi8_mask (a, b);
    return k == (__mmask64) -1 ? c : d;
}
```

GCC currently produces:
```
        vpcmpb  $0, %zmm1, %zmm0, %k0
        kmovq   %k0, %rax
        cmpq    $-1, %rax
        movl    %edi, %eax
        cmovne  %esi, %eax
        ret
```

While LLVM produces:
```
        movl    %edi, %eax
        vpcmpneqd       %zmm1, %zmm0, %k0
        kortestw        %k0, %k0
        cmovnel %esi, %eax
        vzeroupper
        retq
```
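
For reference, the optimization in the summary relies on how kortest sets the
flags: ZF is set when the OR of the two mask operands is all zeros, and CF is
set when it is all ones. So `k == -1` can be read directly from CF after
`kortest %k0, %k0`, without moving the mask to a GPR. Below is a minimal
portable sketch of that behavior for the 64-bit case. The names
`kortest_flags`, `model_kortestq` and `select_all_ones` are made up for
illustration; they are not GCC internals or intrinsics.

```c
#include <stdint.h>

/* Hypothetical model of the two flags KORTESTQ defines; illustration only. */
struct kortest_flags {
    int zf;  /* set when (a | b) is all zeros */
    int cf;  /* set when (a | b) is all ones  */
};

static struct kortest_flags model_kortestq(uint64_t a, uint64_t b) {
    uint64_t t = a | b;
    struct kortest_flags f = { t == 0, t == UINT64_MAX };
    return f;
}

/* Same selection as the testcase: c if every mask bit is set, else d.
   With kortest this only needs CF, i.e. a cmovc/cmovnc on the result. */
static int select_all_ones(uint64_t k, int c, int d) {
    return model_kortestq(k, k).cf ? c : d;
}
```

With this model, `select_all_ones(UINT64_MAX, c, d)` yields `c` and any mask
with at least one clear bit yields `d`, which matches `k == (__mmask64) -1 ? c
: d` in the testcase.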
