https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117232

Bug ID: 117232
Summary: EQ/NE comparison between avx512 kmask and -1 can be optimized with kxortest with checking CF when using cmov
Product: gcc
Version: 15.0
Status: UNCONFIRMED
Keywords: missed-optimization
Severity: enhancement
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: pinskia at gcc dot gnu.org
Target Milestone: ---
Target: x86_64-linux-gnu

This is an expansion of PR 113609, which showed up when I improved phiopt's factor operations to handle more than just single-operand operations.

New reduced testcase that fails to use kortestw (even without my phiopt improvements):
```
#include <immintrin.h>

int cmp_vector_je_mask64_t(__m512i a, __m512i b, int c, int d)
{
  __mmask64 k = _mm512_cmpeq_epi8_mask (a, b);
  return k == (__mmask64) -1 ? c : d;
}
```

GCC currently produces:
```
        vpcmpb  $0, %zmm1, %zmm0, %k0
        kmovq   %k0, %rax
        cmpq    $-1, %rax
        movl    %edi, %eax
        cmovne  %esi, %eax
        ret
```

While LLVM produces:
```
        movl    %edi, %eax
        vpcmpneqd       %zmm1, %zmm0, %k0
        kortestw        %k0, %k0
        cmovnel %esi, %eax
        vzeroupper
        retq
```
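
For reference, a minimal scalar sketch of why checking CF after a kortest can replace the kmovq/cmpq $-1 pair. It assumes the documented KORTEST flag behaviour (ZF set when the OR of the two masks is all zeros, CF set when it is all ones) and models the 64-bit form only. The names `model_kortestq`, `select_via_cmp` and `select_via_kortest` are hypothetical helpers for illustration, not anything in GCC.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the two flags written by "kortestq k1, k2":
   ZF is set when (k1 | k2) is all zeros, CF when it is all ones.  */
struct kortest_flags { bool zf; bool cf; };

static struct kortest_flags model_kortestq(uint64_t k1, uint64_t k2)
{
  uint64_t t = k1 | k2;
  struct kortest_flags f = { t == 0, t == UINT64_MAX };
  return f;
}

/* Shape of the current code: move the mask to a GPR and compare with -1.  */
static int select_via_cmp(uint64_t k, int c, int d)
{
  return k == (uint64_t) -1 ? c : d;
}

/* Shape of the requested code: kortest the mask with itself and
   select on CF (a cmov on the carry condition).  */
static int select_via_kortest(uint64_t k, int c, int d)
{
  return model_kortestq(k, k).cf ? c : d;
}

/* Spot-check that both forms agree on a few representative masks.  */
static void check_equivalence(void)
{
  const uint64_t masks[] = { 0, 1, 0xffff, UINT64_MAX - 1, UINT64_MAX };
  for (unsigned i = 0; i < sizeof masks / sizeof masks[0]; i++)
    assert(select_via_cmp(masks[i], 10, 20)
           == select_via_kortest(masks[i], 10, 20));
}
```

Since `k | k == k`, CF after a self-kortest is exactly the `k == -1` predicate, so the mask would not need to leave the k register before the cmov.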