https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81602
Hongtao Liu <liuhongt at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |liuhongt at gcc dot gnu.org

--- Comment #5 from Hongtao Liu <liuhongt at gcc dot gnu.org> ---
(In reply to Andrew Pinski from comment #4)
> Interesting clang does:
> ```
> movzx ecx, word ptr [rdi + 2*rax]
> popcnt ecx, ecx
> lea rsi, [rsi + 2*rcx]
> ```
>
>
> While GCC 14+ does:
> ```
> xor eax, eax
> add rdi, 2
> mov WORD PTR [rsi], ax
> popcnt ax, WORD PTR [rdi-2]
> and eax, 31
> ```
>
> So clang has a zero extend before the popcount while GCC has it afterwards
> ...

I guess the zero_extend before popcnt is used to clear the popcnt false
dependence (clang uses the same register for source and dest; GCC uses xor
there), and clang knows the upper bits of the popcnt result must be zero, so
there's no zero_extend afterwards.

Maybe we should simplify that at RTL for zero_extend:popcnt or
and:popcnt, imm.
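
As a sanity check on why the trailing `and eax, 31` (or a zero_extend after the popcnt) carries no information, here is a minimal C sketch. It is not taken from the bug report or from GCC; the names `popcount16` and `popcount16_mask_is_redundant` are hypothetical and chosen only for illustration. It walks every 16-bit input and confirms that the count never exceeds 16, so masking with 31 or zero-extending from 16 bits leaves the value unchanged.

```c
#include <stdint.h>

/* Reference 16-bit population count, written without compiler builtins
   so the check does not depend on how the compiler lowers popcnt. */
unsigned popcount16(uint16_t x)
{
    unsigned n = 0;
    while (x) {
        x &= (uint16_t)(x - 1);   /* clear the lowest set bit */
        n++;
    }
    return n;
}

/* Returns 1 if, for every 16-bit input, the count is at most 16 and both
   "count & 31" and a zero-extension of the count from 16 bits leave the
   value unchanged; returns 0 on the first counterexample. */
int popcount16_mask_is_redundant(void)
{
    for (uint32_t v = 0; v <= 0xFFFFu; v++) {
        unsigned c = popcount16((uint16_t)v);
        if (c > 16u)
            return 0;             /* range assumption violated */
        if ((c & 31u) != c)
            return 0;             /* the AND would change the value */
        if ((unsigned)(uint16_t)c != c)
            return 0;             /* the zero_extend would change the value */
    }
    return 1;
}
```

Calling `popcount16_mask_is_redundant()` from a small test driver should return 1. This only demonstrates the value-range argument; whether and where an RTL simplification for zero_extend:popcnt or and:popcnt, imm should be implemented is the open question in this report.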