https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97156
            Bug ID: 97156
           Summary: Missed optimization [x86-64] tzcnt unnecessarily zeros out
                    destination register
           Product: gcc
           Version: 10.2.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c
          Assignee: unassigned at gcc dot gnu.org
          Reporter: goldstein.w.n at gmail dot com
  Target Milestone: ---

See: https://gcc.godbolt.org/z/Y591MW

void __attribute__((noinline))
tzncnt_not_just_return(uint64_t v) {
    for (; v; ) {
        uint64_t i;
        i = _tzcnt_u64(v);
        v &= (v - 1);
        bench_do_not_optimize_out(i);
    }
    bench_flush_all_pending();
}

compiles to

tzncnt_not_just_return(unsigned long):
        jmp     .L11
.L6:
        xor     eax, eax
        tzcnt   rax, rdi
        blsr    rdi, rdi
.L11:
        test    rdi, rdi
        jne     .L6
        ret

The "xor eax, eax" isn't needed.
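
The snippet in the report depends on benchmark helpers that are not shown, so it does not compile on its own. The following is a minimal standalone sketch of the same loop shape, for checking whether a given compiler version and tuning setting still emits the extra xor. The names keep_value and visit_set_bits are hypothetical and not from the report. keep_value is only an assumed stand-in for bench_do_not_optimize_out, and __builtin_ctzll is swapped in for _tzcnt_u64 so the file also builds without -mbmi. The out parameter is an addition that makes the result checkable and may change the generated code slightly.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the report's bench_do_not_optimize_out():
   an empty asm statement that takes x as a register input, so the
   compiler must actually compute the value. */
static inline void keep_value(uint64_t x) {
    __asm__ volatile("" : : "r"(x) : "memory");
}

/* Same loop shape as in the report: take the index of the lowest set
   bit, then clear that bit. Returns the number of set bits visited.
   If out is non-NULL, the bit indices are stored there in ascending
   order (the caller must provide room for up to 64 entries). */
__attribute__((noinline))
size_t visit_set_bits(uint64_t v, unsigned *out) {
    size_t n = 0;
    while (v) {
        /* v is nonzero here, so __builtin_ctzll is well defined. */
        uint64_t i = (uint64_t)__builtin_ctzll(v);
        v &= v - 1; /* clear the lowest set bit */
        keep_value(i);
        if (out)
            out[n] = (unsigned)i;
        n++;
    }
    return n;
}
```

Compiling this with something like gcc -O2 -mbmi -S -masm=intel and reading the loop body should show whether a zeroing instruction still precedes the trailing-zero count. The result is expected to vary with the compiler version and the -mtune setting, so no particular output is claimed here.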