https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93565
Segher Boessenkool <segher at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |segher at gcc dot gnu.org

--- Comment #1 from Segher Boessenkool <segher at gcc dot gnu.org> ---
Well, on power9 I get just

        cmpdi 0,3,0
        beq 0,.L2
        cnttzd 3,3
        sldi 9,3,2
        lwzx 9,4,9
        or 3,9,3
        stw 3,0(4)
.L2:
        li 3,0
        blr

so it is more than just CTZ_DEFINED_VALUE_AT_ZERO = 2 .

(Also on power7, power8, but those don't have that neat ctz insn).


On aarch64, combine starts with

insn_cost 4 for    43: r106:DI=x0:DI
      REG_DEAD x0:DI
insn_cost 4 for     2: r98:DI=r106:DI
      REG_DEAD r106:DI
insn_cost 4 for    44: r107:DI=x1:DI
      REG_DEAD x1:DI
insn_cost 4 for     3: r99:DI=r107:DI
      REG_DEAD r107:DI
insn_cost 4 for     7: cc:CC=cmp(r98:DI,0)
insn_cost 4 for     8: pc={(cc:CC==0)?L17:pc}
      REG_DEAD cc:CC
      REG_BR_PROB 536870916
insn_cost 4 for    10: r100:DI=ctz(r98:DI)
      REG_DEAD r98:DI
insn_cost 4 for    12: r101:DI=sign_extend(r100:DI#0)
insn_cost 16 for    14: r104:SI=[r101:DI*0x4+r99:DI]
      REG_DEAD r101:DI
insn_cost 4 for    15: r103:SI=r104:SI|r100:DI#0
      REG_DEAD r104:SI
      REG_DEAD r100:DI
insn_cost 4 for    16: [r99:DI]=r103:SI
      REG_DEAD r103:SI
      REG_DEAD r99:DI
insn_cost 4 for    23: x0:DI=0
insn_cost 0 for    24: use x0:DI

r100 (set in 10) is used later, just like r101 (set in 12).

Trying 10 -> 12:
   10: r100:DI=ctz(r98:DI)
      REG_DEAD r98:DI
   12: r101:DI=sign_extend(r100:DI#0)
Successfully matched this instruction:
(set (reg:DI 100)
    (ctz:DI (reg/v:DI 98 [ x ])))
Successfully matched this instruction:
(set (reg:DI 101 [ _9 ])
    (ctz:DI (reg/v:DI 98 [ x ])))
allowing combination of insns 10 and 12
original costs 4 + 4 = 8
replacement costs 4 + 4 = 8

So, it is *not* duplicating the ctz: the duplicate was already there to
start with, in some sense.
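
For orientation, here is a C function of roughly the shape that the power9 assembly and the aarch64 RTL above appear to implement. It is a sketch reconstructed from the listings, not the testcase attached to the bug; the name ctz_or_store and the parameter names are hypothetical. The count is used twice: as an int value in the OR (r100:DI#0 in insn 15) and as a sign-extended array index (r101 in insn 12), which seems to be the "duplicate" that combine then rewrites as a second ctz.

```c
/* Sketch only: reconstructed from the listings above, not the bug's testcase.
   Assumes GCC or Clang (__builtin_ctzl) and a 64-bit unsigned long.  */
int ctz_or_store(unsigned long x, int *p)
{
    if (x != 0) {                    /* cmpdi / beq: skip everything when x == 0 */
        int c = __builtin_ctzl(x);   /* insn 10: ctz; the result is undefined for 0 */
        /* p[c] needs c widened to 64 bits for addressing (insn 12);
           the OR uses the 32-bit value of c directly (insn 15).  */
        p[0] = p[c] | c;             /* insns 14-16: load, or, store */
    }
    return 0;                        /* insn 23: x0 = 0 */
}
```

The guard is what makes the builtin call well defined, since __builtin_ctzl has an undefined result for a zero argument.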