https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68695
--- Comment #2 from Richard Biener <rguenth at gcc dot gnu.org> ---
On x86_64 the testcase produces

foo:
.LFB0:
        .cfi_startproc
        cmpl    %esi, %edi
        movl    %edx, %ecx
        movl    %edx, %eax
        cmovle  %esi, %ecx
        cmovle  %edi, %eax
        imull   %ecx, %eax
        ret

we expand from

  # i_1 = PHI <x_3(D)(2), a_5(D)(3)>
  # j_2 = PHI <y_4(D)(2), a_5(D)(3)>
  _6 = i_1 * j_2;
  return _6;

and on x86_64 we coalesce i_1 with x_3 and j_2 with y_4 which would match
your "before change" generated code (copy a_5 to both other regs).

So RTL ifcvt dumps are required here, too.
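
For readers without the attachment at hand, a C function of roughly the following shape would yield the PHI nodes and the cmov sequence quoted above. This is only a sketch reconstructed from the GIMPLE and the x86_64 assembly, not the testcase from the PR. The parameter order (x, y, a) is an assumption inferred from the use of %edi, %esi and %edx under the SysV calling convention, and the direction of the comparison is inferred from the cmovle instructions.

```c
/* Hypothetical reconstruction, not the PR's testcase.
   Basic block 2 supplies x and y to the PHIs; basic block 3
   (the taken branch) supplies a to both of them.  */
int
foo (int x, int y, int a)
{
  int i = x;   /* i_1 = PHI <x_3(D)(2), ...> */
  int j = y;   /* j_2 = PHI <y_4(D)(2), ...> */

  if (x > y)
    {
      /* Both PHI arguments from this block are a_5(D).  */
      i = a;
      j = a;
    }

  return i * j;   /* _6 = i_1 * j_2 */
}
```

With x <= y the result is x * y, which is the case where both cmovle instructions fire. Otherwise both registers keep the copy of a and the result is a * a.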