https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109154
--- Comment #59 from Tamar Christina <tnfchris at gcc dot gnu.org> ---
I've sent two patches upstream this morning to fix the remaining ifcvt issues:

https://gcc.gnu.org/pipermail/gcc-patches/2023-July/623848.html
https://gcc.gnu.org/pipermail/gcc-patches/2023-July/623849.html

This brings us within 5% of GCC-12, but not all the way there. The reason is
that PRE behaves differently since GCC-13.

In GCC-12 after PRE we'd have the following CFG:

  <bb 15> [local count: 623751662]:
  _16 = distbb_79 * iftmp.1_100;
  iftmp.8_80 = 1.0e+0 - _16;
  _160 = chrg_init_75 * iftmp.8_80;

  <bb 16> [local count: 1057206200]:
  # iftmp.8_39 = PHI <iftmp.8_80(15), 1.0e+0(14)>
  # prephitmp_161 = PHI <_160(15), chrg_init_75(14)>
  if (distbb_79 < iftmp.0_96)
    goto <bb 17>; [50.00%]
  else
    goto <bb 18>; [50.00%]

  <bb 17> [local count: 528603100]:
  _164 = ABS_EXPR <prephitmp_161>;
  _166 = -_164;

  <bb 18> [local count: 1057206200]:
  # iftmp.9_40 = PHI <1.0e+0(17), 0.0(16)>
  # prephitmp_163 = PHI <prephitmp_161(17), 0.0(16)>
  # prephitmp_167 = PHI <_166(17), 0.0(16)>
  if (iftmp.2_38 != 0)
    goto <bb 20>; [50.00%]
  else
    goto <bb 19>; [50.00%]

  <bb 19> [local count: 528603100]:

  <bb 20> [local count: 1057206200]:
  # iftmp.10_41 = PHI <prephitmp_167(18), prephitmp_163(19)>

That is to say, in both branches we always do the multiply; gimple-isel then
correctly turns this into a COND_MUL based on the mask.

Since GCC-13 PRE now does some extra optimizations:

  <bb 15> [local count: 1057206200]:
  # l_107 = PHI <l_84(21), 0(14)>
  _13 = lpos_x[l_107];
  x_72 = _13 - p_atom$x_81;
  powmult_73 = x_72 * x_72;
  distbb_74 = powmult_73 - radij_58;
  if (distbb_74 >= 0.0)
    goto <bb 17>; [59.00%]
  else
    goto <bb 16>; [41.00%]

  <bb 16> [local count: 433454538]:
  _165 = ABS_EXPR <chrg_init_70>;
  _168 = -_165;
  goto <bb 19>; [100.00%]

  <bb 17> [local count: 623751662]:
  _14 = distbb_74 * iftmp.1_101;
  iftmp.8_76 = 1.0e+0 - _14;
  if (distbb_74 < iftmp.0_97)
    goto <bb 18>; [20.00%]
  else
    goto <bb 19>; [80.00%]

  <bb 18> [local count: 124750334]:
  _162 = chrg_init_70 * iftmp.8_76;
  _164 = ABS_EXPR <_162>;
  _167 = -_164;

  <bb 19> [local count: 1057206200]:
  # iftmp.9_38 = PHI <1.0e+0(18), 0.0(17), 1.0e+0(16)>
  # iftmp.8_102 = PHI <iftmp.8_76(18), iftmp.8_76(17), 1.0e+0(16)>
  # prephitmp_163 = PHI <_162(18), 0.0(17), chrg_init_70(16)>
  # prephitmp_169 = PHI <_167(18), 0.0(17), _168(16)>
  if (iftmp.2_36 != 0)
    goto <bb 21>; [50.00%]
  else
    goto <bb 20>; [50.00%]

That is to say, the multiplication is now completely skipped in one branch.
This should be better for scalar code, but for vector we have to do the
multiplication anyway.

After ifcvt we end up with:

  _162 = chrg_init_70 * iftmp.8_76;
  _164 = ABS_EXPR <_162>;
  _167 = -_164;
  _ifc__166 = distbb_74 < iftmp.0_97 ? _167 : 0.0;
  prephitmp_169 = distbb_74 >= 0.0 ? _ifc__166 : _168;

instead of

  _160 = chrg_init_75 * iftmp.8_80;
  prephitmp_161 = distbb_79 < 0.0 ? chrg_init_75 : _160;
  _164 = ABS_EXPR <prephitmp_161>;
  _166 = -_164;
  prephitmp_167 = distbb_79 < iftmp.0_96 ? _166 : 0.0;

Previously we'd make a COND_MUL and a COND_NEG and so didn't need a VCOND in
the end; now we select after the multiplication, so we only have a COND_NEG
followed by a VCOND.

This is obviously worse, but I have no idea how to recover it. Any ideas?
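
For anyone who wants to poke at the two if-converted sequences above outside the original testcase, here is a rough scalar C transcription of them. This is only a sketch: the function and parameter names (shape_gcc12, shape_gcc13, scale, cutoff, abs_expr) are made up for illustration and do not come from the dumps or from the testcase. It also assumes no NaNs and that cutoff (standing in for iftmp.0) is >= 0, so that distbb < 0.0 implies distbb < cutoff; without those assumptions the two functions are not equivalent.

```c
#include <assert.h>

/* Hypothetical stand-in for ABS_EXPR; ignores the sign of zero. */
static double abs_expr(double x)
{
    return x < 0.0 ? -x : x;
}

/* GCC-12 shape: select the multiplication result first, negate after. */
static double shape_gcc12(double distbb, double chrg_init,
                          double scale, double cutoff)
{
    double factor = 1.0 - distbb * scale;          /* iftmp.8_80    */
    double mul = chrg_init * factor;               /* _160          */
    double sel = distbb < 0.0 ? chrg_init : mul;   /* prephitmp_161 */
    double neg = -abs_expr(sel);                   /* _166          */
    return distbb < cutoff ? neg : 0.0;            /* prephitmp_167 */
}

/* GCC-13 shape: negate both candidates, then select twice at the end. */
static double shape_gcc13(double distbb, double chrg_init,
                          double scale, double cutoff)
{
    double factor = 1.0 - distbb * scale;            /* iftmp.8_76    */
    double mul = chrg_init * factor;                 /* _162          */
    double neg_mul = -abs_expr(mul);                 /* _167          */
    double neg_init = -abs_expr(chrg_init);          /* _168          */
    double inner = distbb < cutoff ? neg_mul : 0.0;  /* _ifc__166     */
    return distbb >= 0.0 ? inner : neg_init;         /* prephitmp_169 */
}

/* Returns 1 when both shapes agree for the given inputs. */
static int shapes_agree(double distbb, double chrg_init,
                        double scale, double cutoff)
{
    assert(cutoff >= 0.0);  /* assumption stated above, not from the dumps */
    return shape_gcc12(distbb, chrg_init, scale, cutoff)
           == shape_gcc13(distbb, chrg_init, scale, cutoff);
}
```

In the first function the value fed to the negate is already selected, so only the final select depends on the result of the arithmetic. In the second, both selects sit after the ABS/negate chain, which is the "select after the multiplication" situation described above.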