https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109154
--- Comment #6 from Richard Biener <rguenth at gcc dot gnu.org> ---
We have

  if (_1 < 0.0)

  # PHI < .., ..> // the if above only controls which PHI arg we take
  ... code ...
  if (_1 < 1.0e+0)

  # PHI < .., ...> // likewise

and are threading _1 < 0.0 -> _1 < 1.0e+0

So on the _1 < 0.0 path we are eliding one conditional jump.  The main
pessimization would be that we now have an additional entry to the
2nd PHI, but with the same value as the _1 < 1.0 path, so a forwarder
would be able to "solve" that IL detail.

The only heuristic I can imagine doing is to avoid extra entries into
a diamond that's really just a simple COND_EXPR.

What's odd is that with -fno-thread-jumps it's the dom2 pass that
optimizes the branching of the first compare:

   _1 = (float) l_21;
   _2 = _1 < 0.0;
   zone1_15 = (int) _2;
-  if (_1 < 0.0)
-    goto <bb 4>; [41.00%]
-  else
-    goto <bb 5>; [59.00%]
-
-  <bb 4> [local count: 391808389]:
-
-  <bb 5> [local count: 955630225]:
-  # iftmp.0_10 = PHI <zone1_15(4), 1(3)>
   fasten_main_natpro_chrg_init.2_3 = fasten_main_natpro_chrg_init;
-  _4 = fasten_main_natpro_chrg_init.2_3 * iftmp.0_10;
-  _5 = (float) _4;
+  _4 = fasten_main_natpro_chrg_init.2_3;
+  _5 = (float) fasten_main_natpro_chrg_init.2_3;

but we fail to see this opportunity earlier (maybe the testcase is too
simplified?).  When we thread the jump this simplification opportunity
is lost.

I wonder exactly how DOM handles this - it does

Visiting conditional with predicate: if (_1 < 0.0)

With known ranges
        _1: [frange] float VARYING +-NAN

Predicate evaluates to: DON'T KNOW
LKUP STMT _1 lt_expr 0.0
FIND: _2
  Replaced redundant expr '_1 < 0.0' with '_2'
0>>> COPY _2 = 0
<<<< COPY _2 = 0

Optimizing block #4

1>>> STMT 1 = _1 ordered_expr 0.0
1>>> STMT 1 = _1 ltgt_expr 0.0
1>>> STMT 1 = _1 le_expr 0.0
1>>> STMT 1 = _1 ne_expr 0.0
1>>> STMT 0 = _1 eq_expr 0.0
1>>> STMT 0 = truth_not_expr _1 < 0.0
0>>> COPY _2 = 1
Match-and-simplified (int) _2 to 1
0>>> COPY zone1_15 = 1

how does it go backwards adjusting zone1_15?!
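
For readers following the dom2 diff, the effect can be sketched in plain C.
This is a hypothetical reduction written from the GIMPLE in the comment, not
the testcase attached to the PR; the names select_before, select_after and
check_phi_is_constant are made up for illustration. On the true edge of
_1 < 0.0 the saved compare result is 1, and the other edge feeds the
constant 1, so the PHI is 1 on both edges and the multiplication drops out.

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>

/* Hypothetical reduction, not the testcase from the PR.  Mirrors:
     _2 = _1 < 0.0;  zone1 = (int) _2;  if (_1 < 0.0) ...
     iftmp = PHI <zone1(4), 1(3)>;  _4 = chrg_init * iftmp;  */
static int select_before(float x, int chrg_init)
{
    int zone1 = (x < 0.0f);
    int iftmp = (x < 0.0f) ? zone1 : 1;
    return chrg_init * iftmp;
}

/* What is left once the PHI is known to be 1 on both incoming edges. */
static int select_after(float x, int chrg_init)
{
    (void) x;
    return chrg_init;
}

/* Assert that both forms agree on a few inputs, including NaN (for which
   x < 0.0f is false), and return how many inputs were checked. */
static int check_phi_is_constant(void)
{
    const float inputs[] = { -INFINITY, -1.5f, -0.0f, 0.0f, 2.0f, INFINITY, NAN };
    int checked = 0;
    for (size_t i = 0; i < sizeof inputs / sizeof inputs[0]; i++) {
        assert(select_before(inputs[i], 7) == select_after(inputs[i], 7));
        checked++;
    }
    return checked;
}
```

Calling check_phi_is_constant() from a small main() exercises both forms.
The NaN input matters because the dump reports _1 as VARYING +-NAN: the
compare is simply false there, so that path also takes the constant 1.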
Anyhow - EVRP doesn't seem to handle any of this (replacing PHI arguments
by values on edges to see if the PHI becomes singleton, or even handling
the PHI "properly"):

Visiting conditional with predicate: if (_1 < 0.0)

With known ranges
        _1: [frange] float VARYING +-NAN

Predicate evaluates to: DON'T KNOW
Not folded
Global Exported: iftmp.0_11 = [irange] int [0, 1] NONZERO 0x1
Folding PHI node: iftmp.0_11 = PHI <zone1_17(4), 1(3)>
No folding possible

ah, probably it's the missing CSE there:

  <bb 3> :
  _1 = (float) l_10;
  _2 = _1 < 0.0;
  zone1_17 = (int) _2;
  if (_1 < 0.0)

we are not considering replacing the FP compare control if (_1 < 0.0)
with an integer compare control if (_2 != 0).  Maybe we should do that?

So to me it doesn't look like a bug in jump threading but at most a
phase ordering issue or an early missed optimization.  Yes, we could
eventually tame down jump threading with some additional heuristic.  But
IMHO optimizing the above earlier would be better?
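
The suggested CSE of the compare can be illustrated at the C level. The
sketch below is an assumption-laden illustration, not GCC code and not the
PR's testcase; branch_on_fp_compare, branch_on_saved_flag and
check_controls_agree are hypothetical names. It only shows that branching
on the saved boolean (_2 != 0) selects the same edge as re-evaluating the
FP compare. With the integer control, the flag is known to be nonzero on
the true edge, which an integer range analysis could plausibly use to see
that the PHI argument is 1.

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>

/* Branch controlled by a second evaluation of the FP compare,
   as in:  _2 = _1 < 0.0;  zone1 = (int) _2;  if (_1 < 0.0)  */
static int branch_on_fp_compare(float x)
{
    int zone1 = (x < 0.0f);
    if (x < 0.0f)
        return zone1;
    return 1;
}

/* Branch controlled by the already computed boolean,
   as in:  if (_2 != 0)  */
static int branch_on_saved_flag(float x)
{
    int flag = (x < 0.0f);
    int zone1 = flag;
    if (flag != 0)
        return zone1;   /* flag is 0 or 1 and nonzero here, so zone1 is 1 */
    return 1;
}

/* Assert that both controls agree on a few inputs, including NaN,
   and return how many inputs were checked. */
static int check_controls_agree(void)
{
    const float inputs[] = { -INFINITY, -3.0f, -0.0f, 0.0f, 0.5f, NAN };
    int checked = 0;
    for (size_t i = 0; i < sizeof inputs / sizeof inputs[0]; i++) {
        assert(branch_on_fp_compare(inputs[i]) == branch_on_saved_flag(inputs[i]));
        checked++;
    }
    return checked;
}
```

Both functions return 1 for every input here, which matches the point of
the comment: the PHI is constant, and the open question is only which pass
gets to notice that and when.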