https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109154

--- Comment #6 from Richard Biener <rguenth at gcc dot gnu.org> ---
We have

  if (_1 < 0.0)


  # PHI < .., ..>  // the if above only controls which PHI arg we take
  ... code ...
  if (_1 < 1.0e+0)

  # PHI < .., ...> // likewise

and are threading _1 < 0.0 -> _1 < 1.0e+0

So on the _1 < 0.0 path we are eliding one conditional jump.  The
main pessimization would be that we now have an additional entry
to the 2nd PHI, but with the same value as the _1 < 1.0 path, so
a forwarder would be able to "solve" that IL detail.
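
As a rough illustration in C terms (a hand-written sketch, not the testcase from this PR; the function names and the selected constants are hypothetical), threading the first compare into the second looks like this: on the path where `f < 0.0` holds, `f < 1.0` is already known to be true, so the second test is skipped there.

```c
/* Unthreaded shape: two independent diamonds, each one only selecting
   a PHI argument.  The constants are placeholders chosen for
   illustration.  */
static int two_selects(float f)
{
    int a = (f < 0.0f) ? 10 : 20;   /* first  PHI */
    int b = (f < 1.0f) ? 1 : 2;     /* second PHI */
    return a + b;
}

/* Hand-threaded shape: the f < 0.0 path jumps straight to the "true"
   side of the second diamond, adding an extra entry to the second PHI
   that carries the same value as the f < 1.0 path.  */
static int two_selects_threaded(float f)
{
    if (f < 0.0f)
        return 10 + 1;              /* f < 0.0 implies f < 1.0 */
    return 20 + ((f < 1.0f) ? 1 : 2);
}
```

Both functions compute the same result; the threaded one only saves a compare on the `f < 0.0` path, at the cost of the extra PHI entry.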

The only heuristic I can imagine doing is to avoid extra entries
into a diamond that's really just a simple COND_EXPR.

What's odd is that with -fno-thread-jumps it's the dom2 pass that
optimizes away the branch on the first compare:

   _1 = (float) l_21;
   _2 = _1 < 0.0;
   zone1_15 = (int) _2;
-  if (_1 < 0.0)
-    goto <bb 4>; [41.00%]
-  else
-    goto <bb 5>; [59.00%]
-
-  <bb 4> [local count: 391808389]:
-
-  <bb 5> [local count: 955630225]:
-  # iftmp.0_10 = PHI <zone1_15(4), 1(3)>
   fasten_main_natpro_chrg_init.2_3 = fasten_main_natpro_chrg_init;
-  _4 = fasten_main_natpro_chrg_init.2_3 * iftmp.0_10;
-  _5 = (float) _4;
+  _4 = fasten_main_natpro_chrg_init.2_3;
+  _5 = (float) fasten_main_natpro_chrg_init.2_3;

but we fail to see this opportunity earlier (maybe the testcase is too
simplified?).  When we thread the jump this simplification opportunity
is lost.
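
Read as C, the IL that dom2 simplifies has roughly the shape below. This is a sketch reconstructed from the dump above, not the original source; the function and parameter names are hypothetical. Since zone1 is only selected on the edge where the compare is true, the PHI result is always 1 and the multiplication is an identity.

```c
/* Hypothetical C rendering of the IL above; names are illustrative.  */
static int scaled_charge(int l, int chrg_init)
{
    float f = (float) l;                 /* _1 = (float) l_21          */
    int zone1 = f < 0.0f;                /* _2 = _1 < 0.0; zone1 = _2  */
    int iftmp = (f < 0.0f) ? zone1 : 1;  /* PHI <zone1_15(4), 1(3)>    */
    return chrg_init * iftmp;            /* _4 = chrg_init * iftmp     */
}

/* Count the inputs in [-n, n] for which the multiplication is not
   the identity.  */
static int count_mismatches(int n, int chrg_init)
{
    int bad = 0;
    for (int l = -n; l <= n; l++)
        if (scaled_charge(l, chrg_init) != chrg_init)
            bad++;
    return bad;
}
```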

I wonder exactly how DOM handles this - it does

Visiting conditional with predicate: if (_1 < 0.0)

With known ranges
        _1: [frange] float VARYING +-NAN

Predicate evaluates to: DON'T KNOW
LKUP STMT _1 lt_expr 0.0
FIND: _2
  Replaced redundant expr '_1 < 0.0' with '_2'
0>>> COPY _2 = 0
<<<< COPY _2 = 0


Optimizing block #4

1>>> STMT 1 = _1 ordered_expr 0.0
1>>> STMT 1 = _1 ltgt_expr 0.0
1>>> STMT 1 = _1 le_expr 0.0
1>>> STMT 1 = _1 ne_expr 0.0
1>>> STMT 0 = _1 eq_expr 0.0
1>>> STMT 0 = truth_not_expr _1 < 0.0
0>>> COPY _2 = 1
Match-and-simplified (int) _2 to 1
0>>> COPY zone1_15 = 1

how does it go backwards adjusting zone1_15?!

Anyhow - EVRP doesn't seem to handle any of this (replacing PHI arguments
by values on edges to see if the PHI becomes singleton, or even handling
the PHI "properly"):

Visiting conditional with predicate: if (_1 < 0.0)

With known ranges
        _1: [frange] float VARYING +-NAN

Predicate evaluates to: DON'T KNOW
Not folded
Global Exported: iftmp.0_11 = [irange] int [0, 1] NONZERO 0x1
Folding PHI node: iftmp.0_11 = PHI <zone1_17(4), 1(3)>
No folding possible

ah, probably it's the missing CSE there:

    <bb 3> :
    _1 = (float) l_10;
    _2 = _1 < 0.0;
    zone1_17 = (int) _2;
    if (_1 < 0.0)

we are not considering replacing the FP compare control if (_1 < 0.0)
with an integer compare control if (_2 != 0).  Maybe we should do that?
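
In C terms the suggested rewrite would look something like the sketch below (hypothetical names; this only illustrates the equivalence, not how a pass would implement it). With the integer form, the true edge directly implies flag == 1, so a PHI taking zone1 from that edge is trivially constant.

```c
/* Before: the control statement repeats the FP compare.  */
static int select_fp(float f)
{
    int flag = f < 0.0f;        /* _2 = _1 < 0.0        */
    int zone1 = (int) flag;     /* zone1_17 = (int) _2  */
    if (f < 0.0f)               /* if (_1 < 0.0)        */
        return zone1;
    return 1;
}

/* After: the control statement reuses the already computed boolean.  */
static int select_int(float f)
{
    int flag = f < 0.0f;
    int zone1 = (int) flag;
    if (flag != 0)              /* if (_2 != 0)         */
        return zone1;
    return 1;
}
```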

So to me it doesn't look like a bug in jump threading but at most a
phase ordering issue or an early missed optimization.

Yes, we could eventually tame down jump threading with some additional
heuristic.  But IMHO optimizing the above earlier would be better?
