https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102540
--- Comment #5 from rguenther at suse dot de <rguenther at suse dot de> ---
On Fri, 1 Oct 2021, amacleod at redhat dot com wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102540
>
> --- Comment #4 from Andrew Macleod <amacleod at redhat dot com> ---
>
> (In reply to Richard Biener from comment #2)
> > FRE1 has the following difference, simplifying the (unsigned int)
> > truncation.
> >
> >    <bb 2> :
> >    a.0_1 = a;
> >    _2 = (unsigned int) a.0_1;
> >    b = _2;
> > -  c_10 = (long int) _2;
> > +  _6 = a.0_1 & 4294967295;
> > +  c_10 = _6;
> >    if (c_10 != 0)
> >      goto <bb 3>; [INV]
> >    else
> >
>
> Why does FRE make this transformation/simplification?

It's a match.pd transform that transforms a zero-extend from a
smaller precision via two NOP_EXPRs to a single BIT_AND_EXPR,
which is better and more canonical on GIMPLE.

> It removes a relationship between c_10 and _2.  The reason ranger can
> no longer fold _2 == 0 is because the sequence is now:
>
>     a.0_1 = a;
>     _2 = (unsigned int) a.0_1;
>     b = _2;
>     _6 = a.0_1 & 4294967295;
>     c_10 = _6;
>     if (c_10 != 0)
>       goto <bb 3>; [INV]
>
> We do not find _2 is non-zero on the outgoing edge because _2 is not
> related to the calculation in the condition (i.e. c_10 no longer has a
> dependency on _2).
>
> We do recalculate _2 based on the outgoing range of a.0_1, but with it
> being a 64 bit value and _2 being 32 bits, we only know the outgoing
> range of a.0_1 is non-zero... we don't track any of the upper bits:
>
>     2->3  (T) a.0_1 :     long int [-INF, -1][1, +INF]
>
> And when we recalculate _2 using that value, we still get varying because
> 0xFFFF0000 is not zero, but can still produce a zero in _2.
>
> The problem is that the condition c_10 != 0 is no longer related to the
> value of _2 in the IL, so ranger never sees it, and we can't represent
> the 2^16 subranges that end in [1,0xFFFF].
>
> Before that transformation,
>
>     _2 = (unsigned int) a.0_1;
>     b = _2;
>     c_10 = (long int) _2;
>
> The relationship is obvious, and ranger would relate the c_10 != 0 to _2
> no problem.

I see - too bad.  Note the transform made the dependence chain of
_6 one instruction shorter without increasing the number of
instructions, so it's a profitable transform.

Btw, the relation is still there but only indirectly via a.0_1.  The
old (E)VRP had this find_asserts(?) that produced assertions based on
the definitions - something that range-ops now does(?), so it would
eventually have built assertions for a.0_1 for both conditions and
allowed relations based on that?  I can't seem to find my way around
the VRP code now - pieces moved all over the place and so my mind
fails me on the searching task :/
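
For readers following along, the two forms discussed above can be sketched in plain C. This is only an illustration, not the PR's testcase: the function names are hypothetical, and int64_t/uint32_t stand in for long int/unsigned int on an LP64 target. It shows that the cast pair and the mask compute the same value, and that a 64-bit input can be non-zero while its low 32 bits are zero, which is why a range of "non-zero" on a.0_1 alone says nothing about _2.

```c
#include <stddef.h>
#include <stdint.h>

/* Zero-extend via two conversions, mirroring the IL before FRE1:
     _2 = (unsigned int) a.0_1;  c_10 = (long int) _2;
   (hypothetical helper, for illustration only) */
int64_t zext_via_casts(int64_t a)
{
    uint32_t low = (uint32_t)a;   /* truncation, plays the role of _2 */
    return (int64_t)low;          /* widening, plays the role of c_10 */
}

/* The same value via a single mask, mirroring the IL after FRE1:
     _6 = a.0_1 & 4294967295;
   (hypothetical helper, for illustration only) */
int64_t zext_via_mask(int64_t a)
{
    return a & INT64_C(0xFFFFFFFF);
}

/* Returns 1 if both forms agree on every sample value, 0 otherwise. */
int forms_agree(void)
{
    static const int64_t samples[] = {
        0, 1, -1, INT64_C(0xFFFF0000), INT64_C(0x100000000),
        INT64_MIN, INT64_MAX
    };
    for (size_t i = 0; i < sizeof samples / sizeof samples[0]; i++)
        if (zext_via_casts(samples[i]) != zext_via_mask(samples[i]))
            return 0;
    return 1;
}
```

Calling zext_via_casts(INT64_C(0x100000000)) gives 0 even though the argument is non-zero, so knowing only a.0_1 != 0 cannot prove _2 != 0; the information has to come from the masked value in the condition.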