Jakub Jelinek <ja...@redhat.com> writes: > Hi! > > The following testcase is miscompiled during evrp. > Before vrp, we have (from ccp): > # RANGE [irange] long long unsigned int [0, +INF] MASK 0xffffffffffffc000 > VALUE 0x2d > _3 = _2 + 18446744073708503085; > ... > # RANGE [irange] long long unsigned int [0, +INF] MASK 0xffffffffffffc000 > VALUE 0x59 > _6 = (long long unsigned int) _5; > # RANGE [irange] int [-INF, +INF] MASK 0xffffc000 VALUE 0x34 > _7 = k_11 + -1048524; > switch (_7) <default: <L5> [33.33%], case 8: <L7> [33.33%], case 24: <L6> > [33.33%], case 32: <L6> [33.33%]> > ... > # RANGE [irange] long long unsigned int [0, +INF] MASK 0xffffffffffffc07d > VALUE 0x0 > # i_20 = PHI <_3(4), 0(3), _6(2)> > and evrp is now trying to figure out range for i_20 in range_of_phi. > > All the ranges and MASK/VALUE pairs above are correct for the testcase, > k_11 and _2 based on it is a result of multiplication by a constant with low > 14 bits cleared and then some numbers are added to it. > > There is an obvious missed optimization for which I've filed PR119039, > simplify_switch_using_ranges could see that all the labels but default > are unreachable because the controlling expression has > MASK 0xffffc000 VALUE 0x34 and none of 8, 24 and 32 satisfy that. > > Anyway, during range_of_phi for i_20, we process the PHI arguments > in order. For the _3(4) case, we figure out that it is reachable > through the case 24: case 32: labels only of the switch and that > 0x34 - 0x2d is 7, so derive > [irange] long long unsigned int [17, 17][25, 25] MASK 0xffffffffffffc000 > VALUE 0x2d > (the MASK/VALUE just got inherited from the _3 earlier range). > Now (not suprisingly because those labels aren't actually reachable), > that range is inconsistent, 0x2d is 45, so there is conflict between the > values and the irange_bitmask. > value-range.{h,cc} code differentiates between actually stored > irange_bitmask, which is that MASK 0xffffffffffffc000 VALUE 0x2d, and > semantic bitmask, which is what get_bitmask returns. That is > // The mask inherent in the range is calculated on-demand. For > // example, [0,255] does not have known bits set by default. This > // saves us considerable time, because setting it at creation incurs > // a large penalty for irange::set. At the time of writing there > // was a 5% slowdown in VRP if we kept the mask precisely up to date > // at all times. Instead, we default to -1 and set it when > // explicitly requested. However, this function will always return > // the correct mask. > // > // This also means that the mask may have a finer granularity than > // the range and thus contradict it. Think of the mask as an > // enhancement to the range. For example: > // > // [3, 1000] MASK 0xfffffffe VALUE 0x0 > // > // 3 is in the range endpoints, but is excluded per the known 0 bits > // in the mask. > // > // See also the note in irange_bitmask::intersect. > irange_bitmask bm > = get_bitmask_from_range (type (), lower_bound (), upper_bound ()); > if (!m_bitmask.unknown_p ()) > bm.intersect (m_bitmask); > Now, get_bitmask_from_range here is MASK 0x1f VALUE 0x0 and it intersects > that with that MASK 0xffffffffffffc000 VALUE 0x2d. > Which triggers the ugly special case in irange_bitmask::intersect: > // If we have two known bits that are incompatible, the resulting > // bit is undefined. It is unclear whether we should set the entire > // range to UNDEFINED, or just a subset of it. For now, set the > // entire bitmask to unknown (VARYING). > if (wi::bit_and (~(m_mask | src.m_mask), > m_value ^ src.m_value) != 0) > { > unsigned prec = m_mask.get_precision (); > m_mask = wi::minus_one (prec); > m_value = wi::zero (prec); > } > so the semantic bitmask is actually MASK 0xffffffffffffffff VALUE 0x0. > > Next, range_of_phi attempts to union it with the 0(3) PHI argument, > and during irange::union_ first adds the [0,0] to the subranges, so > [irange] long long unsigned int [0, 0][17, 17][25, 25] MASK > 0xffffffffffffc000 VALUE 0x2d > and then goes on to irange::union_bitmask which does > if (m_bitmask == r.m_bitmask) > return false; > irange_bitmask bm = get_bitmask (); > irange_bitmask save = bm; > bm.union_ (r.get_bitmask ()); > if (save == bm) > return false; > m_bitmask = bm; > if (save == get_bitmask ()) > return false; > m_bitmask MASK 0xffffffffffffc000 VALUE 0x2d isn't the same as > r.m_bitmask MASK 0x0 VALUE 0x0, so we compute the semantic bitmask > (but note, not from the original range before union, but the modified one, > dunno if that isn't a problem as well), which is still the VARYING/unknown_p > one, union_ that with MASK 0x0 VALUE 0x0 and get still > MASK 0xffffffffffffffff VALUE 0x0, so don't update anything, the semantic > bitmask didn't change, so we are fine (not!, see later). > > Except then we try to union with the third PHI argument. And, because the > edge to that comes only from case 8: label and there is a known difference > between the two, the argument is actually already from earlier replaced by > 45(2) constant. So, irange::union_ adds the [45, 45] range to the list > of subranges, but voila, 45 is 0x2d and satisfies the stored > MASK 0xffffffffffffc000 VALUE 0x2d and so the semantic bitmask changed to > from MASK 0xffffffffffffffff VALUE 0x0 to MASK 0xffffffffffffc000 VALUE 0x2d > by that addition. Eventually, we just optimize this to > [irange] long long unsigned int [45, 45] because that is the only range > which satisfies the bitmask. And that is wrong, at runtime i_20 has > value 0. > > The following patch attempts to detect this case where get_bitmask > turns some non-VARYING m_bitmask into VARYING one because of a conflict > and in that case makes sure m_bitmask is actually updated rather than > unmodified, so that later union_ doesn't cause problems. > > I also wonder whether e.g. get_bitmask couldn't have special case for this > and if bm.intersect (m_bitmask); yields unknown_p from something not > originally unknown_p, perhaps chooses to just use get_bitmask_from_range > value and ignore the stored m_bitmask. Though, dunno how union_bitmask > in that case would figure out it needs to update m_bitmask. > > Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk > and 14.3 (after a while)?
LGTM. Thanks for doing a deep dive here. > > 2025-02-27 Jakub Jelinek <ja...@redhat.com> > > PR tree-optimization/118953 > * value-range.cc (irange::union_bitmask): Update m_bitmask if > get_bitmask () is unknown_p and m_bitmask is not even when the > semantic bitmask didn't change and returning false. > > * gcc.dg/torture/pr118953.c: New test. > > --- gcc/value-range.cc.jj 2025-01-02 11:23:25.118396777 +0100 > +++ gcc/value-range.cc 2025-02-26 18:49:10.713107905 +0100 > @@ -2447,7 +2447,7 @@ irange::union_bitmask (const irange &r) > irange_bitmask bm = get_bitmask (); > irange_bitmask save = bm; > bm.union_ (r.get_bitmask ()); > - if (save == bm) > + if (save == bm && (!bm.unknown_p () || m_bitmask.unknown_p ())) > return false; I wonder if it's better to swap the operands of &&, since the equality operator is more expensive than unknown_p. Either way, it's fine. Aldy > > m_bitmask = bm; > --- gcc/testsuite/gcc.dg/torture/pr118953.c.jj 2025-02-26 > 18:51:47.139936059 +0100 > +++ gcc/testsuite/gcc.dg/torture/pr118953.c 2025-02-26 18:52:33.078298359 > +0100 > @@ -0,0 +1,42 @@ > +/* PR tree-optimization/118953 */ > +/* { dg-do run { target int32plus } } */ > + > +int a, d; > +long long b, c; > + > +int > +foo (int f, int g, unsigned long long h, long long j) > +{ > + unsigned long long i = 0; > + if (g) > + switch (f) > + { > + case 8: > + i = b; > + break; > + case 6: > + i = c; > + break; > + } > + else > + switch (f) > + { > + case 8: > + i = h; > + break; > + case 24: > + case 32: > + i = j; > + break; > + } > + return i; > +} > + > +int > +main () > +{ > + int k = a * (409628 - 28); > + d = foo (k - 1048524, 0, k - 1048487, k - 1048531ULL); > + if (d) > + __builtin_abort (); > +} > > Jakub