On 3/1/23 11:12, Richard Biener wrote:
On Wed, 1 Mar 2023, Aldy Hernandez wrote:



On 2/28/23 10:41, Richard Biener wrote:
On Mon, 27 Feb 2023, Aldy Hernandez wrote:



On 2/27/23 14:58, Richard Biener wrote:
After fixing PR107561 the following avoids looking at VR_ANTI_RANGE
ranges where it doesn't seem obvious the code does the correct
thing here (lower_bound and upper_bound do not work as expected).

I do realize there's some confusion here, and some of it is my fault. This
has
become obvious in my upcoming work removing all of legacy.

What's going on is that ultimately min/max are broken with (non-legacy)
iranges.  Or at the very least inconsistent between legacy and non-legacy.
These are left over from the legacy world, and have been marked DEPRECATED
for
a few releases, but the middle end warnings continued to use them and even
added new uses after they were obsoleted.

min/max have different meanings depending on kind(), which is also
deprecated,
btw.  They are the underlying min/max fields from the legacy
implementation,
and thus somewhat leak the implementation details. Unfortunately, they are
being called from non-legacy code which is ignoring the kind() field.

In retrospect I should've converted everything away from min/max/kind years
ago, or at the very least converted min/max to work with non-legacy more
consistently.

For the record:

    enum value_range_kind kind () const;                // DEPRECATED
    tree min () const;                          // DEPRECATED
    tree max () const;                          // DEPRECATED
    bool symbolic_p () const;                   // DEPRECATED
    bool constant_p () const;                   // DEPRECATED
    void normalize_symbolics ();                        // DEPRECATED
    void normalize_addresses ();                        // DEPRECATED
    bool may_contain_p (tree) const;            // DEPRECATED
    bool legacy_verbose_union_ (const class irange *);  // DEPRECATED
    bool legacy_verbose_intersect (const irange *);     // DEPRECATED

In my local branch I tried converting all the middle-end legacy API uses to
the new API, but a bunch of tests started failing, and lots more false
positives started showing up in correct code.  I suspect that's part of the
reason legacy continued to be used in these passes :-/.  As you point out
in
the PR, the tests seem designed to test the current (at times broken)
implementation.

That being said, the 5 fixes in your patch are not wrong to begin with,
because all uses guard lower_bound() and upper_bound() which work
correctly.
They return the lowest and uppermost possible bound for the range (ignoring
the underlying implementation).  So, the lower bound of a signed non-zero
is
-INF because ~[0,0] is really [-INF,-1][1,+INF]. In the min/max legacy
code,
min() of signed non-zero (~[0,0]) is 0.  The new implementation has no
concept
of anti-ranges, and we don't leak any of that out.

Any uses of min/max without looking at kind() are definitely broken. OTOH
uses
of lower/upper_bound are fine and should work with both legacy and
non-legacy.

Unrelated, but one place where I haven't been able to convince myself that
the
use is correct is bounds_of_var_in_loop:

/* Check if init + nit * step overflows.  Though we checked
         scev {init, step}_loop doesn't wrap, it is not enough
         because the loop may exit immediately.  Overflow could
         happen in the plus expression in this case.  */
      if ((dir == EV_DIR_DECREASES
           && compare_values (maxvr.min (), initvr.min ()) != -1)
          || (dir == EV_DIR_GROWS
       && compare_values (maxvr.max (), initvr.max ()) != 1))

Ughh...this is all slated to go away, and I have patches removing all of
legacy and the old API.

Does this help?  Do you still think lower and upper bound are not working
as
expected?

lower_bound/upper_bound work as expected,
tree_lower_bound/tree_upper_bound do not.  I've checked and all uses
I "fixed" use lower_bound/upper_boud.

tree_lower_bound & friends are 'protected', but in the above light the
comment

    // potential promotion to public?

looks dangerous (I was considering using them ...).

First of all, my apologies for the time you've spent on this.  This needed
cleaning up a few releases ago, but with legacy in place, it kept getting
pushed further away.  This is my first order of business once stage1 opens.
That being said, thank you for spot checking this.

It looks like tree_lower_bound was just syntactic sugar for m_base[pair * 2].
The comment is likely wrong.  It should've stayed protected, or nuked in favor
of accessing m_base directly to avoid confusion.


I'm dropping the patch, it's probably time to work on getting rid of
'value_range' uses (aka legacy_mode_p ()) and/or replace raw accesses
to the min/max for that mode with m_base accesses?  At least
that there's tree_{upper,lower}_bound for pair != 0 suggests this
function is used with differing semantics internally which is a

My local tree has a few dozen patches removing legacy and converting the trees
to wide_int.  In my work, legacy_mode_p, min, max, kind, tree_lower_bound, etc
are all gone.  I can make the branch public if you like.

source of confusion (to me at least).  tree_{lower,upper}_bound
should be able to assert it's not a legacy mode range and
min/max should be able to assert that it is.

Agreed.  One possible problem *may be* vr-values.cc which still uses min() and
max() throughout can be called on legacy as well as non-legacy.

This unfortunately means that since kind/min/max will return different results
for legacy/non-legacy (kind() never returns VR_ANTI_RANGE for non-legacy),
simplify_using_ranges can theoretically give different results depending on
how it's called.  In practice, only DOM/VRP call simplify_using_ranges and
always with non-legacy.  However, simplify_using_ranges uses get_value_range()
internally which uses legacy.  Sorry, I just realized this now while auditing
the code.

I honestly have no idea if this matters in practice, and since this has been
this way for a few releases, I'm afraid of touching it so late in the cycle.
But I'm happy to help if you think it must be fixed in this release.

Hmmmm, it looks like there are just a few non-legacy uses in vr-values.cc
(int_range<2> and int_range_max), and never in code paths checking
min/max/kind.  So your suggestions about the asserts may be all we need,
dunno.


Btw, legacy_upper_bound looks wrong:

    if (m_kind == VR_ANTI_RANGE)
      {
        tree typ = type (), t;
        if (pair == 1 || vrp_val_is_min (min ()))
          t = vrp_val_max (typ);
        else
          t = wide_int_to_tree (typ, wi::to_wide (min ()) - 1);

not sure what the pair == 1 case is about, but the upper bound
of an anti-range is only special when max() is vrp_val_is_max,
aka ~[low, +INF] where it then is low - 1, but the above checks
for ~[-INF, high] and returns +INF in that case.  That's correct
unless high == +INF at least, but the else case is wrong for
all high != +INF?

An anti-range is really two pairs.  So, ~[6,9] is really:

[-MIN, 5][10, MAX]

Pair 0 is [-MIN, 5] and pair 1 is [10, MAX].  This means that the upper bound
for the pair==1 case is just MAX.


I think it should be

        if (pair == 1 || !vrp_val_is_max (max ()))
          t = vrp_val_max (typ);
        else
          t = wide_int_to_tree (typ, wi::to_wide (min ()) - 1);

no?

I think both of us are wrong.  The original vrp_val_is_min(min()) call will
always return false.  And your !vrp_val_is_max(max()) is always true.  This is
because irange::set() will normalize anti-ranges into a VR_RANGE when
possible, in order to avoid multiple representations for the same range:

       else if (is_min)
         {
          tree one = build_int_cst (TREE_TYPE (max), 1);
          min = int_const_binop (PLUS_EXPR, max, one);
          max = vrp_val_max (TREE_TYPE (max));
          kind = VR_RANGE;
         }
       else if (is_max)
         {
          tree one = build_int_cst (TREE_TYPE (min), 1);
          max = int_const_binop (MINUS_EXPR, min, one);
          min = vrp_val_min (TREE_TYPE (min));
          kind = VR_RANGE;
         }

So ~[-MIN,5] is represented as [6,MAX] and ~[5,+MAX] as [-MIN,4].

Ughhh... so it looks like legacy_upper_bound is correct, but the
vrp_val_is_min() check is useless.

Am I missing something?

Well, you are missing that VR_ANTI_RANGE only has a single pair in
m_base[], so it's _not_ represented as [-MIN, 5][10, MAX].  So you
say that for VR_ANTI_RANGE we should never ask for legacy_upper_bound (0)
but always legaacy_upper_bound (1)?

That is actually on purpose, so lower_bound() and upper_bound() work correctly from both legacy and non-legacy. The user shouldn't care whether the range is internally implemented with VR_ANTI_RANGE or not. The ultimate goal is to only use lower_bound and upper_bound. Even kind() will be gone from the API.

Imagine ~[6,9] which to the user is:

[-MIN, 5][10, MAX]

legacy_upper_bound(0) is 5
legacy_upper_bound(1) is MAX

Think of it in terms of what the user, with no implementation details, would understand as a range.


Btw, I didn't see in ::verify where we verify proper normalization
so after some range ops it might be in non normalized form.

range-ops uses the irange API, so ranges coming out of it should be canonicalized. If they're not. That's a bug and should be fixed.


That said - as this is all going away tampering with it looks less
useful than ripping it out ...

I can share my work if you like. But yes, it's all going away very soon after the start of stage1. I'm sitting on 35 small incremental patches removing legacy, etc. The only hold out are a handful of uses in the middle-end warnings for which I have provided get_legacy_range() which returns kind/min/max as legacy currently does. This is because the middle end warnings break when "fixed" correctly, and honestly I don't understand them to fix them properly.


Anyway, see the series I posted which I ended up with in understanding
the code.  Appearantly I misunderstood the intent of the current state,
mainly because of lack of documentation I guess :/

Sorry, I'm not remotely as good as you at reviewing patches. I'm on it, but it's slow going ;-).

That being said, I'll go on the record saying that I'm on documentation writing duty later this year. It's long overdue.


I do wonder if s/value_range/int_range<2>/ throughout most code is
the correct thing to do and what the actual hurdles are in the end.

Indeed. I've also done that in my branch, while fixing a few warts. For that matter, with legacy gone, we can repurpose int_range<1> for a simple, two-end range (less memory too).

Aldy

Reply via email to