On Thu, Oct 26, 2023 at 8:30 PM Andrew Pinski <pins...@gmail.com> wrote:
>
> On Thu, Oct 26, 2023 at 2:29 AM Richard Biener
> <richard.guent...@gmail.com> wrote:
> >
> > On Wed, Oct 25, 2023 at 5:51 AM Andrew Pinski <pins...@gmail.com> wrote:
> > >
> > > I noticed we were missing optimizing `a / (1 << b)` when
> > > we know that a is nonnegative but only due to ranger information.
> > > This adds the use of the global ranger to tree_single_nonnegative_warnv_p
> > > for SSA_NAME.
> > > I didn't extend tree_single_nonnegative_warnv_p to use the ranger for 
> > > floating
> > > point nor to use the local ranger since I am not 100% sure it is safe 
> > > where
> > > all of the uses tree_expr_nonnegative_p would be safe.
> > >
> > > Note pr80776-1.c testcase fails again due to vrp's bad handling of setting
> > > global ranges from __builtin_unreachable. It just happened to be optimized
> > > before due to global ranges not being used as much.
> > >
> > > Bootstrapped and tested on x86_64-linux-gnu with no regressions.
> > >
> > >         PR tree-optimization/111959
> > >
> > > gcc/ChangeLog:
> > >
> > >         * fold-const.cc (tree_single_nonnegative_warnv_p): Use
> > >         the global range to see if the SSA_NAME was nonnegative.
> > >
> > > gcc/testsuite/ChangeLog:
> > >
> > >         * gcc.dg/tree-ssa/forwprop-42.c: New test.
> > >         * gcc.dg/pr80776-1.c: xfail and update comment.
> > > ---
> > >  gcc/fold-const.cc                           | 36 +++++++++++++++------
> > >  gcc/testsuite/gcc.dg/pr80776-1.c            |  8 ++---
> > >  gcc/testsuite/gcc.dg/tree-ssa/forwprop-42.c | 15 +++++++++
> > >  3 files changed, 46 insertions(+), 13 deletions(-)
> > >  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/forwprop-42.c
> > >
> > > diff --git a/gcc/fold-const.cc b/gcc/fold-const.cc
> > > index 40767736389..2a2a90230f5 100644
> > > --- a/gcc/fold-const.cc
> > > +++ b/gcc/fold-const.cc
> > > @@ -15047,15 +15047,33 @@ tree_single_nonnegative_warnv_p (tree t, bool 
> > > *strict_overflow_p, int depth)
> > >        return RECURSE (TREE_OPERAND (t, 1)) && RECURSE (TREE_OPERAND (t, 
> > > 2));
> > >
> > >      case SSA_NAME:
> > > -      /* Limit the depth of recursion to avoid quadratic behavior.
> > > -        This is expected to catch almost all occurrences in practice.
> > > -        If this code misses important cases that unbounded recursion
> > > -        would not, passes that need this information could be revised
> > > -        to provide it through dataflow propagation.  */
> > > -      return (!name_registered_for_update_p (t)
> > > -             && depth < param_max_ssa_name_query_depth
> > > -             && gimple_stmt_nonnegative_warnv_p (SSA_NAME_DEF_STMT (t),
> > > -                                                 strict_overflow_p, 
> > > depth));
> > > +      {
> > > +       /* For integral types, querry the global range if possible. */
> >
> > query
> >
> > > +       if (INTEGRAL_TYPE_P (TREE_TYPE (t)))
> > > +         {
> > > +           value_range vr;
> > > +           if (get_global_range_query ()->range_of_expr (vr, t)
> > > +               && !vr.varying_p () && !vr.undefined_p ())
> > > +             {
> > > +               /* If the range is nonnegative, return true. */
> > > +               if (vr.nonnegative_p ())
> > > +                 return true;
> > > +
> > > +               /* If the range is non-positive, then return false. */
> > > +               if (vr.nonpositive_p ())
> > > +                 return false;
> >
> > That's testing for <= 0, nonnegative for >= 0.  This means when
> > vr.nonpositive_p () the value could still be zero (and nonnegative),
> > possibly be figured out by the recursion below.
> >
> > Since we don't have negative_p () do we want to test
> > nonpositive_p () && nonzero_p () instead?
>
> I was thinking about that when I was writing the patch.
> If the ranger figured out the value was zero, nonnegative_p would have
> returned true.
> So while yes nonpositive_p() would return true but we already checked
> nonnegative_p beforehand and the nonzero_p would not matter.
> Now the question is if after nonnegative_p we check if the range could
> contain 0 still is that worth the recursion.

Yes, that was the point.

> The whole idea of
> returning false was to remove the need from recursion as much.

Well, specifically the exact zero case is something the code is reasonably
good at.  If we want to remove recursion as much as possible we'd
remove it completely.

Maybe you can do some statistics how many hits the recursion yields
after the vr.nonnegative_p () out (but without the nonpositive_p one)?

Richard.

> Thanks,
> Andrew
>
>
> >
> > OK with that change.
> >
> > Richard.
> >
> > > +             }
> > > +         }
> > > +       /* Limit the depth of recursion to avoid quadratic behavior.
> > > +          This is expected to catch almost all occurrences in practice.
> > > +          If this code misses important cases that unbounded recursion
> > > +          would not, passes that need this information could be revised
> > > +          to provide it through dataflow propagation.  */
> > > +       return (!name_registered_for_update_p (t)
> > > +               && depth < param_max_ssa_name_query_depth
> > > +               && gimple_stmt_nonnegative_warnv_p (SSA_NAME_DEF_STMT (t),
> > > +                                                   strict_overflow_p, 
> > > depth));
> > > +      }
> > >
> > >      default:
> > >        return tree_simple_nonnegative_warnv_p (TREE_CODE (t), TREE_TYPE 
> > > (t));
> > > diff --git a/gcc/testsuite/gcc.dg/pr80776-1.c 
> > > b/gcc/testsuite/gcc.dg/pr80776-1.c
> > > index b9bce62d982..f3d47aeda36 100644
> > > --- a/gcc/testsuite/gcc.dg/pr80776-1.c
> > > +++ b/gcc/testsuite/gcc.dg/pr80776-1.c
> > > @@ -18,14 +18,14 @@ Foo (void)
> > >    if (! (0 <= i && i <= 999999))
> > >      __builtin_unreachable ();
> > >
> > > -  /* Legacy evrp sets the range of i to [0, MAX] *before* the first 
> > > conditional,
> > > +  /* vrp1 sets the range of i to [0, MAX] *before* the first conditional,
> > >       and to [0,999999] *before* the second conditional.  This is because 
> > > both
> > > -     evrp and VRP use trickery to set global ranges when this particular 
> > > use of
> > > +     vrp use trickery to set global ranges when this particular use of
> > >       a __builtin_unreachable is in play (see uses of
> > >       assert_unreachable_fallthru_edge_p).
> > >
> > > -     Setting these ranges at the definition site, causes VRP to remove 
> > > the
> > > +     Setting these ranges at the definition site, causes other passes to 
> > > remove the
> > >       unreachable code altogether, leaving the following sprintf 
> > > unguarded.  This
> > >       causes the bogus warning below.  */
> > > -  sprintf (number, "%d", i); /* { dg-bogus "writing" "" } */
> > > +  sprintf (number, "%d", i); /* { dg-bogus "writing" "" { xfail *-*-* } 
> > > } */
> > >  }
> > > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/forwprop-42.c 
> > > b/gcc/testsuite/gcc.dg/tree-ssa/forwprop-42.c
> > > new file mode 100644
> > > index 00000000000..4e5421ed4d4
> > > --- /dev/null
> > > +++ b/gcc/testsuite/gcc.dg/tree-ssa/forwprop-42.c
> > > @@ -0,0 +1,15 @@
> > > +/* { dg-do compile } */
> > > +/* { dg-options "-O2 -fdump-tree-optimized" } */
> > > +/* PR tree-optimization/111959 */
> > > +
> > > +int divbypow2(int a, int b)
> > > +{
> > > +  if (a & ~0xff) __builtin_unreachable();
> > > +  return a / (1<<b);
> > > +}
> > > +
> > > +/* divbypow2 should be able to optimize to just a/b as a is known to be 
> > > always positive. */
> > > +/* { dg-final { scan-tree-dump-not " / " "optimized" } } */
> > > +/* { dg-final { scan-tree-dump-not " << " "optimized" } } */
> > > +/* { dg-final { scan-tree-dump-times " >> " 1 "optimized" } } */
> > > +
> > > --
> > > 2.39.3
> > >

Reply via email to