On Tue, May 7, 2024 at 10:58 AM Stefan Schulze Frielinghaus
<stefa...@linux.ibm.com> wrote:
>
> Ping.  Ok for mainline?

OK.

Thanks,
Richard.

> On Thu, Apr 25, 2024 at 09:26:45AM +0200, Stefan Schulze Frielinghaus wrote:
> > Bitcount operations popcount, clz, and ctz are emulated for narrow modes
> > in case an operation is only supported for wider modes.  Besides that, ctz
> > may be emulated via clz in expand_ctz.  Reflect this in
> > expression_expensive_p.
> >
> > I considered the emulation of ctz via clz as not expensive since this
> > basically reduces to ctz (x) = c - clz (x & -x), where c is the mode
> > precision minus 1, which should be faster than a loop.
> >
> > Bootstrapped and regtested on x86_64 and s390.  Though, this is probably
> > stage1 material?
> >
> > gcc/ChangeLog:
> >
> >       PR tree-optimization/110490
> >       * tree-scalar-evolution.cc (expression_expensive_p): Also
> >       consider mode widening for popcount, clz, and ctz.
> > ---
> >  gcc/tree-scalar-evolution.cc | 23 +++++++++++++++++++++++
> >  1 file changed, 23 insertions(+)
> >
> > diff --git a/gcc/tree-scalar-evolution.cc b/gcc/tree-scalar-evolution.cc
> > index b0a5e09a77c..622c7246c1b 100644
> > --- a/gcc/tree-scalar-evolution.cc
> > +++ b/gcc/tree-scalar-evolution.cc
> > @@ -3458,6 +3458,28 @@ bitcount_call:
> >                 && (optab_handler (optab, word_mode)
> >                     != CODE_FOR_nothing))
> >                 break;
> > +           /* If popcount is available for a wider mode, we emulate the
> > +              operation for a narrow mode by first zero-extending the value
> > +              and then computing popcount in the wider mode.  Analogously for
> > +              ctz.  For clz we do the same except that we additionally have
> > +              to subtract the difference of the mode precisions from the
> > +              result.  */
> > +           if (is_a <scalar_int_mode> (mode, &int_mode))
> > +             {
> > +               machine_mode wider_mode_iter;
> > +               FOR_EACH_WIDER_MODE (wider_mode_iter, mode)
> > +                 if (optab_handler (optab, wider_mode_iter)
> > +                     != CODE_FOR_nothing)
> > +                   goto check_call_args;
> > +               /* Operation ctz may be emulated via clz in expand_ctz.  */
> > +               if (optab == ctz_optab)
> > +                 {
> > +                   FOR_EACH_WIDER_MODE_FROM (wider_mode_iter, mode)
> > +                     if (optab_handler (clz_optab, wider_mode_iter)
> > +                         != CODE_FOR_nothing)
> > +                       goto check_call_args;
> > +                 }
> > +             }
> >             return true;
> >           }
> >         break;
> > @@ -3469,6 +3491,7 @@ bitcount_call:
> >         break;
> >       }
> >
> > +check_call_args:
> >        FOR_EACH_CALL_EXPR_ARG (arg, iter, expr)
> >       if (expression_expensive_p (arg, cond_overflow_p, cache, op_cost))
> >         return true;
> > --
> > 2.44.0
> >
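
For readers following along, the emulations described in the patch can be
sketched in plain C.  This is only an illustration and not the code GCC
generates: the helper names are made up, it assumes unsigned int is 32 bits
wide so that the __builtin_* functions stand in for a 32-bit "wider mode",
and all clz/ctz helpers are undefined for a zero argument, like the
builtins themselves.

```c
#include <assert.h>
#include <stdint.h>

/* ctz via clz: x & -x isolates the lowest set bit, so for a 32-bit value
   ctz (x) = 31 - clz (x & -x).  Hypothetical helper; x must be nonzero.  */
static unsigned ctz32_via_clz (uint32_t x)
{
  uint32_t lowest = x & (0u - x);
  return 31u - (unsigned) __builtin_clz (lowest);
}

/* Narrow-mode popcount: zero-extend to 32 bits and count there.  */
static unsigned popcount8_via_32 (uint8_t x)
{
  return (unsigned) __builtin_popcount ((uint32_t) x);
}

/* Narrow-mode ctz: zero-extension does not change the position of the
   lowest set bit, so no correction is needed.  x must be nonzero.  */
static unsigned ctz8_via_32 (uint8_t x)
{
  return (unsigned) __builtin_ctz ((uint32_t) x);
}

/* Narrow-mode clz: zero-extension adds 32 - 8 leading zeros, which have
   to be subtracted from the result again.  x must be nonzero.  */
static unsigned clz8_via_32 (uint8_t x)
{
  return (unsigned) __builtin_clz ((uint32_t) x) - (32u - 8u);
}

/* Loop-based reference implementations used only for checking.  */
static unsigned ctz_loop (uint32_t x)
{
  unsigned n = 0;
  while (!(x & 1u))
    {
      x >>= 1;
      n++;
    }
  return n;
}

static unsigned clz8_loop (uint8_t x)
{
  unsigned n = 0;
  while (!(x & 0x80u))
    {
      x = (uint8_t) (x << 1);
      n++;
    }
  return n;
}

/* Compare the emulations against the loops for every nonzero 8-bit value.  */
static void self_check (void)
{
  for (unsigned v = 1; v < 256; v++)
    {
      assert (ctz32_via_clz (v) == ctz_loop (v));
      assert (ctz8_via_32 ((uint8_t) v) == ctz_loop (v));
      assert (clz8_via_32 ((uint8_t) v) == clz8_loop ((uint8_t) v));
    }
}
```

Each emulation is a handful of straight-line operations, which is the
reason the patch treats it as not expensive compared to a loop.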
