On Tue, May 7, 2024 at 10:58 AM Stefan Schulze Frielinghaus <stefa...@linux.ibm.com> wrote:
>
> Ping.  Ok for mainline?

OK.

Thanks,
Richard.

> On Thu, Apr 25, 2024 at 09:26:45AM +0200, Stefan Schulze Frielinghaus wrote:
> > Bitcount operations popcount, clz, and ctz are emulated for narrow modes
> > in case an operation is only supported for wider modes.  Besides that, ctz
> > may be emulated via clz in expand_ctz.  Reflect this in
> > expression_expensive_p.
> >
> > I considered the emulation of ctz via clz as not expensive since this
> > basically reduces to ctz (x) = c - (clz (x & -x)) where c is the mode
> > precision minus 1 which should be faster than a loop.
> >
> > Bootstrapped and regtested on x86_64 and s390.  Though, this is probably
> > stage1 material?
> >
> > gcc/ChangeLog:
> >
> >     PR tree-optimization/110490
> >     * tree-scalar-evolution.cc (expression_expensive_p): Also
> >     consider mode widening for popcount, clz, and ctz.
> > ---
> >  gcc/tree-scalar-evolution.cc | 23 +++++++++++++++++++++++
> >  1 file changed, 23 insertions(+)
> >
> > diff --git a/gcc/tree-scalar-evolution.cc b/gcc/tree-scalar-evolution.cc
> > index b0a5e09a77c..622c7246c1b 100644
> > --- a/gcc/tree-scalar-evolution.cc
> > +++ b/gcc/tree-scalar-evolution.cc
> > @@ -3458,6 +3458,28 @@ bitcount_call:
> >                   && (optab_handler (optab, word_mode)
> >                       != CODE_FOR_nothing))
> >                 break;
> > +             /* If popcount is available for a wider mode, we emulate the
> > +                operation for a narrow mode by first zero-extending the value
> > +                and then computing popcount in the wider mode.  Analogue for
> > +                ctz.  For clz we do the same except that we additionally have
> > +                to subtract the difference of the mode precisions from the
> > +                result.  */
> > +             if (is_a <scalar_int_mode> (mode, &int_mode))
> > +               {
> > +                 machine_mode wider_mode_iter;
> > +                 FOR_EACH_WIDER_MODE (wider_mode_iter, mode)
> > +                   if (optab_handler (optab, wider_mode_iter)
> > +                       != CODE_FOR_nothing)
> > +                     goto check_call_args;
> > +                 /* Operation ctz may be emulated via clz in expand_ctz.
> > +                    */
> > +                 if (optab == ctz_optab)
> > +                   {
> > +                     FOR_EACH_WIDER_MODE_FROM (wider_mode_iter, mode)
> > +                       if (optab_handler (clz_optab, wider_mode_iter)
> > +                           != CODE_FOR_nothing)
> > +                         goto check_call_args;
> > +                   }
> > +               }
> >               return true;
> >             }
> >           break;
> > @@ -3469,6 +3491,7 @@ bitcount_call:
> >        break;
> >      }
> >
> > +check_call_args:
> >    FOR_EACH_CALL_EXPR_ARG (arg, iter, expr)
> >      if (expression_expensive_p (arg, cond_overflow_p, cache, op_cost))
> >        return true;
> > --
> > 2.44.0
> >
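
For illustration only, the emulations the patch treats as cheap can be sketched in plain C. This is not GCC's code (expand_ctz and the widening paths operate on RTL); all function names below are hypothetical, fixed 8-bit and 32-bit widths stand in for the narrow and wider modes, and the ref_* loops merely stand in for a wider-mode instruction. The sketch assumes non-zero inputs for clz and ctz.

```c
#include <assert.h>
#include <stdint.h>

/* Reference loops standing in for a 32-bit hardware instruction.
   Hypothetical helpers; clz and ctz require x != 0.  */
static unsigned ref_clz32 (uint32_t x)
{
  unsigned n = 0;
  while (!(x & 0x80000000u)) { x <<= 1; n++; }
  return n;
}

static unsigned ref_ctz32 (uint32_t x)
{
  unsigned n = 0;
  while (!(x & 1u)) { x >>= 1; n++; }
  return n;
}

static unsigned ref_popcount32 (uint32_t x)
{
  unsigned n = 0;
  for (; x; x &= x - 1) n++;   /* clear the lowest set bit each round */
  return n;
}

/* ctz via clz: x & -x isolates the lowest set bit, so
   ctz (x) = c - clz (x & -x) with c = precision - 1 = 31.  */
static unsigned ctz32_via_clz (uint32_t x)
{
  return 31u - ref_clz32 (x & (0u - x));
}

/* Narrow (8-bit) operations emulated in the wider (32-bit) width.
   Zero-extension leaves popcount and ctz unchanged; clz must subtract
   the difference of the precisions (32 - 8).  */
static unsigned popcount8_via_wide (uint8_t x)
{
  return ref_popcount32 ((uint32_t) x);
}

static unsigned ctz8_via_wide (uint8_t x)
{
  return ref_ctz32 ((uint32_t) x);
}

static unsigned clz8_via_wide (uint8_t x)
{
  return ref_clz32 ((uint32_t) x) - (32u - 8u);
}

/* Cross-check the clz-based ctz against the reference loop.  */
static void check_identities (void)
{
  for (uint32_t x = 1; x < 256; x++)
    {
      assert (ctz32_via_clz (x) == ref_ctz32 (x));
      assert (ctz32_via_clz (x << 24) == ref_ctz32 (x << 24));
    }
}
```

Each emulation is a constant number of operations on top of one wider-mode instruction, which is the reason given above for not treating it as expensive.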