Richard Biener <rguent...@suse.de> writes:
> On Fri, 18 Oct 2024, Richard Sandiford wrote:
>
>> This patch extends get_nonzero_bits to handle POLY_INT_CSTs.
>> The easiest (but also most useful) case is that the number
>> of trailing zeros in the runtime value is at least the number
>> of trailing zeros in each individual component.
>>
>> In principle, we could do this for coeffs 1 and above only,
>> and then OR in coeff 0.  This would give ~0x11 for [14, 32], say.
>> But that's future work.
>>
>> gcc/
>>      * tree-ssanames.cc (get_nonzero_bits): Handle POLY_INT_CSTs.
>>      * match.pd (with_possible_nonzero_bits): Likewise.
>>
>> gcc/testsuite/
>>      * gcc.target/aarch64/sve/cnt_fold_4.c: New test.
>> ---
>>  gcc/match.pd                                  |  2 +
>>  .../gcc.target/aarch64/sve/cnt_fold_4.c       | 61 +++++++++++++++++++
>>  gcc/tree-ssanames.cc                          |  3 +
>>  3 files changed, 66 insertions(+)
>>  create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/cnt_fold_4.c
>>
>> diff --git a/gcc/match.pd b/gcc/match.pd
>> index 540582dc984..41903554478 100644
>> --- a/gcc/match.pd
>> +++ b/gcc/match.pd
>> @@ -2893,6 +2893,8 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
>>       possibly set.  */
>>  (match with_possible_nonzero_bits
>>   INTEGER_CST@0)
>> +(match with_possible_nonzero_bits
>> + POLY_INT_CST@0)
>>  (match with_possible_nonzero_bits
>>   SSA_NAME@0
>>   (if (INTEGRAL_TYPE_P (TREE_TYPE (@0)) || POINTER_TYPE_P (TREE_TYPE (@0)))))
>> diff --git a/gcc/testsuite/gcc.target/aarch64/sve/cnt_fold_4.c
>> b/gcc/testsuite/gcc.target/aarch64/sve/cnt_fold_4.c
>> new file mode 100644
>> index 00000000000..b7a53701993
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.target/aarch64/sve/cnt_fold_4.c
>> @@ -0,0 +1,61 @@
>> +/* { dg-do compile } */
>> +/* { dg-options "-O2" } */
>> +/* { dg-final { check-function-bodies "**" "" } } */
>> +
>> +#include <arm_sve.h>
>> +
>> +/*
>> +** f1:
>> +**     cnth    x0
>> +**     ret
>> +*/
>> +uint64_t
>> +f1 ()
>> +{
>> +  uint64_t x = svcntw ();
>> +  x >>= 2;
>> +  return x << 3;
>> +}
>> +
>> +/*
>> +** f2:
>> +**     [^\n]+
>> +**     [^\n]+
>> +**     ...
>> +**     ret
>> +*/
>> +uint64_t
>> +f2 ()
>> +{
>> +  uint64_t x = svcntd ();
>> +  x >>= 2;
>> +  return x << 3;
>> +}
>> +
>> +/*
>> +** f3:
>> +**     cntb    x0, all, mul #4
>> +**     ret
>> +*/
>> +uint64_t
>> +f3 ()
>> +{
>> +  uint64_t x = svcntd ();
>> +  x >>= 1;
>> +  return x << 6;
>> +}
>> +
>> +/*
>> +** f4:
>> +**     [^\n]+
>> +**     [^\n]+
>> +**     ...
>> +**     ret
>> +*/
>> +uint64_t
>> +f4 ()
>> +{
>> +  uint64_t x = svcntd ();
>> +  x >>= 2;
>> +  return x << 2;
>> +}
>> diff --git a/gcc/tree-ssanames.cc b/gcc/tree-ssanames.cc
>> index 4f83fcbb517..d2d1ec18797 100644
>> --- a/gcc/tree-ssanames.cc
>> +++ b/gcc/tree-ssanames.cc
>> @@ -505,6 +505,9 @@ get_nonzero_bits (const_tree name)
>>    /* Use element_precision instead of TYPE_PRECISION so complex and
>>       vector types get a non-zero precision.  */
>>    unsigned int precision = element_precision (TREE_TYPE (name));
>> +  if (POLY_INT_CST_P (name))
>> +    return -known_alignment (wi::to_poly_wide (name));
>> +
>
> Since you don't need precision can you move this right after the
> INTEGER_CST handling?

Oops, yes.  An earlier cut did use the precision, but I forgot to move
it to a more sensible place when changing it.

Thanks for the reviews.  I've pushed parts 1, 2, and 4-9 with the changes
suggested.  Part 3 needs more work, so I'll do that separately.

Richard
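
For anyone following the thread, the arithmetic behind the patch can be sketched outside GCC.  This is a minimal standalone sketch, not GCC code: the function names below are hypothetical, and a POLY_INT_CST is modelled as a plain array of unsigned 64-bit coefficients.  It assumes that the known alignment is the lowest set bit of the OR of all coefficients, which is one way to read "the number of trailing zeros in each individual component" in the commit message.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical model: the runtime value is
//   coeffs[0] + coeffs[1] * x1 + ... for unknown nonnegative x1, ...
// The largest power of two known to divide every such value is the
// lowest set bit of the OR of all coefficients (0 if all are zero).
uint64_t known_alignment_sketch (const uint64_t *coeffs, size_t n)
{
  uint64_t all = 0;
  for (size_t i = 0; i < n; ++i)
    all |= coeffs[i];
  return all & (~all + 1);   // isolate the lowest set bit
}

// Mask of bits that may be nonzero: everything at or above the known
// alignment.  Negating a power of two sets that bit and all higher bits.
uint64_t nonzero_bits_sketch (const uint64_t *coeffs, size_t n)
{
  return ~known_alignment_sketch (coeffs, n) + 1;
}

// The "future work" refinement described in the commit message: take the
// alignment of coeffs 1 and above only, then OR in coeff 0.
uint64_t nonzero_bits_refined_sketch (const uint64_t *coeffs, size_t n)
{
  assert (n >= 1);
  return (~known_alignment_sketch (coeffs + 1, n - 1) + 1) | coeffs[0];
}
```

With coefficients [14, 32], nonzero_bits_sketch gives ~0x1 and nonzero_bits_refined_sketch gives ~0x11, the figure quoted in the commit message.  Assuming svcntw () is modelled as [4, 4], the mask is ~0x3, so the low two bits are known to be zero and the ">> 2" followed by "<< 3" in f1 loses nothing.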