On Mon, 21 Aug 2023 at 12:26, Richard Biener <rguent...@suse.de> wrote: > > On Sat, 19 Aug 2023, Prathamesh Kulkarni wrote: > > > On Fri, 18 Aug 2023 at 14:52, Richard Biener <rguent...@suse.de> wrote: > > > > > > On Fri, 18 Aug 2023, Richard Sandiford wrote: > > > > > > > Richard Biener <rguent...@suse.de> writes: > > > > > The following avoids running into somehow flawed logic in > > > > > fold_vec_perm > > > > > for non-VLA vectors. > > > > > > > > > > Bootstrap & regtest running on x86_64-unknown-linux-gnu. > > > > > > > > > > Richard. > > > > > > > > > > PR tree-optimization/111048 > > > > > * fold-const.cc (fold_vec_perm_cst): Check for non-VLA > > > > > vectors first. > > > > > > > > > > * gcc.dg/torture/pr111048.c: New testcase. > > > > > > > > Please don't do this as a permanent thing. It was a deliberate choice > > > > to have the is_constant be the fallback, so that the "generic" (VLA+VLS) > > > > logic gets more coverage. Like you say, if something is wrong for VLS > > > > then the chances are that it's also wrong for VLA. > > > > > > Sure, feel free to undo this change together with the fix for the > > > VLA case. > > Hi, > > The attached patch reverts the workaround, and fixes the issue. > > Bootstrapped+tested on aarch64-linux-gnu with and without SVE, and > > x64_64-linux-gnu. > > OK to commit ? > > OK. Thanks, committed to trunk in 649388462e9a3c2de0b90ce525de8044704cc521
Thanks, Prathamesh > > > Thanks, > > Prathamesh > > > > > > Richard. > > > > > > > Thanks, > > > > Richard > > > > > > > > > > > > > --- > > > > > gcc/fold-const.cc | 12 ++++++------ > > > > > gcc/testsuite/gcc.dg/torture/pr111048.c | 24 ++++++++++++++++++++++++ > > > > > 2 files changed, 30 insertions(+), 6 deletions(-) > > > > > create mode 100644 gcc/testsuite/gcc.dg/torture/pr111048.c > > > > > > > > > > diff --git a/gcc/fold-const.cc b/gcc/fold-const.cc > > > > > index 5c51c9d91be..144fd7481b3 100644 > > > > > --- a/gcc/fold-const.cc > > > > > +++ b/gcc/fold-const.cc > > > > > @@ -10625,6 +10625,11 @@ fold_vec_perm_cst (tree type, tree arg0, > > > > > tree arg1, const vec_perm_indices &sel, > > > > > unsigned res_npatterns, res_nelts_per_pattern; > > > > > unsigned HOST_WIDE_INT res_nelts; > > > > > > > > > > + if (TYPE_VECTOR_SUBPARTS (type).is_constant (&res_nelts)) > > > > > + { > > > > > + res_npatterns = res_nelts; > > > > > + res_nelts_per_pattern = 1; > > > > > + } > > > > > /* (1) If SEL is a suitable mask as determined by > > > > > valid_mask_for_fold_vec_perm_cst_p, then: > > > > > res_npatterns = max of npatterns between ARG0, ARG1, and SEL > > > > > @@ -10634,7 +10639,7 @@ fold_vec_perm_cst (tree type, tree arg0, tree > > > > > arg1, const vec_perm_indices &sel, > > > > > res_npatterns = nelts in result vector. > > > > > res_nelts_per_pattern = 1. > > > > > This exception is made so that VLS ARG0, ARG1 and SEL work as > > > > > before. */ > > > > > - if (valid_mask_for_fold_vec_perm_cst_p (arg0, arg1, sel, reason)) > > > > > + else if (valid_mask_for_fold_vec_perm_cst_p (arg0, arg1, sel, > > > > > reason)) > > > > > { > > > > > res_npatterns > > > > > = std::max (VECTOR_CST_NPATTERNS (arg0), > > > > > @@ -10648,11 +10653,6 @@ fold_vec_perm_cst (tree type, tree arg0, > > > > > tree arg1, const vec_perm_indices &sel, > > > > > > > > > > res_nelts = res_npatterns * res_nelts_per_pattern; > > > > > } > > > > > - else if (TYPE_VECTOR_SUBPARTS (type).is_constant (&res_nelts)) > > > > > - { > > > > > - res_npatterns = res_nelts; > > > > > - res_nelts_per_pattern = 1; > > > > > - } > > > > > else > > > > > return NULL_TREE; > > > > > > > > > > diff --git a/gcc/testsuite/gcc.dg/torture/pr111048.c > > > > > b/gcc/testsuite/gcc.dg/torture/pr111048.c > > > > > new file mode 100644 > > > > > index 00000000000..475978aae2b > > > > > --- /dev/null > > > > > +++ b/gcc/testsuite/gcc.dg/torture/pr111048.c > > > > > @@ -0,0 +1,24 @@ > > > > > +/* { dg-do run } */ > > > > > +/* { dg-additional-options "-mavx2" { target avx2_runtime } } */ > > > > > + > > > > > +typedef unsigned char u8; > > > > > + > > > > > +__attribute__((noipa)) > > > > > +static void check(const u8 * v) { > > > > > + if (*v != 15) __builtin_trap(); > > > > > +} > > > > > + > > > > > +__attribute__((noipa)) > > > > > +static void bug(void) { > > > > > + u8 in_lanes[32]; > > > > > + for (unsigned i = 0; i < 32; i += 2) { > > > > > + in_lanes[i + 0] = 0; > > > > > + in_lanes[i + 1] = ((u8)0xff) >> (i & 7); > > > > > + } > > > > > + > > > > > + check(&in_lanes[13]); > > > > > + } > > > > > + > > > > > +int main() { > > > > > + bug(); > > > > > +} > > > > > > > > > > -- > > > Richard Biener <rguent...@suse.de> > > > SUSE Software Solutions Germany GmbH, > > > Frankenstrasse 146, 90461 Nuernberg, Germany; > > > GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg) > > > > -- > Richard Biener <rguent...@suse.de> > SUSE Software Solutions Germany GmbH, > Frankenstrasse 146, 90461 Nuernberg, Germany; > GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)