Joel Hutton <joel.hut...@arm.com> writes:
> Hi all,
>
> This patch fixes PR99102. For masked gather loads/scatter stores the
> 'loop_masks' variable was checked to be non-null, but the 'final_mask'
> was the actual mask used.
>
> bootstrapped and regression tested on aarch64. Regression tested on
> aarch64_sve under qemu.
>
> [Vect] Fix mask check on Scatter loads/stores
>
> Previously, IFN_MASK_SCATTER_STORE was used if 'loop_masks' was
> non-null, but the mask used is 'final_mask'. This caused a bug where
> a 'MASK_STORE' was vectorized into a 'SCATTER_STORE' instead of a
> 'MASK_SCATTER_STORE'. This fixes PR target/99102.
>
> gcc/ChangeLog:
>
>         PR target/99102
>         * tree-vect-stmts.c (vectorizable_store): Fix scatter store mask
>         check condition.
>         (vectorizable_load): Fix gather load mask check condition.
>
> gcc/testsuite/ChangeLog:
>
>         PR target/99102
>         * gcc.target/aarch64/sve/pr99102.c: New test.
(The filename is out of date, but the git hook would catch that.)

> diff --git a/gcc/testsuite/gcc.dg/vect/pr99102.c b/gcc/testsuite/gcc.dg/vect/pr99102.c
> new file mode 100644
> index 0000000000000000000000000000000000000000..9d321b97a68c05ad08646e8e2d6979c2030c65e6
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/vect/pr99102.c
> @@ -0,0 +1,20 @@
> +/* { dg-options "-O2 -ftree-vectorize" } */
> +/* { dg-additional-options "-msve-vector-bits=256" { target aarch64_sve } } */

I should have realised this earlier, sorry, but I think we want this to
be conditional on aarch64_sve256_hw instead.  Same for the scan-tree-dump
below (a sketch of the adjusted testcase follows the quoted patch).

When running the testsuite on SVE hardware, the above would force a VL
of 256 even if the actual VL is different, so the test would fail at
runtime for 128-bit SVE, 512-bit SVE, etc.

Thanks,
Richard

> +long a[44];
> +short d, e = -7;
> +__attribute__((noipa)) void b(char f, short j, short k, unsigned l) {
> +  for (int g = 0; g < 9; g += f)
> +    for (int b = 0; b < 90; b -= k)
> +      for (int h = 0; h < f; h++)
> +        for (short i = 0; i < 15; i += 4)
> +          if (!l)
> +            a[i] = j;
> +}
> +int main() {
> +  for (long c = 0; c < 2; ++c)
> +    a[c] = 7;
> +  b(9, d, e, 5);
> +  if (!a[0])
> +    __builtin_abort();
> +}
> +/* { dg-final { scan-tree-dump "MASK_SCATTER_STORE" "vect" { target aarch64_sve } } } */
> diff --git a/gcc/tree-vect-stmts.c b/gcc/tree-vect-stmts.c
> index 85d3161fe3b2fbe289396457361c7af7519bd2b3..d791d3a47205a6f105e25df5fa22e403d30638eb 100644
> --- a/gcc/tree-vect-stmts.c
> +++ b/gcc/tree-vect-stmts.c
> @@ -8126,7 +8126,7 @@ vectorizable_store (vec_info *vinfo,
>  	{
>  	  tree scale = size_int (gs_info.scale);
>  	  gcall *call;
> -	  if (loop_masks)
> +	  if (final_mask)
>  	    call = gimple_build_call_internal
>  	      (IFN_MASK_SCATTER_STORE, 5, dataref_ptr, vec_offset,
>  	       scale, vec_oprnd, final_mask);
> @@ -9419,7 +9419,7 @@ vectorizable_load (vec_info *vinfo,
>  	    tree zero = build_zero_cst (vectype);
>  	    tree scale = size_int (gs_info.scale);
>  	    gcall *call;
> -	    if (loop_masks)
> +	    if (final_mask)
>  	      call = gimple_build_call_internal
>  	        (IFN_MASK_GATHER_LOAD, 5, dataref_ptr,
>  	         vec_offset, scale, zero, final_mask);
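
For illustration only, here is a sketch of how gcc.dg/vect/pr99102.c might
read with the suggestion above applied.  The C body is taken unchanged from
the posted patch; the only assumed change is swapping the two aarch64_sve
selectors for the aarch64_sve256_hw runtime check:

/* Sketch only: target selectors adjusted per the review above; test body
   unchanged from the posted patch.  */
/* { dg-options "-O2 -ftree-vectorize" } */
/* { dg-additional-options "-msve-vector-bits=256" { target aarch64_sve256_hw } } */
long a[44];
short d, e = -7;
__attribute__((noipa)) void b(char f, short j, short k, unsigned l) {
  for (int g = 0; g < 9; g += f)
    for (int b = 0; b < 90; b -= k)
      for (int h = 0; h < f; h++)
        for (short i = 0; i < 15; i += 4)
          if (!l)
            a[i] = j;
}
int main() {
  for (long c = 0; c < 2; ++c)
    a[c] = 7;
  b(9, d, e, 5);
  if (!a[0])
    __builtin_abort();
}
/* { dg-final { scan-tree-dump "MASK_SCATTER_STORE" "vect" { target aarch64_sve256_hw } } } */

The point of the selector, as explained in the review, is that
aarch64_sve256_hw only matches when the machine running the test really has
a 256-bit vector length, so -msve-vector-bits=256 (and the scan for
MASK_SCATTER_STORE) is only applied where the resulting executable can
actually run.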