Joel Hutton <joel.hut...@arm.com> writes:
> Hi all,
>
> This patch fixes PR99102. For masked gather loads/scatter stores the
> 'loop_masks' variable was checked to be non-null, but the 'final_mask' was the
> actual mask used.
>
> bootstrapped and regression tested on aarch64. Regression tested on 
> aarch64_sve
> under qemu.
>
>  [Vect] Fix mask check on Scatter loads/stores
>
> Previously, IFN_MASK_SCATTER_STORE was used if 'loop_masks' was
> non-null, but the mask used is 'final_mask'. This caused a bug where
> a 'MASK_STORE' was vectorized into a 'SCATTER_STORE' instead of a
> 'MASK_SCATTER_STORE'. This fixes PR target/99102.
>
> gcc/ChangeLog:
>
>       PR target/99102
>       * tree-vect-stmts.c (vectorizable_store): Fix scatter store mask
>       check condition.
>       (vectorizable_load): Fix gather load mask check condition.
>
> gcc/testsuite/ChangeLog:
>
>       PR target/99102
>       * gcc.target/aarch64/sve/pr99102.c: New test.

(The filename is out of date, but the git hook would catch that.)

>
> diff --git a/gcc/testsuite/gcc.dg/vect/pr99102.c 
> b/gcc/testsuite/gcc.dg/vect/pr99102.c
> new file mode 100644
> index 
> 0000000000000000000000000000000000000000..9d321b97a68c05ad08646e8e2d6979c2030c65e6
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/vect/pr99102.c
> @@ -0,0 +1,20 @@
> +/* { dg-options "-O2 -ftree-vectorize" } */
> +/* { dg-additional-options "-msve-vector-bits=256" { target aarch64_sve } } 
> */

I should have realised this earlier, sorry, but I think we want
this to be conditional on aarch64_sve256_hw instead.  Same for the
scan-tree-dump below.

When running the testsuite on SVE hardware, the above would force
a VL of 256 even if the actual VL is different, so the test would
fail at runtime for 128-bit SVE, 512-bit SVE, etc.

Thanks,
Richard.

> +long a[44];
> +short d, e = -7;
> +__attribute__((noipa)) void b(char f, short j, short k, unsigned l) {
> +  for (int g = 0; g < 9; g += f)
> +    for (int b = 0; b < 90; b -= k)
> +      for (int h = 0; h < f; h++)
> +        for (short i = 0; i < 15; i += 4)
> +          if (!l)
> +            a[i] = j;
> +}
> +int main() {
> +  for (long c = 0; c < 2; ++c)
> +    a[c] = 7;
> +  b(9, d, e, 5);
> +  if (!a[0])
> +    __builtin_abort();
> +}
> +/* { dg-final { scan-tree-dump "MASK_SCATTER_STORE" "vect"  { target 
> aarch64_sve } } } */
> diff --git a/gcc/tree-vect-stmts.c b/gcc/tree-vect-stmts.c
> index 
> 85d3161fe3b2fbe289396457361c7af7519bd2b3..d791d3a47205a6f105e25df5fa22e403d30638eb
>  100644
> --- a/gcc/tree-vect-stmts.c
> +++ b/gcc/tree-vect-stmts.c
> @@ -8126,7 +8126,7 @@ vectorizable_store (vec_info *vinfo,
>               {
>                 tree scale = size_int (gs_info.scale);
>                 gcall *call;
> -               if (loop_masks)
> +               if (final_mask)
>                   call = gimple_build_call_internal
>                     (IFN_MASK_SCATTER_STORE, 5, dataref_ptr, vec_offset,
>                      scale, vec_oprnd, final_mask);
> @@ -9419,7 +9419,7 @@ vectorizable_load (vec_info *vinfo,
>                       tree zero = build_zero_cst (vectype);
>                       tree scale = size_int (gs_info.scale);
>                       gcall *call;
> -                     if (loop_masks)
> +                     if (final_mask)
>                         call = gimple_build_call_internal
>                           (IFN_MASK_GATHER_LOAD, 5, dataref_ptr,
>                            vec_offset, scale, zero, final_mask);

Reply via email to