> -----Original Message-----
> From: Srinath Parvathaneni <srinath.parvathan...@arm.com>
> Sent: 07 October 2020 07:14
> To: gcc-patches@gcc.gnu.org
> Cc: Kyrylo Tkachov <kyrylo.tkac...@arm.com>
> Subject: [PATCH][GCC] arm: Fix wrong code generated for mve scatter store
> with writeback intrinsics with -O2 (PR97271).
> 
> Hello,
> 
> This patch fixes the wrong code generated for MVE scatter store with
> writeback intrinsics at -O2 (PR97271).
> 
> $ cat bug.c
> #include "arm_mve.h"
> void
> foo (uint32x4_t * addr, const int offset, int32x4_t value)
> {
>   vstrwq_scatter_base_wb_s32 (addr, 8, value);
> }
> 
> $ arm-none-eabi-gcc bug.c -S -O2 -march=armv8.1-m.main+mve -mfloat-abi=hard -o -
> Without this patch:
> ...
> foo:
>       vldrw.32        q3, [r0]
>       vstrw.u32       q0, [q3, #8]!  ---> (A)
>       vldr.64 d4, .L3
>       vldr.64 d5, .L3+8
>       vldrw.32        q3, [r0]
>       vstrw.u32       q2, [q3, #8]!  ---> (B)
>       bx      lr
> ...
> 
> With this patch:
> ...
> foo:
>       vldrw.32        q3, [r0]
>       vstrw.u32       q0, [q3, #8]!  --> (C)
>       vstrw.32        q3, [r0]
>       bx      lr
> ...
> 
> Without this patch, two vstrw assembly instructions (A and B) are generated
> for the vstrwq_scatter_base_wb_s32 intrinsic, whereas the fix generates only
> one vstrw assembly instruction (C).
> 
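
For readers following along, here is a rough host-side model of the two
listings above. It is only a sketch: mem, store_insns, model_vstrw_wb,
old_sequence and new_sequence are hypothetical names, not GCC or arm_mve.h
code, and the contents of the .L3 constant pool are not shown in the listing,
so the model assumes zeros for q2.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define LANES 4
#define MEM_WORDS 64

/* Toy byte-addressed memory made of 32-bit words (host-side model only).  */
static int32_t mem[MEM_WORDS];
static int store_insns;		/* number of modelled vstrw executions */

/* Model of one "vstrw.u32 Qd, [Qm, #imm]!": every lane stores value[i]
   at base[i] + imm and then writes that address back into base[i].  */
static void
model_vstrw_wb (uint32_t base[LANES], int imm, const int32_t value[LANES])
{
  for (int i = 0; i < LANES; i++)
    {
      uint32_t addr = base[i] + (uint32_t) imm;
      assert (addr % 4 == 0 && addr / 4 < MEM_WORDS);
      mem[addr / 4] = value[i];
      base[i] = addr;
    }
  store_insns++;
}

/* "Without this patch" listing: two stores, and q3 is never stored
   back to [r0].  */
static void
old_sequence (const uint32_t *addr, int imm, const int32_t value[LANES])
{
  uint32_t q3[LANES];
  int32_t q2[LANES] = { 0 };	/* assumption: .L3 contents are not shown */

  memcpy (q3, addr, sizeof q3);		/* vldrw.32 q3, [r0] */
  model_vstrw_wb (q3, imm, value);	/* (A) */
  memcpy (q3, addr, sizeof q3);		/* vldrw.32 q3, [r0] */
  model_vstrw_wb (q3, imm, q2);		/* (B) overwrites the data from (A) */
}

/* "With this patch" listing: one store, then the updated base vector
   is written back through the pointer.  */
static void
new_sequence (uint32_t *addr, int imm, const int32_t value[LANES])
{
  uint32_t q3[LANES];

  memcpy (q3, addr, sizeof q3);		/* vldrw.32 q3, [r0] */
  model_vstrw_wb (q3, imm, value);	/* (C) */
  memcpy (addr, q3, sizeof q3);		/* vstrw.32 q3, [r0] */
}
```

In this model, calling old_sequence with base vector {0, 16, 32, 48} and
offset 8 runs two stores, leaves the q2 data in memory instead of value, and
leaves the caller's base vector unchanged. new_sequence runs one store, keeps
value in memory and updates the base vector to {8, 24, 40, 56}.
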
> Bootstrapped on arm-none-linux-gnueabihf, regression tested on
> arm-none-eabi, and found no regressions.
> 
> Ok for master? Ok for GCC-10 branch?

Ok for both.
Thanks,
Kyrill

> 
> Regards,
> Srinath.
> 
> gcc/ChangeLog:
> 
> 2020-10-06  Srinath Parvathaneni  <srinath.parvathan...@arm.com>
> 
>       PR target/97271
>       * config/arm/arm-builtins.c (arm_strsbwbs_qualifiers): Modify array.
>       (arm_strsbwbu_qualifiers): Likewise.
>       (arm_strsbwbs_p_qualifiers): Likewise.
>       (arm_strsbwbu_p_qualifiers): Likewise.
>       * config/arm/arm_mve.h (__arm_vstrdq_scatter_base_wb_s64): Modify
>       function definition.
>       (__arm_vstrdq_scatter_base_wb_u64): Likewise.
>       (__arm_vstrdq_scatter_base_wb_p_s64): Likewise.
>       (__arm_vstrdq_scatter_base_wb_p_u64): Likewise.
>       (__arm_vstrwq_scatter_base_wb_p_s32): Likewise.
>       (__arm_vstrwq_scatter_base_wb_p_u32): Likewise.
>       (__arm_vstrwq_scatter_base_wb_s32): Likewise.
>       (__arm_vstrwq_scatter_base_wb_u32): Likewise.
>       (__arm_vstrwq_scatter_base_wb_f32): Likewise.
>       (__arm_vstrwq_scatter_base_wb_p_f32): Likewise.
>       * config/arm/arm_mve_builtins.def (vstrwq_scatter_base_wb_add_u): Remove
>       expansion for the builtin.
>       (vstrwq_scatter_base_wb_add_s): Likewise.
>       (vstrwq_scatter_base_wb_add_f): Likewise.
>       (vstrdq_scatter_base_wb_add_u): Likewise.
>       (vstrdq_scatter_base_wb_add_s): Likewise.
>       (vstrwq_scatter_base_wb_p_add_u): Likewise.
>       (vstrwq_scatter_base_wb_p_add_s): Likewise.
>       (vstrwq_scatter_base_wb_p_add_f): Likewise.
>       (vstrdq_scatter_base_wb_p_add_u): Likewise.
>       (vstrdq_scatter_base_wb_p_add_s): Likewise.
>       * config/arm/mve.md (mve_vstrwq_scatter_base_wb_<supf>v4si): Remove
>       expand.
>       (mve_vstrwq_scatter_base_wb_add_<supf>v4si): Likewise.
>       (mve_vstrwq_scatter_base_wb_<supf>v4si_insn): Rename pattern to ...
>       (mve_vstrwq_scatter_base_wb_<supf>v4si): This.
>       (mve_vstrwq_scatter_base_wb_p_<supf>v4si): Remove expand.
>       (mve_vstrwq_scatter_base_wb_p_add_<supf>v4si): Likewise.
>       (mve_vstrwq_scatter_base_wb_p_<supf>v4si_insn): Rename pattern to ...
>       (mve_vstrwq_scatter_base_wb_p_<supf>v4si): This.
>       (mve_vstrwq_scatter_base_wb_fv4sf): Remove expand.
>       (mve_vstrwq_scatter_base_wb_add_fv4sf): Likewise.
>       (mve_vstrwq_scatter_base_wb_fv4sf_insn): Rename pattern to ...
>       (mve_vstrwq_scatter_base_wb_fv4sf): This.
>       (mve_vstrwq_scatter_base_wb_p_fv4sf): Remove expand.
>       (mve_vstrwq_scatter_base_wb_p_add_fv4sf): Likewise.
>       (mve_vstrwq_scatter_base_wb_p_fv4sf_insn): Rename pattern to ...
>       (mve_vstrwq_scatter_base_wb_p_fv4sf): This.
>       (mve_vstrdq_scatter_base_wb_<supf>v2di): Remove expand.
>       (mve_vstrdq_scatter_base_wb_add_<supf>v2di): Likewise.
>       (mve_vstrdq_scatter_base_wb_<supf>v2di_insn): Rename pattern to ...
>       (mve_vstrdq_scatter_base_wb_<supf>v2di): This.
>       (mve_vstrdq_scatter_base_wb_p_<supf>v2di): Remove expand.
>       (mve_vstrdq_scatter_base_wb_p_add_<supf>v2di): Likewise.
>       (mve_vstrdq_scatter_base_wb_p_<supf>v2di_insn): Rename pattern to ...
>       (mve_vstrdq_scatter_base_wb_p_<supf>v2di): This.
> 
> gcc/testsuite/ChangeLog:
> 
>       PR target/97271
>       * gcc.target/arm/mve/intrinsics/vstrdq_scatter_base_wb_p_s64.c: Modify.
>       * gcc.target/arm/mve/intrinsics/vstrdq_scatter_base_wb_p_u64.c: Likewise.
>       * gcc.target/arm/mve/intrinsics/vstrdq_scatter_base_wb_s64.c: Likewise.
>       * gcc.target/arm/mve/intrinsics/vstrdq_scatter_base_wb_u64.c: Likewise.
>       * gcc.target/arm/mve/intrinsics/vstrwq_scatter_base_wb_f32.c: Likewise.
>       * gcc.target/arm/mve/intrinsics/vstrwq_scatter_base_wb_p_f32.c: Likewise.
>       * gcc.target/arm/mve/intrinsics/vstrwq_scatter_base_wb_p_s32.c: Likewise.
>       * gcc.target/arm/mve/intrinsics/vstrwq_scatter_base_wb_p_u32.c: Likewise.
>       * gcc.target/arm/mve/intrinsics/vstrwq_scatter_base_wb_s32.c: Likewise.
>       * gcc.target/arm/mve/intrinsics/vstrwq_scatter_base_wb_u32.c: Likewise.
> 
> 
> ###############     Attachment also inlined for ease of reply
> ###############
> 
> 
> diff --git a/gcc/config/arm/arm-builtins.c b/gcc/config/arm/arm-builtins.c
> index 33e8015b1405033180adc9334bfec8583193777f..db505a4cbf9d19155a4ddecb40877b5cc7ee95e6 100644
> --- a/gcc/config/arm/arm-builtins.c
> +++ b/gcc/config/arm/arm-builtins.c
> @@ -811,23 +811,23 @@ arm_ldrgbwbu_z_qualifiers[SIMD_MAX_BUILTIN_ARGS]
> 
>  static enum arm_type_qualifiers
>  arm_strsbwbs_qualifiers[SIMD_MAX_BUILTIN_ARGS]
> -  = { qualifier_void, qualifier_unsigned, qualifier_const, qualifier_none};
> +  = { qualifier_unsigned, qualifier_unsigned, qualifier_const, qualifier_none};
>  #define STRSBWBS_QUALIFIERS (arm_strsbwbs_qualifiers)
> 
>  static enum arm_type_qualifiers
>  arm_strsbwbu_qualifiers[SIMD_MAX_BUILTIN_ARGS]
> -  = { qualifier_void, qualifier_unsigned, qualifier_const, qualifier_unsigned};
> +  = { qualifier_unsigned, qualifier_unsigned, qualifier_const, qualifier_unsigned};
>  #define STRSBWBU_QUALIFIERS (arm_strsbwbu_qualifiers)
> 
>  static enum arm_type_qualifiers
>  arm_strsbwbs_p_qualifiers[SIMD_MAX_BUILTIN_ARGS]
> -  = { qualifier_void, qualifier_unsigned, qualifier_const,
> +  = { qualifier_unsigned, qualifier_unsigned, qualifier_const,
>        qualifier_none, qualifier_unsigned};
>  #define STRSBWBS_P_QUALIFIERS (arm_strsbwbs_p_qualifiers)
> 
>  static enum arm_type_qualifiers
>  arm_strsbwbu_p_qualifiers[SIMD_MAX_BUILTIN_ARGS]
> -  = { qualifier_void, qualifier_unsigned, qualifier_const,
> +  = { qualifier_unsigned, qualifier_unsigned, qualifier_const,
>        qualifier_unsigned, qualifier_unsigned};
>  #define STRSBWBU_P_QUALIFIERS (arm_strsbwbu_p_qualifiers)
> 
> diff --git a/gcc/config/arm/arm_mve.h b/gcc/config/arm/arm_mve.h
> index 99cff41cccbe22f5f6bfe8db513092830885976c..39cac6fd0352a599075ad9eb1809a4ef18635037 100644
> --- a/gcc/config/arm/arm_mve.h
> +++ b/gcc/config/arm/arm_mve.h
> @@ -13993,64 +13993,56 @@ __extension__ extern __inline void
>  __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
>  __arm_vstrdq_scatter_base_wb_s64 (uint64x2_t * __addr, const int
> __offset, int64x2_t __value)
>  {
> -  __builtin_mve_vstrdq_scatter_base_wb_sv2di (*__addr, __offset, __value);
> -  __builtin_mve_vstrdq_scatter_base_wb_add_sv2di (*__addr, __offset,
> *__addr);
> +  *__addr = __builtin_mve_vstrdq_scatter_base_wb_sv2di (*__addr,
> __offset, __value);
>  }
> 
>  __extension__ extern __inline void
>  __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
>  __arm_vstrdq_scatter_base_wb_u64 (uint64x2_t * __addr, const int
> __offset, uint64x2_t __value)
>  {
> -  __builtin_mve_vstrdq_scatter_base_wb_uv2di (*__addr, __offset,
> __value);
> -  __builtin_mve_vstrdq_scatter_base_wb_add_uv2di (*__addr, __offset,
> *__addr);
> +  *__addr = __builtin_mve_vstrdq_scatter_base_wb_uv2di (*__addr,
> __offset, __value);
>  }
> 
>  __extension__ extern __inline void
>  __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
>  __arm_vstrdq_scatter_base_wb_p_s64 (uint64x2_t * __addr, const int
> __offset, int64x2_t __value, mve_pred16_t __p)
>  {
> -  __builtin_mve_vstrdq_scatter_base_wb_p_sv2di (*__addr, __offset,
> __value, __p);
> -  __builtin_mve_vstrdq_scatter_base_wb_p_add_sv2di (*__addr, __offset,
> *__addr, __p);
> + *__addr =  __builtin_mve_vstrdq_scatter_base_wb_p_sv2di (*__addr,
> __offset, __value, __p);
>  }
> 
>  __extension__ extern __inline void
>  __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
>  __arm_vstrdq_scatter_base_wb_p_u64 (uint64x2_t * __addr, const int
> __offset, uint64x2_t __value, mve_pred16_t __p)
>  {
> -  __builtin_mve_vstrdq_scatter_base_wb_p_uv2di (*__addr, __offset,
> __value, __p);
> -  __builtin_mve_vstrdq_scatter_base_wb_p_add_uv2di (*__addr, __offset,
> *__addr, __p);
> +  *__addr = __builtin_mve_vstrdq_scatter_base_wb_p_uv2di (*__addr,
> __offset, __value, __p);
>  }
> 
>  __extension__ extern __inline void
>  __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
>  __arm_vstrwq_scatter_base_wb_p_s32 (uint32x4_t * __addr, const int
> __offset, int32x4_t __value, mve_pred16_t __p)
>  {
> -  __builtin_mve_vstrwq_scatter_base_wb_p_sv4si (*__addr, __offset,
> __value, __p);
> -  __builtin_mve_vstrwq_scatter_base_wb_p_add_sv4si (*__addr, __offset,
> *__addr, __p);
> +  *__addr = __builtin_mve_vstrwq_scatter_base_wb_p_sv4si (*__addr,
> __offset, __value, __p);
>  }
> 
>  __extension__ extern __inline void
>  __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
>  __arm_vstrwq_scatter_base_wb_p_u32 (uint32x4_t * __addr, const int
> __offset, uint32x4_t __value, mve_pred16_t __p)
>  {
> -  __builtin_mve_vstrwq_scatter_base_wb_p_uv4si (*__addr, __offset,
> __value, __p);
> -  __builtin_mve_vstrwq_scatter_base_wb_p_add_uv4si (*__addr, __offset,
> *__addr, __p);
> +  *__addr = __builtin_mve_vstrwq_scatter_base_wb_p_uv4si (*__addr,
> __offset, __value, __p);
>  }
> 
>  __extension__ extern __inline void
>  __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
>  __arm_vstrwq_scatter_base_wb_s32 (uint32x4_t * __addr, const int
> __offset, int32x4_t __value)
>  {
> -  __builtin_mve_vstrwq_scatter_base_wb_sv4si (*__addr, __offset,
> __value);
> -  __builtin_mve_vstrwq_scatter_base_wb_add_sv4si (*__addr, __offset,
> *__addr);
> +  *__addr = __builtin_mve_vstrwq_scatter_base_wb_sv4si (*__addr,
> __offset, __value);
>  }
> 
>  __extension__ extern __inline void
>  __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
>  __arm_vstrwq_scatter_base_wb_u32 (uint32x4_t * __addr, const int
> __offset, uint32x4_t __value)
>  {
> -  __builtin_mve_vstrwq_scatter_base_wb_uv4si (*__addr, __offset,
> __value);
> -  __builtin_mve_vstrwq_scatter_base_wb_add_uv4si (*__addr, __offset,
> *__addr);
> +  *__addr = __builtin_mve_vstrwq_scatter_base_wb_uv4si (*__addr,
> __offset, __value);
>  }
> 
>  __extension__ extern __inline uint8x16_t
> @@ -19158,16 +19150,14 @@ __extension__ extern __inline void
>  __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
>  __arm_vstrwq_scatter_base_wb_f32 (uint32x4_t * __addr, const int
> __offset, float32x4_t __value)
>  {
> -  __builtin_mve_vstrwq_scatter_base_wb_fv4sf (*__addr, __offset,
> __value);
> -  __builtin_mve_vstrwq_scatter_base_wb_add_fv4sf (*__addr, __offset,
> *__addr);
> +  *__addr = __builtin_mve_vstrwq_scatter_base_wb_fv4sf (*__addr,
> __offset, __value);
>  }
> 
>  __extension__ extern __inline void
>  __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
>  __arm_vstrwq_scatter_base_wb_p_f32 (uint32x4_t * __addr, const int
> __offset, float32x4_t __value, mve_pred16_t __p)
>  {
> -  __builtin_mve_vstrwq_scatter_base_wb_p_fv4sf (*__addr, __offset,
> __value, __p);
> -  __builtin_mve_vstrwq_scatter_base_wb_p_add_fv4sf (*__addr, __offset,
> *__addr, __p);
> +  *__addr = __builtin_mve_vstrwq_scatter_base_wb_p_fv4sf (*__addr,
> __offset, __value, __p);
>  }
> 
>  __extension__ extern __inline float16x8_t
> diff --git a/gcc/config/arm/arm_mve_builtins.def b/gcc/config/arm/arm_mve_builtins.def
> index 753e40a951d071c1ab77476a1cc4779e91689178..55d426fbd14da6a536209f04f7f38de21c68b720 100644
> --- a/gcc/config/arm/arm_mve_builtins.def
> +++ b/gcc/config/arm/arm_mve_builtins.def
> @@ -828,19 +828,9 @@ VAR3
> (QUADOP_UNONE_UNONE_UNONE_IMM_UNONE, vidupq_m_n_u, v16qi,
> v8hi, v4si)
>  VAR3 (TERNOP_UNONE_UNONE_UNONE_IMM, vdwdupq_n_u, v16qi, v4si,
> v8hi)
>  VAR3 (TERNOP_UNONE_UNONE_UNONE_IMM, viwdupq_n_u, v16qi, v4si,
> v8hi)
>  VAR1 (STRSBWBU, vstrwq_scatter_base_wb_u, v4si)
> -VAR1 (STRSBWBU, vstrwq_scatter_base_wb_add_u, v4si)
> -VAR1 (STRSBWBU, vstrwq_scatter_base_wb_add_s, v4si)
> -VAR1 (STRSBWBU, vstrwq_scatter_base_wb_add_f, v4sf)
>  VAR1 (STRSBWBU, vstrdq_scatter_base_wb_u, v2di)
> -VAR1 (STRSBWBU, vstrdq_scatter_base_wb_add_u, v2di)
> -VAR1 (STRSBWBU, vstrdq_scatter_base_wb_add_s, v2di)
>  VAR1 (STRSBWBU_P, vstrwq_scatter_base_wb_p_u, v4si)
> -VAR1 (STRSBWBU_P, vstrwq_scatter_base_wb_p_add_u, v4si)
> -VAR1 (STRSBWBU_P, vstrwq_scatter_base_wb_p_add_s, v4si)
> -VAR1 (STRSBWBU_P, vstrwq_scatter_base_wb_p_add_f, v4sf)
>  VAR1 (STRSBWBU_P, vstrdq_scatter_base_wb_p_u, v2di)
> -VAR1 (STRSBWBU_P, vstrdq_scatter_base_wb_p_add_u, v2di)
> -VAR1 (STRSBWBU_P, vstrdq_scatter_base_wb_p_add_s, v2di)
>  VAR1 (STRSBWBS, vstrwq_scatter_base_wb_s, v4si)
>  VAR1 (STRSBWBS, vstrwq_scatter_base_wb_f, v4sf)
>  VAR1 (STRSBWBS, vstrdq_scatter_base_wb_s, v2di)
> diff --git a/gcc/config/arm/mve.md b/gcc/config/arm/mve.md
> index 465b39a51b3a258295ed764f0e742932e5d59225..dea479b04cbe0d63878ac74f321bcd5f9263175c 100644
> --- a/gcc/config/arm/mve.md
> +++ b/gcc/config/arm/mve.md
> @@ -10315,38 +10315,10 @@
>    [(set_attr "type" "mve_move")
>     (set_attr "length""8")])
> 
> -(define_expand "mve_vstrwq_scatter_base_wb_<supf>v4si"
> -  [(match_operand:V4SI 0 "s_register_operand" "=w")
> -   (match_operand:SI 1 "mve_vldrd_immediate" "Ri")
> -   (match_operand:V4SI 2 "s_register_operand" "w")
> -   (unspec:V4SI [(const_int 0)] VSTRWSBWBQ)]
> -  "TARGET_HAVE_MVE"
> -{
> -  rtx ignore_wb = gen_reg_rtx (V4SImode);
> -  emit_insn (
> -  gen_mve_vstrwq_scatter_base_wb_<supf>v4si_insn (ignore_wb,
> operands[0],
> -                                               operands[1], operands[2]));
> -  DONE;
> -})
> -
> -(define_expand "mve_vstrwq_scatter_base_wb_add_<supf>v4si"
> -  [(match_operand:V4SI 0 "s_register_operand" "=w")
> -   (match_operand:SI 1 "mve_vldrd_immediate" "Ri")
> -   (match_operand:V4SI 2 "s_register_operand" "0")
> -   (unspec:V4SI [(const_int 0)] VSTRWSBWBQ)]
> -  "TARGET_HAVE_MVE"
> -{
> -  rtx ignore_vec = gen_reg_rtx (V4SImode);
> -  emit_insn (
> -  gen_mve_vstrwq_scatter_base_wb_<supf>v4si_insn (operands[0],
> operands[2],
> -                                               operands[1], ignore_vec));
> -  DONE;
> -})
> -
>  ;;
> -;; [vstrwq_scatter_base_wb_s vstrdq_scatter_base_wb_u]
> +;; [vstrwq_scatter_base_wb_s vstrwq_scatter_base_wb_u]
>  ;;
> -(define_insn "mve_vstrwq_scatter_base_wb_<supf>v4si_insn"
> +(define_insn "mve_vstrwq_scatter_base_wb_<supf>v4si"
>    [(set (mem:BLK (scratch))
>       (unspec:BLK
>               [(match_operand:V4SI 1 "s_register_operand" "0")
> @@ -10368,42 +10340,10 @@
>  }
>    [(set_attr "length" "4")])
> 
> -(define_expand "mve_vstrwq_scatter_base_wb_p_<supf>v4si"
> -  [(match_operand:V4SI 0 "s_register_operand" "=w")
> -   (match_operand:SI 1 "mve_vldrd_immediate" "Ri")
> -   (match_operand:V4SI 2 "s_register_operand" "w")
> -   (match_operand:HI 3 "vpr_register_operand")
> -   (unspec:V4SI [(const_int 0)] VSTRWSBWBQ)]
> -  "TARGET_HAVE_MVE"
> -{
> -  rtx ignore_wb = gen_reg_rtx (V4SImode);
> -  emit_insn (
> -  gen_mve_vstrwq_scatter_base_wb_p_<supf>v4si_insn (ignore_wb,
> operands[0],
> -                                                 operands[1], operands[2],
> -                                                 operands[3]));
> -  DONE;
> -})
> -
> -(define_expand "mve_vstrwq_scatter_base_wb_p_add_<supf>v4si"
> -  [(match_operand:V4SI 0 "s_register_operand" "=w")
> -   (match_operand:SI 1 "mve_vldrd_immediate" "Ri")
> -   (match_operand:V4SI 2 "s_register_operand" "0")
> -   (match_operand:HI 3 "vpr_register_operand")
> -   (unspec:V4SI [(const_int 0)] VSTRWSBWBQ)]
> -  "TARGET_HAVE_MVE"
> -{
> -  rtx ignore_vec = gen_reg_rtx (V4SImode);
> -  emit_insn (
> -  gen_mve_vstrwq_scatter_base_wb_p_<supf>v4si_insn (operands[0],
> operands[2],
> -                                                 operands[1], ignore_vec,
> -                                                 operands[3]));
> -  DONE;
> -})
> -
>  ;;
>  ;; [vstrwq_scatter_base_wb_p_s vstrwq_scatter_base_wb_p_u]
>  ;;
> -(define_insn "mve_vstrwq_scatter_base_wb_p_<supf>v4si_insn"
> +(define_insn "mve_vstrwq_scatter_base_wb_p_<supf>v4si"
>   [(set (mem:BLK (scratch))
>         (unspec:BLK
>               [(match_operand:V4SI 1 "s_register_operand" "0")
> @@ -10426,38 +10366,10 @@
>  }
>    [(set_attr "length" "8")])
> 
> -(define_expand "mve_vstrwq_scatter_base_wb_fv4sf"
> -  [(match_operand:V4SI 0 "s_register_operand" "=w")
> -   (match_operand:SI 1 "mve_vldrd_immediate" "Ri")
> -   (match_operand:V4SF 2 "s_register_operand" "w")
> -   (unspec:V4SI [(const_int 0)] VSTRWQSBWB_F)]
> -  "TARGET_HAVE_MVE && TARGET_HAVE_MVE_FLOAT"
> -{
> -  rtx ignore_wb = gen_reg_rtx (V4SImode);
> -  emit_insn (
> -  gen_mve_vstrwq_scatter_base_wb_fv4sf_insn (ignore_wb,operands[0],
> -                                          operands[1], operands[2]));
> -  DONE;
> -})
> -
> -(define_expand "mve_vstrwq_scatter_base_wb_add_fv4sf"
> -  [(match_operand:V4SI 0 "s_register_operand" "=w")
> -   (match_operand:SI 1 "mve_vldrd_immediate" "Ri")
> -   (match_operand:V4SI 2 "s_register_operand" "0")
> -   (unspec:V4SI [(const_int 0)] VSTRWQSBWB_F)]
> -  "TARGET_HAVE_MVE && TARGET_HAVE_MVE_FLOAT"
> -{
> -  rtx ignore_vec = gen_reg_rtx (V4SFmode);
> -  emit_insn (
> -  gen_mve_vstrwq_scatter_base_wb_fv4sf_insn (operands[0], operands[2],
> -                                          operands[1], ignore_vec));
> -  DONE;
> -})
> -
>  ;;
>  ;; [vstrwq_scatter_base_wb_f]
>  ;;
> -(define_insn "mve_vstrwq_scatter_base_wb_fv4sf_insn"
> +(define_insn "mve_vstrwq_scatter_base_wb_fv4sf"
>   [(set (mem:BLK (scratch))
>         (unspec:BLK
>               [(match_operand:V4SI 1 "s_register_operand" "0")
> @@ -10479,42 +10391,10 @@
>  }
>    [(set_attr "length" "4")])
> 
> -(define_expand "mve_vstrwq_scatter_base_wb_p_fv4sf"
> -  [(match_operand:V4SI 0 "s_register_operand" "=w")
> -   (match_operand:SI 1 "mve_vldrd_immediate" "Ri")
> -   (match_operand:V4SF 2 "s_register_operand" "w")
> -   (match_operand:HI 3 "vpr_register_operand")
> -   (unspec:V4SI [(const_int 0)] VSTRWQSBWB_F)]
> -  "TARGET_HAVE_MVE && TARGET_HAVE_MVE_FLOAT"
> -{
> -  rtx ignore_wb = gen_reg_rtx (V4SImode);
> -  emit_insn (
> -  gen_mve_vstrwq_scatter_base_wb_p_fv4sf_insn (ignore_wb, operands[0],
> -                                            operands[1], operands[2],
> -                                            operands[3]));
> -  DONE;
> -})
> -
> -(define_expand "mve_vstrwq_scatter_base_wb_p_add_fv4sf"
> -  [(match_operand:V4SI 0 "s_register_operand" "=w")
> -   (match_operand:SI 1 "mve_vldrd_immediate" "Ri")
> -   (match_operand:V4SI 2 "s_register_operand" "0")
> -   (match_operand:HI 3 "vpr_register_operand")
> -   (unspec:V4SI [(const_int 0)] VSTRWQSBWB_F)]
> -  "TARGET_HAVE_MVE && TARGET_HAVE_MVE_FLOAT"
> -{
> -  rtx ignore_vec = gen_reg_rtx (V4SFmode);
> -  emit_insn (
> -  gen_mve_vstrwq_scatter_base_wb_p_fv4sf_insn (operands[0],
> operands[2],
> -                                            operands[1], ignore_vec,
> -                                            operands[3]));
> -  DONE;
> -})
> -
>  ;;
>  ;; [vstrwq_scatter_base_wb_p_f]
>  ;;
> -(define_insn "mve_vstrwq_scatter_base_wb_p_fv4sf_insn"
> +(define_insn "mve_vstrwq_scatter_base_wb_p_fv4sf"
>   [(set (mem:BLK (scratch))
>         (unspec:BLK
>               [(match_operand:V4SI 1 "s_register_operand" "0")
> @@ -10537,38 +10417,10 @@
>  }
>    [(set_attr "length" "8")])
> 
> -(define_expand "mve_vstrdq_scatter_base_wb_<supf>v2di"
> -  [(match_operand:V2DI 0 "s_register_operand" "=w")
> -   (match_operand:SI 1 "mve_vldrd_immediate" "Ri")
> -   (match_operand:V2DI 2 "s_register_operand" "w")
> -   (unspec:V2DI [(const_int 0)] VSTRDSBWBQ)]
> -  "TARGET_HAVE_MVE"
> -{
> -  rtx ignore_wb = gen_reg_rtx (V2DImode);
> -  emit_insn (
> -  gen_mve_vstrdq_scatter_base_wb_<supf>v2di_insn (ignore_wb,
> operands[0],
> -                                               operands[1], operands[2]));
> -  DONE;
> -})
> -
> -(define_expand "mve_vstrdq_scatter_base_wb_add_<supf>v2di"
> -  [(match_operand:V2DI 0 "s_register_operand" "=w")
> -   (match_operand:SI 1 "mve_vldrd_immediate" "Ri")
> -   (match_operand:V2DI 2 "s_register_operand" "0")
> -   (unspec:V2DI [(const_int 0)] VSTRDSBWBQ)]
> -  "TARGET_HAVE_MVE"
> -{
> -  rtx ignore_vec = gen_reg_rtx (V2DImode);
> -  emit_insn (
> -  gen_mve_vstrdq_scatter_base_wb_<supf>v2di_insn (operands[0],
> operands[2],
> -                                               operands[1], ignore_vec));
> -  DONE;
> -})
> -
>  ;;
>  ;; [vstrdq_scatter_base_wb_s vstrdq_scatter_base_wb_u]
>  ;;
> -(define_insn "mve_vstrdq_scatter_base_wb_<supf>v2di_insn"
> +(define_insn "mve_vstrdq_scatter_base_wb_<supf>v2di"
>    [(set (mem:BLK (scratch))
>       (unspec:BLK
>               [(match_operand:V2DI 1 "s_register_operand" "0")
> @@ -10590,42 +10442,10 @@
>  }
>    [(set_attr "length" "4")])
> 
> -(define_expand "mve_vstrdq_scatter_base_wb_p_<supf>v2di"
> -  [(match_operand:V2DI 0 "s_register_operand" "=w")
> -   (match_operand:SI 1 "mve_vldrd_immediate" "Ri")
> -   (match_operand:V2DI 2 "s_register_operand" "w")
> -   (match_operand:HI 3 "vpr_register_operand")
> -   (unspec:V2DI [(const_int 0)] VSTRDSBWBQ)]
> -  "TARGET_HAVE_MVE"
> -{
> -  rtx ignore_wb = gen_reg_rtx (V2DImode);
> -  emit_insn (
> -  gen_mve_vstrdq_scatter_base_wb_p_<supf>v2di_insn (ignore_wb,
> operands[0],
> -                                                 operands[1], operands[2],
> -                                                 operands[3]));
> -  DONE;
> -})
> -
> -(define_expand "mve_vstrdq_scatter_base_wb_p_add_<supf>v2di"
> -  [(match_operand:V2DI 0 "s_register_operand" "=w")
> -   (match_operand:SI 1 "mve_vldrd_immediate" "Ri")
> -   (match_operand:V2DI 2 "s_register_operand" "0")
> -   (match_operand:HI 3 "vpr_register_operand")
> -   (unspec:V2DI [(const_int 0)] VSTRDSBWBQ)]
> -  "TARGET_HAVE_MVE"
> -{
> -  rtx ignore_vec = gen_reg_rtx (V2DImode);
> -  emit_insn (
> -  gen_mve_vstrdq_scatter_base_wb_p_<supf>v2di_insn (operands[0],
> operands[2],
> -                                                 operands[1], ignore_vec,
> -                                                 operands[3]));
> -  DONE;
> -})
> -
>  ;;
>  ;; [vstrdq_scatter_base_wb_p_s vstrdq_scatter_base_wb_p_u]
>  ;;
> -(define_insn "mve_vstrdq_scatter_base_wb_p_<supf>v2di_insn"
> +(define_insn "mve_vstrdq_scatter_base_wb_p_<supf>v2di"
>    [(set (mem:BLK (scratch))
>       (unspec:BLK
>               [(match_operand:V2DI 1 "s_register_operand" "0")
> @@ -10643,7 +10463,7 @@
>     ops[0] = operands[1];
>     ops[1] = operands[2];
>     ops[2] = operands[3];
> -   output_asm_insn ("vpst\;\tvstrdt.u64\t%q2, [%q0, %1]!",ops);
> +   output_asm_insn ("vpst;vstrdt.u64\t%q2, [%q0, %1]!",ops);
>     return "";
>  }
>    [(set_attr "length" "8")])
> diff --git
> a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vstrdq_scatter_base_wb_p_s
> 64.c
> b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vstrdq_scatter_base_wb_p_s
> 64.c
> index
> 6570d4abd23ecfaf9d279760814fddeb848712f5..319188b706fb737aef49dfd
> 3a6e64545a63f2087 100644
> ---
> a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vstrdq_scatter_base_wb_p_s
> 64.c
> +++
> b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vstrdq_scatter_base_wb_p_s
> 64.c
> @@ -10,12 +10,10 @@ foo (uint64x2_t * addr, const int offset, int64x2_t
> value, mve_pred16_t p)
>    vstrdq_scatter_base_wb_p_s64 (addr, 8, value, p);
>  }
> 
> -/* { dg-final { scan-assembler "vstrdt.u64"  }  } */
> -
>  void
>  foo1 (uint64x2_t * addr, const int offset, int64x2_t value, mve_pred16_t p)
>  {
>    vstrdq_scatter_base_wb_p (addr, 8, value, p);
>  }
> 
> -/* { dg-final { scan-assembler "vstrdt.u64"  }  } */
> +/* { dg-final { scan-assembler-times "vstrdt.u64\tq\[0-9\]+, \\\[q\[0-9\]+,
> #\[0-9\]+\\\]!" 2 } } */
> diff --git
> a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vstrdq_scatter_base_wb_p_u
> 64.c
> b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vstrdq_scatter_base_wb_p_u
> 64.c
> index
> 8444a3acd4c090f182bae7a8e144715f0dd56ba7..940b5421c840a1841d7e01
> 8abeef2342ab653f1b 100644
> ---
> a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vstrdq_scatter_base_wb_p_u
> 64.c
> +++
> b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vstrdq_scatter_base_wb_p_u
> 64.c
> @@ -10,12 +10,10 @@ foo (uint64x2_t * addr, const int offset, uint64x2_t
> value, mve_pred16_t p)
>    vstrdq_scatter_base_wb_p_u64 (addr, 8, value, p);
>  }
> 
> -/* { dg-final { scan-assembler "vstrdt.u64"  }  } */
> -
>  void
>  foo1 (uint64x2_t * addr, const int offset, uint64x2_t value, mve_pred16_t p)
>  {
>    vstrdq_scatter_base_wb_p (addr, 8, value, p);
>  }
> 
> -/* { dg-final { scan-assembler "vstrdt.u64"  }  } */
> +/* { dg-final { scan-assembler-times "vstrdt.u64\tq\[0-9\]+, \\\[q\[0-9\]+,
> #\[0-9\]+\\\]!" 2 } } */
> diff --git
> a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vstrdq_scatter_base_wb_s64.
> c
> b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vstrdq_scatter_base_wb_s64.
> c
> index
> e0ec283d10068da7db0847e11611adbcb386cfbc..33926d5c9e2e85188b222a
> 9c903a966c52195fa5 100644
> ---
> a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vstrdq_scatter_base_wb_s64.
> c
> +++
> b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vstrdq_scatter_base_wb_s64.
> c
> @@ -10,12 +10,10 @@ foo (uint64x2_t * addr, const int offset, int64x2_t
> value)
>    vstrdq_scatter_base_wb_s64 (addr, 8, value);
>  }
> 
> -/* { dg-final { scan-assembler "vstrd.u64"  }  } */
> -
>  void
>  foo1 (uint64x2_t * addr, const int offset, int64x2_t value)
>  {
>    vstrdq_scatter_base_wb (addr, 8, value);
>  }
> 
> -/* { dg-final { scan-assembler "vstrd.u64"  }  } */
> +/* { dg-final { scan-assembler-times "vstrd.u64\tq\[0-9\]+, \\\[q\[0-9\]+,
> #\[0-9\]+\\\]!" 2 } } */
> diff --git
> a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vstrdq_scatter_base_wb_u64
> .c
> b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vstrdq_scatter_base_wb_u64
> .c
> index
> fe41d6b5c74514cdec25a69e9f44f3df3493342b..b7ffcf9b5dd13db0f4785c3e
> e55231ec2b75d240 100644
> ---
> a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vstrdq_scatter_base_wb_u64
> .c
> +++
> b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vstrdq_scatter_base_wb_u64
> .c
> @@ -10,12 +10,10 @@ foo (uint64x2_t * addr, const int offset, uint64x2_t
> value)
>    vstrdq_scatter_base_wb_u64 (addr, 8, value);
>  }
> 
> -/* { dg-final { scan-assembler "vstrd.u64"  }  } */
> -
>  void
>  foo1 (uint64x2_t * addr, const int offset, uint64x2_t value)
>  {
>    vstrdq_scatter_base_wb (addr, 8, value);
>  }
> 
> -/* { dg-final { scan-assembler "vstrd.u64"  }  } */
> +/* { dg-final { scan-assembler-times "vstrd.u64\tq\[0-9\]+, \\\[q\[0-9\]+,
> #\[0-9\]+\\\]!" 2 } } */
> diff --git
> a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vstrwq_scatter_base_wb_f32
> .c
> b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vstrwq_scatter_base_wb_f32
> .c
> index
> f4ceabb8680c5044dae38bea3af351d5cd5d6085..b2cc6e555aeb0ce5415cefe
> 2970b8d7a711661f3 100644
> ---
> a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vstrwq_scatter_base_wb_f32
> .c
> +++
> b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vstrwq_scatter_base_wb_f32
> .c
> @@ -10,12 +10,10 @@ foo (uint32x4_t * addr, const int offset, float32x4_t
> value)
>    vstrwq_scatter_base_wb_f32 (addr, 8, value);
>  }
> 
> -/* { dg-final { scan-assembler "vstrw.u32"  }  } */
> -
>  void
>  foo1 (uint32x4_t * addr, const int offset, float32x4_t value)
>  {
>    vstrwq_scatter_base_wb (addr, 8, value);
>  }
> 
> -/* { dg-final { scan-assembler "vstrw.u32"  }  } */
> +/* { dg-final { scan-assembler-times "vstrw.u32\tq\[0-9\]+, \\\[q\[0-9\]+,
> #\[0-9\]+\\\]!" 2 } } */
> diff --git
> a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vstrwq_scatter_base_wb_p_f
> 32.c
> b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vstrwq_scatter_base_wb_p_f
> 32.c
> index
> cb2eb685139fe0db53281136e1ba235988bc731a..4befd49d7b92b0fc4de498
> 8db91f9eec7b3d33ec 100644
> ---
> a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vstrwq_scatter_base_wb_p_f
> 32.c
> +++
> b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vstrwq_scatter_base_wb_p_f
> 32.c
> @@ -10,12 +10,10 @@ foo (uint32x4_t * addr, const int offset, float32x4_t
> value, mve_pred16_t p)
>    vstrwq_scatter_base_wb_p_f32 (addr, 8, value, p);
>  }
> 
> -/* { dg-final { scan-assembler "vstrwt.u32"  }  } */
> -
>  void
>  foo1 (uint32x4_t * addr, const int offset, float32x4_t value, mve_pred16_t p)
>  {
>    vstrwq_scatter_base_wb_p (addr, 8, value, p);
>  }
> 
> -/* { dg-final { scan-assembler "vstrwt.u32"  }  } */
> +/* { dg-final { scan-assembler-times "vstrwt.u32\tq\[0-9\]+, \\\[q\[0-9\]+,
> #\[0-9\]+\\\]!" 2 } } */
> diff --git
> a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vstrwq_scatter_base_wb_p_s
> 32.c
> b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vstrwq_scatter_base_wb_p_s
> 32.c
> index
> d973c021ba372ef31b493cca61655134def723d8..dfb1827c4f08232b63ceccf
> 89b2604fec2890a3f 100644
> ---
> a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vstrwq_scatter_base_wb_p_s
> 32.c
> +++
> b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vstrwq_scatter_base_wb_p_s
> 32.c
> @@ -10,12 +10,10 @@ foo (uint32x4_t * addr, const int offset, int32x4_t
> value, mve_pred16_t p)
>    vstrwq_scatter_base_wb_p_s32 (addr, 8, value, p);
>  }
> 
> -/* { dg-final { scan-assembler "vstrwt.u32"  }  } */
> -
>  void
>  foo1 (uint32x4_t * addr, const int offset, int32x4_t value, mve_pred16_t p)
>  {
>    vstrwq_scatter_base_wb_p (addr, 8, value, p);
>  }
> 
> -/* { dg-final { scan-assembler "vstrwt.u32"  }  } */
> +/* { dg-final { scan-assembler-times "vstrwt.u32\tq\[0-9\]+, \\\[q\[0-9\]+,
> #\[0-9\]+\\\]!" 2 } } */
> diff --git
> a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vstrwq_scatter_base_wb_p_
> u32.c
> b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vstrwq_scatter_base_wb_p_
> u32.c
> index
> c0f0964c657711b45b5ecdda6386ee3656bb221c..4eb78c600be9749fca86e2
> 89c67e388f78753532 100644
> ---
> a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vstrwq_scatter_base_wb_p_
> u32.c
> +++
> b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vstrwq_scatter_base_wb_p_
> u32.c
> @@ -10,12 +10,10 @@ foo (uint32x4_t * addr, const int offset, uint32x4_t
> value, mve_pred16_t p)
>    vstrwq_scatter_base_wb_p_u32 (addr, 8, value, p);
>  }
> 
> -/* { dg-final { scan-assembler "vstrwt.u32"  }  } */
> -
>  void
>  foo1 (uint32x4_t * addr, const int offset, uint32x4_t value, mve_pred16_t p)
>  {
>    vstrwq_scatter_base_wb_p (addr, 8, value, p);
>  }
> 
> -/* { dg-final { scan-assembler "vstrwt.u32"  }  } */
> +/* { dg-final { scan-assembler-times "vstrwt.u32\tq\[0-9\]+, \\\[q\[0-9\]+,
> #\[0-9\]+\\\]!" 2 } } */
> diff --git
> a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vstrwq_scatter_base_wb_s32
> .c
> b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vstrwq_scatter_base_wb_s32
> .c
> index
> 6ef095526e5eb21b3ed5d27c85566dadad07966e..618dbaf5aa69421ee80aca
> 62904ce915306c54fd 100644
> ---
> a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vstrwq_scatter_base_wb_s32
> .c
> +++
> b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vstrwq_scatter_base_wb_s32
> .c
> @@ -10,12 +10,10 @@ foo (uint32x4_t * addr, const int offset, int32x4_t
> value)
>    vstrwq_scatter_base_wb_s32 (addr, 8, value);
>  }
> 
> -/* { dg-final { scan-assembler "vstrw.u32"  }  } */
> -
>  void
>  foo1 (uint32x4_t * addr, const int offset, int32x4_t value)
>  {
>    vstrwq_scatter_base_wb (addr, 8, value);
>  }
> 
> -/* { dg-final { scan-assembler "vstrw.u32"  }  } */
> +/* { dg-final { scan-assembler-times "vstrw.u32\tq\[0-9\]+, \\\[q\[0-9\]+,
> #\[0-9\]+\\\]!" 2 } } */
> diff --git
> a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vstrwq_scatter_base_wb_u3
> 2.c
> b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vstrwq_scatter_base_wb_u3
> 2.c
> index
> 620dffa8391eb3685288128fdcd076672babc0d6..912a4590cf54b10a91caee8
> d4ccc24ce59ab7950 100644
> ---
> a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vstrwq_scatter_base_wb_u3
> 2.c
> +++
> b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vstrwq_scatter_base_wb_u3
> 2.c
> @@ -10,12 +10,10 @@ foo (uint32x4_t * addr, uint32x4_t value)
>    vstrwq_scatter_base_wb_u32 (addr, 8, value);
>  }
> 
> -/* { dg-final { scan-assembler "vstrw.u32"  }  } */
> -
>  void
>  foo1 (uint32x4_t * addr, uint32x4_t value)
>  {
>    vstrwq_scatter_base_wb (addr, 8, value);
>  }
> 
> -/* { dg-final { scan-assembler "vstrw.u32"  }  } */
> +/* { dg-final { scan-assembler-times "vstrw.u32\tq\[0-9\]+, \\\[q\[0-9\]+,
> #\[0-9\]+\\\]!" 2 } } */
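
A note on the test changes: the old scan-assembler directives only check that
the mnemonic appears at least once, so they still pass when a second, unwanted
store is emitted. The new scan-assembler-times directives pin the exact count.
The snippet below is a rough illustration of that difference using the two
listings from the description. It uses a plain substring count rather than the
DejaGnu regular expressions, and count_occurrences, asm_without_patch and
asm_with_patch are hypothetical names.

```c
#include <stddef.h>
#include <string.h>

/* Count non-overlapping occurrences of NEEDLE in TEXT.  */
static size_t
count_occurrences (const char *text, const char *needle)
{
  size_t n = 0;
  size_t len = strlen (needle);

  if (len == 0)
    return 0;
  for (const char *p = strstr (text, needle); p != NULL;
       p = strstr (p + len, needle))
    n++;
  return n;
}

/* Body of foo as shown in the "Without this patch" listing.  */
static const char asm_without_patch[] =
  "\tvldrw.32\tq3, [r0]\n"
  "\tvstrw.u32\tq0, [q3, #8]!\n"
  "\tvldr.64\td4, .L3\n"
  "\tvldr.64\td5, .L3+8\n"
  "\tvldrw.32\tq3, [r0]\n"
  "\tvstrw.u32\tq2, [q3, #8]!\n"
  "\tbx\tlr\n";

/* Body of foo as shown in the "With this patch" listing.  */
static const char asm_with_patch[] =
  "\tvldrw.32\tq3, [r0]\n"
  "\tvstrw.u32\tq0, [q3, #8]!\n"
  "\tvstrw.32\tq3, [r0]\n"
  "\tbx\tlr\n";
```

A presence-only check for "vstrw.u32" succeeds on both listings, while a count
tells them apart: two occurrences without the patch, one with it. Each updated
test file has two functions (foo and foo1), which is why the directives expect
a count of 2.
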
