> -----Original Message----- > From: Srinath Parvathaneni <srinath.parvathan...@arm.com> > Sent: 07 October 2020 07:14 > To: gcc-patches@gcc.gnu.org > Cc: Kyrylo Tkachov <kyrylo.tkac...@arm.com> > Subject: [PATCH][GCC] arm: Fix wrong code generated for mve scatter store > with writeback intrinsics with -O2 (PR97271). > > Hello, > > This patch fixes (PR97271) the wrong code-gen for mve scatter store with > writeback intrinsics with -O2. > > $cat bug.c > #include "arm_mve.h" > void > foo (uint32x4_t * addr, const int offset, int32x4_t value) > { > vstrwq_scatter_base_wb_s32 (addr, 8, value); > } > > $ arm-none-eabi-gcc bug.c -S -O2 -march=armv8.1-m.main+mve -mfloat- > abi=hard -o - > Without this patch: > ... > foo: > vldrw.32 q3, [r0] > vstrw.u32 q0, [q3, #8]! ---> (A) > vldr.64 d4, .L3 > vldr.64 d5, .L3+8 > vldrw.32 q3, [r0] > vstrw.u32 q2, [q3, #8]! ---> (B) > bx lr > ... > > With this patch: > ... > foo: > vldrw.32 q3, [r0] > vstrw.u32 q0, [q3, #8]! --> (C) > vstrw.32 q3, [r0] > bx lr > ... > > Without this patch 2 vstrw assembly instructions (A and B) are generated for > vstrwq_scatter_base_wb_s32 > intrinsic where as fix generates only one vstrw assembly instruction (C). > > Bootstrapped on arm-none-linux-gnueabihf and regression tested on arm- > none-eabi and found no regressions. > > Ok for master? Ok for GCC-10 branch? Ok for both. Thanks, Kyrill > > Regards, > Srinath. > > gcc/ChangeLog: > > 2020-10-06 Srinath Parvathaneni <srinath.parvathan...@arm.com> > > PR target/97291 > * config/arm/arm-builtins.c (arm_strsbwbs_qualifiers): Modify array. > (arm_strsbwbu_qualifiers): Likewise. > (arm_strsbwbs_p_qualifiers): Likewise. > (arm_strsbwbu_p_qualifiers): Likewise. > * config/arm/arm_mve.h (__arm_vstrdq_scatter_base_wb_s64): > Modify > function definition. > (__arm_vstrdq_scatter_base_wb_u64): Likewise. > (__arm_vstrdq_scatter_base_wb_p_s64): Likewise. > (__arm_vstrdq_scatter_base_wb_p_u64): Likewise. > (__arm_vstrwq_scatter_base_wb_p_s32): Likewise. > (__arm_vstrwq_scatter_base_wb_p_u32): Likewise. > (__arm_vstrwq_scatter_base_wb_s32): Likewise. > (__arm_vstrwq_scatter_base_wb_u32): Likewise. > (__arm_vstrwq_scatter_base_wb_f32): Likewise. > (__arm_vstrwq_scatter_base_wb_p_f32): Likewise. > * config/arm/arm_mve_builtins.def > (vstrwq_scatter_base_wb_add_u): Remove > expansion for the builtin. > (vstrwq_scatter_base_wb_add_s): Likewise. > (vstrwq_scatter_base_wb_add_f): Likewise. > (vstrdq_scatter_base_wb_add_u): Likewise. > (vstrdq_scatter_base_wb_add_s): Likewise. > (vstrwq_scatter_base_wb_p_add_u): Likewise. > (vstrwq_scatter_base_wb_p_add_s): Likewise. > (vstrwq_scatter_base_wb_p_add_f): Likewise. > (vstrdq_scatter_base_wb_p_add_u): Likewise. > (vstrdq_scatter_base_wb_p_add_s): Likewise. > * config/arm/mve.md (mve_vstrwq_scatter_base_wb_<supf>v4si): > Remove > expand. > (mve_vstrwq_scatter_base_wb_add_<supf>v4si): Likewise. > (mve_vstrwq_scatter_base_wb_<supf>v4si_insn): Rename pattern > to ... > (mve_vstrwq_scatter_base_wb_<supf>v4si): This. > (mve_vstrwq_scatter_base_wb_p_<supf>v4si): Remove expand. > (mve_vstrwq_scatter_base_wb_p_add_<supf>v4si): Likewise. > (mve_vstrwq_scatter_base_wb_p_<supf>v4si_insn): Rename pattern > to ... > (mve_vstrwq_scatter_base_wb_p_<supf>v4si): This. > (mve_vstrwq_scatter_base_wb_fv4sf): Remove expand. > (mve_vstrwq_scatter_base_wb_add_fv4sf): Likewise. > (mve_vstrwq_scatter_base_wb_fv4sf_insn): Rename pattern to ... > (mve_vstrwq_scatter_base_wb_fv4sf): This. > (mve_vstrwq_scatter_base_wb_p_fv4sf): Remove expand. > (mve_vstrwq_scatter_base_wb_p_add_fv4sf): Likewise. > (mve_vstrwq_scatter_base_wb_p_fv4sf_insn): Rename pattern to ... > (mve_vstrwq_scatter_base_wb_p_fv4sf): This. > (mve_vstrdq_scatter_base_wb_<supf>v2di): Remove expand. > (mve_vstrdq_scatter_base_wb_add_<supf>v2di): Likewise. > (mve_vstrdq_scatter_base_wb_<supf>v2di_insn): Rename pattern > to ... > (mve_vstrdq_scatter_base_wb_<supf>v2di): This. > (mve_vstrdq_scatter_base_wb_p_<supf>v2di): Remove expand. > (mve_vstrdq_scatter_base_wb_p_add_<supf>v2di): Likewise. > (mve_vstrdq_scatter_base_wb_p_<supf>v2di_insn): Rename pattern > to ... > (mve_vstrdq_scatter_base_wb_p_<supf>v2di): This. > > gcc/testsuite/ChangeLog: > > PR target/97291 > * gcc.target/arm/mve/intrinsics/vstrdq_scatter_base_wb_p_s64.c: > Modify. > * gcc.target/arm/mve/intrinsics/vstrdq_scatter_base_wb_p_u64.c: > Likewise. > * gcc.target/arm/mve/intrinsics/vstrdq_scatter_base_wb_s64.c: > Likewise. > * gcc.target/arm/mve/intrinsics/vstrdq_scatter_base_wb_u64.c: > Likewise. > * gcc.target/arm/mve/intrinsics/vstrwq_scatter_base_wb_f32.c: > Likewise. > * gcc.target/arm/mve/intrinsics/vstrwq_scatter_base_wb_p_f32.c: > Likewise. > * gcc.target/arm/mve/intrinsics/vstrwq_scatter_base_wb_p_s32.c: > Likewise. > * gcc.target/arm/mve/intrinsics/vstrwq_scatter_base_wb_p_u32.c: > Likewise. > * gcc.target/arm/mve/intrinsics/vstrwq_scatter_base_wb_s32.c: > Likewise. > * gcc.target/arm/mve/intrinsics/vstrwq_scatter_base_wb_u32.c: > Likewise. > > > ############### Attachment also inlined for ease of reply > ############### > > > diff --git a/gcc/config/arm/arm-builtins.c b/gcc/config/arm/arm-builtins.c > index > 33e8015b1405033180adc9334bfec8583193777f..db505a4cbf9d19155a4dde > cb40877b5cc7ee95e6 100644 > --- a/gcc/config/arm/arm-builtins.c > +++ b/gcc/config/arm/arm-builtins.c > @@ -811,23 +811,23 @@ > arm_ldrgbwbu_z_qualifiers[SIMD_MAX_BUILTIN_ARGS] > > static enum arm_type_qualifiers > arm_strsbwbs_qualifiers[SIMD_MAX_BUILTIN_ARGS] > - = { qualifier_void, qualifier_unsigned, qualifier_const, qualifier_none}; > + = { qualifier_unsigned, qualifier_unsigned, qualifier_const, > qualifier_none}; > #define STRSBWBS_QUALIFIERS (arm_strsbwbs_qualifiers) > > static enum arm_type_qualifiers > arm_strsbwbu_qualifiers[SIMD_MAX_BUILTIN_ARGS] > - = { qualifier_void, qualifier_unsigned, qualifier_const, > qualifier_unsigned}; > + = { qualifier_unsigned, qualifier_unsigned, qualifier_const, > qualifier_unsigned}; > #define STRSBWBU_QUALIFIERS (arm_strsbwbu_qualifiers) > > static enum arm_type_qualifiers > arm_strsbwbs_p_qualifiers[SIMD_MAX_BUILTIN_ARGS] > - = { qualifier_void, qualifier_unsigned, qualifier_const, > + = { qualifier_unsigned, qualifier_unsigned, qualifier_const, > qualifier_none, qualifier_unsigned}; > #define STRSBWBS_P_QUALIFIERS (arm_strsbwbs_p_qualifiers) > > static enum arm_type_qualifiers > arm_strsbwbu_p_qualifiers[SIMD_MAX_BUILTIN_ARGS] > - = { qualifier_void, qualifier_unsigned, qualifier_const, > + = { qualifier_unsigned, qualifier_unsigned, qualifier_const, > qualifier_unsigned, qualifier_unsigned}; > #define STRSBWBU_P_QUALIFIERS (arm_strsbwbu_p_qualifiers) > > diff --git a/gcc/config/arm/arm_mve.h b/gcc/config/arm/arm_mve.h > index > 99cff41cccbe22f5f6bfe8db513092830885976c..39cac6fd0352a599075ad9eb > 1809a4ef18635037 100644 > --- a/gcc/config/arm/arm_mve.h > +++ b/gcc/config/arm/arm_mve.h > @@ -13993,64 +13993,56 @@ __extension__ extern __inline void > __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) > __arm_vstrdq_scatter_base_wb_s64 (uint64x2_t * __addr, const int > __offset, int64x2_t __value) > { > - __builtin_mve_vstrdq_scatter_base_wb_sv2di (*__addr, __offset, __value); > - __builtin_mve_vstrdq_scatter_base_wb_add_sv2di (*__addr, __offset, > *__addr); > + *__addr = __builtin_mve_vstrdq_scatter_base_wb_sv2di (*__addr, > __offset, __value); > } > > __extension__ extern __inline void > __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) > __arm_vstrdq_scatter_base_wb_u64 (uint64x2_t * __addr, const int > __offset, uint64x2_t __value) > { > - __builtin_mve_vstrdq_scatter_base_wb_uv2di (*__addr, __offset, > __value); > - __builtin_mve_vstrdq_scatter_base_wb_add_uv2di (*__addr, __offset, > *__addr); > + *__addr = __builtin_mve_vstrdq_scatter_base_wb_uv2di (*__addr, > __offset, __value); > } > > __extension__ extern __inline void > __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) > __arm_vstrdq_scatter_base_wb_p_s64 (uint64x2_t * __addr, const int > __offset, int64x2_t __value, mve_pred16_t __p) > { > - __builtin_mve_vstrdq_scatter_base_wb_p_sv2di (*__addr, __offset, > __value, __p); > - __builtin_mve_vstrdq_scatter_base_wb_p_add_sv2di (*__addr, __offset, > *__addr, __p); > + *__addr = __builtin_mve_vstrdq_scatter_base_wb_p_sv2di (*__addr, > __offset, __value, __p); > } > > __extension__ extern __inline void > __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) > __arm_vstrdq_scatter_base_wb_p_u64 (uint64x2_t * __addr, const int > __offset, uint64x2_t __value, mve_pred16_t __p) > { > - __builtin_mve_vstrdq_scatter_base_wb_p_uv2di (*__addr, __offset, > __value, __p); > - __builtin_mve_vstrdq_scatter_base_wb_p_add_uv2di (*__addr, __offset, > *__addr, __p); > + *__addr = __builtin_mve_vstrdq_scatter_base_wb_p_uv2di (*__addr, > __offset, __value, __p); > } > > __extension__ extern __inline void > __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) > __arm_vstrwq_scatter_base_wb_p_s32 (uint32x4_t * __addr, const int > __offset, int32x4_t __value, mve_pred16_t __p) > { > - __builtin_mve_vstrwq_scatter_base_wb_p_sv4si (*__addr, __offset, > __value, __p); > - __builtin_mve_vstrwq_scatter_base_wb_p_add_sv4si (*__addr, __offset, > *__addr, __p); > + *__addr = __builtin_mve_vstrwq_scatter_base_wb_p_sv4si (*__addr, > __offset, __value, __p); > } > > __extension__ extern __inline void > __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) > __arm_vstrwq_scatter_base_wb_p_u32 (uint32x4_t * __addr, const int > __offset, uint32x4_t __value, mve_pred16_t __p) > { > - __builtin_mve_vstrwq_scatter_base_wb_p_uv4si (*__addr, __offset, > __value, __p); > - __builtin_mve_vstrwq_scatter_base_wb_p_add_uv4si (*__addr, __offset, > *__addr, __p); > + *__addr = __builtin_mve_vstrwq_scatter_base_wb_p_uv4si (*__addr, > __offset, __value, __p); > } > > __extension__ extern __inline void > __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) > __arm_vstrwq_scatter_base_wb_s32 (uint32x4_t * __addr, const int > __offset, int32x4_t __value) > { > - __builtin_mve_vstrwq_scatter_base_wb_sv4si (*__addr, __offset, > __value); > - __builtin_mve_vstrwq_scatter_base_wb_add_sv4si (*__addr, __offset, > *__addr); > + *__addr = __builtin_mve_vstrwq_scatter_base_wb_sv4si (*__addr, > __offset, __value); > } > > __extension__ extern __inline void > __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) > __arm_vstrwq_scatter_base_wb_u32 (uint32x4_t * __addr, const int > __offset, uint32x4_t __value) > { > - __builtin_mve_vstrwq_scatter_base_wb_uv4si (*__addr, __offset, > __value); > - __builtin_mve_vstrwq_scatter_base_wb_add_uv4si (*__addr, __offset, > *__addr); > + *__addr = __builtin_mve_vstrwq_scatter_base_wb_uv4si (*__addr, > __offset, __value); > } > > __extension__ extern __inline uint8x16_t > @@ -19158,16 +19150,14 @@ __extension__ extern __inline void > __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) > __arm_vstrwq_scatter_base_wb_f32 (uint32x4_t * __addr, const int > __offset, float32x4_t __value) > { > - __builtin_mve_vstrwq_scatter_base_wb_fv4sf (*__addr, __offset, > __value); > - __builtin_mve_vstrwq_scatter_base_wb_add_fv4sf (*__addr, __offset, > *__addr); > + *__addr = __builtin_mve_vstrwq_scatter_base_wb_fv4sf (*__addr, > __offset, __value); > } > > __extension__ extern __inline void > __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) > __arm_vstrwq_scatter_base_wb_p_f32 (uint32x4_t * __addr, const int > __offset, float32x4_t __value, mve_pred16_t __p) > { > - __builtin_mve_vstrwq_scatter_base_wb_p_fv4sf (*__addr, __offset, > __value, __p); > - __builtin_mve_vstrwq_scatter_base_wb_p_add_fv4sf (*__addr, __offset, > *__addr, __p); > + *__addr = __builtin_mve_vstrwq_scatter_base_wb_p_fv4sf (*__addr, > __offset, __value, __p); > } > > __extension__ extern __inline float16x8_t > diff --git a/gcc/config/arm/arm_mve_builtins.def > b/gcc/config/arm/arm_mve_builtins.def > index > 753e40a951d071c1ab77476a1cc4779e91689178..55d426fbd14da6a536209f > 04f7f38de21c68b720 100644 > --- a/gcc/config/arm/arm_mve_builtins.def > +++ b/gcc/config/arm/arm_mve_builtins.def > @@ -828,19 +828,9 @@ VAR3 > (QUADOP_UNONE_UNONE_UNONE_IMM_UNONE, vidupq_m_n_u, v16qi, > v8hi, v4si) > VAR3 (TERNOP_UNONE_UNONE_UNONE_IMM, vdwdupq_n_u, v16qi, v4si, > v8hi) > VAR3 (TERNOP_UNONE_UNONE_UNONE_IMM, viwdupq_n_u, v16qi, v4si, > v8hi) > VAR1 (STRSBWBU, vstrwq_scatter_base_wb_u, v4si) > -VAR1 (STRSBWBU, vstrwq_scatter_base_wb_add_u, v4si) > -VAR1 (STRSBWBU, vstrwq_scatter_base_wb_add_s, v4si) > -VAR1 (STRSBWBU, vstrwq_scatter_base_wb_add_f, v4sf) > VAR1 (STRSBWBU, vstrdq_scatter_base_wb_u, v2di) > -VAR1 (STRSBWBU, vstrdq_scatter_base_wb_add_u, v2di) > -VAR1 (STRSBWBU, vstrdq_scatter_base_wb_add_s, v2di) > VAR1 (STRSBWBU_P, vstrwq_scatter_base_wb_p_u, v4si) > -VAR1 (STRSBWBU_P, vstrwq_scatter_base_wb_p_add_u, v4si) > -VAR1 (STRSBWBU_P, vstrwq_scatter_base_wb_p_add_s, v4si) > -VAR1 (STRSBWBU_P, vstrwq_scatter_base_wb_p_add_f, v4sf) > VAR1 (STRSBWBU_P, vstrdq_scatter_base_wb_p_u, v2di) > -VAR1 (STRSBWBU_P, vstrdq_scatter_base_wb_p_add_u, v2di) > -VAR1 (STRSBWBU_P, vstrdq_scatter_base_wb_p_add_s, v2di) > VAR1 (STRSBWBS, vstrwq_scatter_base_wb_s, v4si) > VAR1 (STRSBWBS, vstrwq_scatter_base_wb_f, v4sf) > VAR1 (STRSBWBS, vstrdq_scatter_base_wb_s, v2di) > diff --git a/gcc/config/arm/mve.md b/gcc/config/arm/mve.md > index > 465b39a51b3a258295ed764f0e742932e5d59225..dea479b04cbe0d63878ac7 > 4f321bcd5f9263175c 100644 > --- a/gcc/config/arm/mve.md > +++ b/gcc/config/arm/mve.md > @@ -10315,38 +10315,10 @@ > [(set_attr "type" "mve_move") > (set_attr "length""8")]) > > -(define_expand "mve_vstrwq_scatter_base_wb_<supf>v4si" > - [(match_operand:V4SI 0 "s_register_operand" "=w") > - (match_operand:SI 1 "mve_vldrd_immediate" "Ri") > - (match_operand:V4SI 2 "s_register_operand" "w") > - (unspec:V4SI [(const_int 0)] VSTRWSBWBQ)] > - "TARGET_HAVE_MVE" > -{ > - rtx ignore_wb = gen_reg_rtx (V4SImode); > - emit_insn ( > - gen_mve_vstrwq_scatter_base_wb_<supf>v4si_insn (ignore_wb, > operands[0], > - operands[1], operands[2])); > - DONE; > -}) > - > -(define_expand "mve_vstrwq_scatter_base_wb_add_<supf>v4si" > - [(match_operand:V4SI 0 "s_register_operand" "=w") > - (match_operand:SI 1 "mve_vldrd_immediate" "Ri") > - (match_operand:V4SI 2 "s_register_operand" "0") > - (unspec:V4SI [(const_int 0)] VSTRWSBWBQ)] > - "TARGET_HAVE_MVE" > -{ > - rtx ignore_vec = gen_reg_rtx (V4SImode); > - emit_insn ( > - gen_mve_vstrwq_scatter_base_wb_<supf>v4si_insn (operands[0], > operands[2], > - operands[1], ignore_vec)); > - DONE; > -}) > - > ;; > -;; [vstrwq_scatter_base_wb_s vstrdq_scatter_base_wb_u] > +;; [vstrwq_scatter_base_wb_s vstrwq_scatter_base_wb_u] > ;; > -(define_insn "mve_vstrwq_scatter_base_wb_<supf>v4si_insn" > +(define_insn "mve_vstrwq_scatter_base_wb_<supf>v4si" > [(set (mem:BLK (scratch)) > (unspec:BLK > [(match_operand:V4SI 1 "s_register_operand" "0") > @@ -10368,42 +10340,10 @@ > } > [(set_attr "length" "4")]) > > -(define_expand "mve_vstrwq_scatter_base_wb_p_<supf>v4si" > - [(match_operand:V4SI 0 "s_register_operand" "=w") > - (match_operand:SI 1 "mve_vldrd_immediate" "Ri") > - (match_operand:V4SI 2 "s_register_operand" "w") > - (match_operand:HI 3 "vpr_register_operand") > - (unspec:V4SI [(const_int 0)] VSTRWSBWBQ)] > - "TARGET_HAVE_MVE" > -{ > - rtx ignore_wb = gen_reg_rtx (V4SImode); > - emit_insn ( > - gen_mve_vstrwq_scatter_base_wb_p_<supf>v4si_insn (ignore_wb, > operands[0], > - operands[1], operands[2], > - operands[3])); > - DONE; > -}) > - > -(define_expand "mve_vstrwq_scatter_base_wb_p_add_<supf>v4si" > - [(match_operand:V4SI 0 "s_register_operand" "=w") > - (match_operand:SI 1 "mve_vldrd_immediate" "Ri") > - (match_operand:V4SI 2 "s_register_operand" "0") > - (match_operand:HI 3 "vpr_register_operand") > - (unspec:V4SI [(const_int 0)] VSTRWSBWBQ)] > - "TARGET_HAVE_MVE" > -{ > - rtx ignore_vec = gen_reg_rtx (V4SImode); > - emit_insn ( > - gen_mve_vstrwq_scatter_base_wb_p_<supf>v4si_insn (operands[0], > operands[2], > - operands[1], ignore_vec, > - operands[3])); > - DONE; > -}) > - > ;; > ;; [vstrwq_scatter_base_wb_p_s vstrwq_scatter_base_wb_p_u] > ;; > -(define_insn "mve_vstrwq_scatter_base_wb_p_<supf>v4si_insn" > +(define_insn "mve_vstrwq_scatter_base_wb_p_<supf>v4si" > [(set (mem:BLK (scratch)) > (unspec:BLK > [(match_operand:V4SI 1 "s_register_operand" "0") > @@ -10426,38 +10366,10 @@ > } > [(set_attr "length" "8")]) > > -(define_expand "mve_vstrwq_scatter_base_wb_fv4sf" > - [(match_operand:V4SI 0 "s_register_operand" "=w") > - (match_operand:SI 1 "mve_vldrd_immediate" "Ri") > - (match_operand:V4SF 2 "s_register_operand" "w") > - (unspec:V4SI [(const_int 0)] VSTRWQSBWB_F)] > - "TARGET_HAVE_MVE && TARGET_HAVE_MVE_FLOAT" > -{ > - rtx ignore_wb = gen_reg_rtx (V4SImode); > - emit_insn ( > - gen_mve_vstrwq_scatter_base_wb_fv4sf_insn (ignore_wb,operands[0], > - operands[1], operands[2])); > - DONE; > -}) > - > -(define_expand "mve_vstrwq_scatter_base_wb_add_fv4sf" > - [(match_operand:V4SI 0 "s_register_operand" "=w") > - (match_operand:SI 1 "mve_vldrd_immediate" "Ri") > - (match_operand:V4SI 2 "s_register_operand" "0") > - (unspec:V4SI [(const_int 0)] VSTRWQSBWB_F)] > - "TARGET_HAVE_MVE && TARGET_HAVE_MVE_FLOAT" > -{ > - rtx ignore_vec = gen_reg_rtx (V4SFmode); > - emit_insn ( > - gen_mve_vstrwq_scatter_base_wb_fv4sf_insn (operands[0], operands[2], > - operands[1], ignore_vec)); > - DONE; > -}) > - > ;; > ;; [vstrwq_scatter_base_wb_f] > ;; > -(define_insn "mve_vstrwq_scatter_base_wb_fv4sf_insn" > +(define_insn "mve_vstrwq_scatter_base_wb_fv4sf" > [(set (mem:BLK (scratch)) > (unspec:BLK > [(match_operand:V4SI 1 "s_register_operand" "0") > @@ -10479,42 +10391,10 @@ > } > [(set_attr "length" "4")]) > > -(define_expand "mve_vstrwq_scatter_base_wb_p_fv4sf" > - [(match_operand:V4SI 0 "s_register_operand" "=w") > - (match_operand:SI 1 "mve_vldrd_immediate" "Ri") > - (match_operand:V4SF 2 "s_register_operand" "w") > - (match_operand:HI 3 "vpr_register_operand") > - (unspec:V4SI [(const_int 0)] VSTRWQSBWB_F)] > - "TARGET_HAVE_MVE && TARGET_HAVE_MVE_FLOAT" > -{ > - rtx ignore_wb = gen_reg_rtx (V4SImode); > - emit_insn ( > - gen_mve_vstrwq_scatter_base_wb_p_fv4sf_insn (ignore_wb, operands[0], > - operands[1], operands[2], > - operands[3])); > - DONE; > -}) > - > -(define_expand "mve_vstrwq_scatter_base_wb_p_add_fv4sf" > - [(match_operand:V4SI 0 "s_register_operand" "=w") > - (match_operand:SI 1 "mve_vldrd_immediate" "Ri") > - (match_operand:V4SI 2 "s_register_operand" "0") > - (match_operand:HI 3 "vpr_register_operand") > - (unspec:V4SI [(const_int 0)] VSTRWQSBWB_F)] > - "TARGET_HAVE_MVE && TARGET_HAVE_MVE_FLOAT" > -{ > - rtx ignore_vec = gen_reg_rtx (V4SFmode); > - emit_insn ( > - gen_mve_vstrwq_scatter_base_wb_p_fv4sf_insn (operands[0], > operands[2], > - operands[1], ignore_vec, > - operands[3])); > - DONE; > -}) > - > ;; > ;; [vstrwq_scatter_base_wb_p_f] > ;; > -(define_insn "mve_vstrwq_scatter_base_wb_p_fv4sf_insn" > +(define_insn "mve_vstrwq_scatter_base_wb_p_fv4sf" > [(set (mem:BLK (scratch)) > (unspec:BLK > [(match_operand:V4SI 1 "s_register_operand" "0") > @@ -10537,38 +10417,10 @@ > } > [(set_attr "length" "8")]) > > -(define_expand "mve_vstrdq_scatter_base_wb_<supf>v2di" > - [(match_operand:V2DI 0 "s_register_operand" "=w") > - (match_operand:SI 1 "mve_vldrd_immediate" "Ri") > - (match_operand:V2DI 2 "s_register_operand" "w") > - (unspec:V2DI [(const_int 0)] VSTRDSBWBQ)] > - "TARGET_HAVE_MVE" > -{ > - rtx ignore_wb = gen_reg_rtx (V2DImode); > - emit_insn ( > - gen_mve_vstrdq_scatter_base_wb_<supf>v2di_insn (ignore_wb, > operands[0], > - operands[1], operands[2])); > - DONE; > -}) > - > -(define_expand "mve_vstrdq_scatter_base_wb_add_<supf>v2di" > - [(match_operand:V2DI 0 "s_register_operand" "=w") > - (match_operand:SI 1 "mve_vldrd_immediate" "Ri") > - (match_operand:V2DI 2 "s_register_operand" "0") > - (unspec:V2DI [(const_int 0)] VSTRDSBWBQ)] > - "TARGET_HAVE_MVE" > -{ > - rtx ignore_vec = gen_reg_rtx (V2DImode); > - emit_insn ( > - gen_mve_vstrdq_scatter_base_wb_<supf>v2di_insn (operands[0], > operands[2], > - operands[1], ignore_vec)); > - DONE; > -}) > - > ;; > ;; [vstrdq_scatter_base_wb_s vstrdq_scatter_base_wb_u] > ;; > -(define_insn "mve_vstrdq_scatter_base_wb_<supf>v2di_insn" > +(define_insn "mve_vstrdq_scatter_base_wb_<supf>v2di" > [(set (mem:BLK (scratch)) > (unspec:BLK > [(match_operand:V2DI 1 "s_register_operand" "0") > @@ -10590,42 +10442,10 @@ > } > [(set_attr "length" "4")]) > > -(define_expand "mve_vstrdq_scatter_base_wb_p_<supf>v2di" > - [(match_operand:V2DI 0 "s_register_operand" "=w") > - (match_operand:SI 1 "mve_vldrd_immediate" "Ri") > - (match_operand:V2DI 2 "s_register_operand" "w") > - (match_operand:HI 3 "vpr_register_operand") > - (unspec:V2DI [(const_int 0)] VSTRDSBWBQ)] > - "TARGET_HAVE_MVE" > -{ > - rtx ignore_wb = gen_reg_rtx (V2DImode); > - emit_insn ( > - gen_mve_vstrdq_scatter_base_wb_p_<supf>v2di_insn (ignore_wb, > operands[0], > - operands[1], operands[2], > - operands[3])); > - DONE; > -}) > - > -(define_expand "mve_vstrdq_scatter_base_wb_p_add_<supf>v2di" > - [(match_operand:V2DI 0 "s_register_operand" "=w") > - (match_operand:SI 1 "mve_vldrd_immediate" "Ri") > - (match_operand:V2DI 2 "s_register_operand" "0") > - (match_operand:HI 3 "vpr_register_operand") > - (unspec:V2DI [(const_int 0)] VSTRDSBWBQ)] > - "TARGET_HAVE_MVE" > -{ > - rtx ignore_vec = gen_reg_rtx (V2DImode); > - emit_insn ( > - gen_mve_vstrdq_scatter_base_wb_p_<supf>v2di_insn (operands[0], > operands[2], > - operands[1], ignore_vec, > - operands[3])); > - DONE; > -}) > - > ;; > ;; [vstrdq_scatter_base_wb_p_s vstrdq_scatter_base_wb_p_u] > ;; > -(define_insn "mve_vstrdq_scatter_base_wb_p_<supf>v2di_insn" > +(define_insn "mve_vstrdq_scatter_base_wb_p_<supf>v2di" > [(set (mem:BLK (scratch)) > (unspec:BLK > [(match_operand:V2DI 1 "s_register_operand" "0") > @@ -10643,7 +10463,7 @@ > ops[0] = operands[1]; > ops[1] = operands[2]; > ops[2] = operands[3]; > - output_asm_insn ("vpst\;\tvstrdt.u64\t%q2, [%q0, %1]!",ops); > + output_asm_insn ("vpst;vstrdt.u64\t%q2, [%q0, %1]!",ops); > return ""; > } > [(set_attr "length" "8")]) > diff --git > a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vstrdq_scatter_base_wb_p_s > 64.c > b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vstrdq_scatter_base_wb_p_s > 64.c > index > 6570d4abd23ecfaf9d279760814fddeb848712f5..319188b706fb737aef49dfd > 3a6e64545a63f2087 100644 > --- > a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vstrdq_scatter_base_wb_p_s > 64.c > +++ > b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vstrdq_scatter_base_wb_p_s > 64.c > @@ -10,12 +10,10 @@ foo (uint64x2_t * addr, const int offset, int64x2_t > value, mve_pred16_t p) > vstrdq_scatter_base_wb_p_s64 (addr, 8, value, p); > } > > -/* { dg-final { scan-assembler "vstrdt.u64" } } */ > - > void > foo1 (uint64x2_t * addr, const int offset, int64x2_t value, mve_pred16_t p) > { > vstrdq_scatter_base_wb_p (addr, 8, value, p); > } > > -/* { dg-final { scan-assembler "vstrdt.u64" } } */ > +/* { dg-final { scan-assembler-times "vstrdt.u64\tq\[0-9\]+, \\\[q\[0-9\]+, > #\[0-9\]+\\\]!" 2 } } */ > diff --git > a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vstrdq_scatter_base_wb_p_u > 64.c > b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vstrdq_scatter_base_wb_p_u > 64.c > index > 8444a3acd4c090f182bae7a8e144715f0dd56ba7..940b5421c840a1841d7e01 > 8abeef2342ab653f1b 100644 > --- > a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vstrdq_scatter_base_wb_p_u > 64.c > +++ > b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vstrdq_scatter_base_wb_p_u > 64.c > @@ -10,12 +10,10 @@ foo (uint64x2_t * addr, const int offset, uint64x2_t > value, mve_pred16_t p) > vstrdq_scatter_base_wb_p_u64 (addr, 8, value, p); > } > > -/* { dg-final { scan-assembler "vstrdt.u64" } } */ > - > void > foo1 (uint64x2_t * addr, const int offset, uint64x2_t value, mve_pred16_t p) > { > vstrdq_scatter_base_wb_p (addr, 8, value, p); > } > > -/* { dg-final { scan-assembler "vstrdt.u64" } } */ > +/* { dg-final { scan-assembler-times "vstrdt.u64\tq\[0-9\]+, \\\[q\[0-9\]+, > #\[0-9\]+\\\]!" 2 } } */ > diff --git > a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vstrdq_scatter_base_wb_s64. > c > b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vstrdq_scatter_base_wb_s64. > c > index > e0ec283d10068da7db0847e11611adbcb386cfbc..33926d5c9e2e85188b222a > 9c903a966c52195fa5 100644 > --- > a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vstrdq_scatter_base_wb_s64. > c > +++ > b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vstrdq_scatter_base_wb_s64. > c > @@ -10,12 +10,10 @@ foo (uint64x2_t * addr, const int offset, int64x2_t > value) > vstrdq_scatter_base_wb_s64 (addr, 8, value); > } > > -/* { dg-final { scan-assembler "vstrd.u64" } } */ > - > void > foo1 (uint64x2_t * addr, const int offset, int64x2_t value) > { > vstrdq_scatter_base_wb (addr, 8, value); > } > > -/* { dg-final { scan-assembler "vstrd.u64" } } */ > +/* { dg-final { scan-assembler-times "vstrd.u64\tq\[0-9\]+, \\\[q\[0-9\]+, > #\[0-9\]+\\\]!" 2 } } */ > diff --git > a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vstrdq_scatter_base_wb_u64 > .c > b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vstrdq_scatter_base_wb_u64 > .c > index > fe41d6b5c74514cdec25a69e9f44f3df3493342b..b7ffcf9b5dd13db0f4785c3e > e55231ec2b75d240 100644 > --- > a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vstrdq_scatter_base_wb_u64 > .c > +++ > b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vstrdq_scatter_base_wb_u64 > .c > @@ -10,12 +10,10 @@ foo (uint64x2_t * addr, const int offset, uint64x2_t > value) > vstrdq_scatter_base_wb_u64 (addr, 8, value); > } > > -/* { dg-final { scan-assembler "vstrd.u64" } } */ > - > void > foo1 (uint64x2_t * addr, const int offset, uint64x2_t value) > { > vstrdq_scatter_base_wb (addr, 8, value); > } > > -/* { dg-final { scan-assembler "vstrd.u64" } } */ > +/* { dg-final { scan-assembler-times "vstrd.u64\tq\[0-9\]+, \\\[q\[0-9\]+, > #\[0-9\]+\\\]!" 2 } } */ > diff --git > a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vstrwq_scatter_base_wb_f32 > .c > b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vstrwq_scatter_base_wb_f32 > .c > index > f4ceabb8680c5044dae38bea3af351d5cd5d6085..b2cc6e555aeb0ce5415cefe > 2970b8d7a711661f3 100644 > --- > a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vstrwq_scatter_base_wb_f32 > .c > +++ > b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vstrwq_scatter_base_wb_f32 > .c > @@ -10,12 +10,10 @@ foo (uint32x4_t * addr, const int offset, float32x4_t > value) > vstrwq_scatter_base_wb_f32 (addr, 8, value); > } > > -/* { dg-final { scan-assembler "vstrw.u32" } } */ > - > void > foo1 (uint32x4_t * addr, const int offset, float32x4_t value) > { > vstrwq_scatter_base_wb (addr, 8, value); > } > > -/* { dg-final { scan-assembler "vstrw.u32" } } */ > +/* { dg-final { scan-assembler-times "vstrw.u32\tq\[0-9\]+, \\\[q\[0-9\]+, > #\[0-9\]+\\\]!" 2 } } */ > diff --git > a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vstrwq_scatter_base_wb_p_f > 32.c > b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vstrwq_scatter_base_wb_p_f > 32.c > index > cb2eb685139fe0db53281136e1ba235988bc731a..4befd49d7b92b0fc4de498 > 8db91f9eec7b3d33ec 100644 > --- > a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vstrwq_scatter_base_wb_p_f > 32.c > +++ > b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vstrwq_scatter_base_wb_p_f > 32.c > @@ -10,12 +10,10 @@ foo (uint32x4_t * addr, const int offset, float32x4_t > value, mve_pred16_t p) > vstrwq_scatter_base_wb_p_f32 (addr, 8, value, p); > } > > -/* { dg-final { scan-assembler "vstrwt.u32" } } */ > - > void > foo1 (uint32x4_t * addr, const int offset, float32x4_t value, mve_pred16_t p) > { > vstrwq_scatter_base_wb_p (addr, 8, value, p); > } > > -/* { dg-final { scan-assembler "vstrwt.u32" } } */ > +/* { dg-final { scan-assembler-times "vstrwt.u32\tq\[0-9\]+, \\\[q\[0-9\]+, > #\[0-9\]+\\\]!" 2 } } */ > diff --git > a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vstrwq_scatter_base_wb_p_s > 32.c > b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vstrwq_scatter_base_wb_p_s > 32.c > index > d973c021ba372ef31b493cca61655134def723d8..dfb1827c4f08232b63ceccf > 89b2604fec2890a3f 100644 > --- > a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vstrwq_scatter_base_wb_p_s > 32.c > +++ > b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vstrwq_scatter_base_wb_p_s > 32.c > @@ -10,12 +10,10 @@ foo (uint32x4_t * addr, const int offset, int32x4_t > value, mve_pred16_t p) > vstrwq_scatter_base_wb_p_s32 (addr, 8, value, p); > } > > -/* { dg-final { scan-assembler "vstrwt.u32" } } */ > - > void > foo1 (uint32x4_t * addr, const int offset, int32x4_t value, mve_pred16_t p) > { > vstrwq_scatter_base_wb_p (addr, 8, value, p); > } > > -/* { dg-final { scan-assembler "vstrwt.u32" } } */ > +/* { dg-final { scan-assembler-times "vstrwt.u32\tq\[0-9\]+, \\\[q\[0-9\]+, > #\[0-9\]+\\\]!" 2 } } */ > diff --git > a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vstrwq_scatter_base_wb_p_ > u32.c > b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vstrwq_scatter_base_wb_p_ > u32.c > index > c0f0964c657711b45b5ecdda6386ee3656bb221c..4eb78c600be9749fca86e2 > 89c67e388f78753532 100644 > --- > a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vstrwq_scatter_base_wb_p_ > u32.c > +++ > b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vstrwq_scatter_base_wb_p_ > u32.c > @@ -10,12 +10,10 @@ foo (uint32x4_t * addr, const int offset, uint32x4_t > value, mve_pred16_t p) > vstrwq_scatter_base_wb_p_u32 (addr, 8, value, p); > } > > -/* { dg-final { scan-assembler "vstrwt.u32" } } */ > - > void > foo1 (uint32x4_t * addr, const int offset, uint32x4_t value, mve_pred16_t p) > { > vstrwq_scatter_base_wb_p (addr, 8, value, p); > } > > -/* { dg-final { scan-assembler "vstrwt.u32" } } */ > +/* { dg-final { scan-assembler-times "vstrwt.u32\tq\[0-9\]+, \\\[q\[0-9\]+, > #\[0-9\]+\\\]!" 2 } } */ > diff --git > a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vstrwq_scatter_base_wb_s32 > .c > b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vstrwq_scatter_base_wb_s32 > .c > index > 6ef095526e5eb21b3ed5d27c85566dadad07966e..618dbaf5aa69421ee80aca > 62904ce915306c54fd 100644 > --- > a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vstrwq_scatter_base_wb_s32 > .c > +++ > b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vstrwq_scatter_base_wb_s32 > .c > @@ -10,12 +10,10 @@ foo (uint32x4_t * addr, const int offset, int32x4_t > value) > vstrwq_scatter_base_wb_s32 (addr, 8, value); > } > > -/* { dg-final { scan-assembler "vstrw.u32" } } */ > - > void > foo1 (uint32x4_t * addr, const int offset, int32x4_t value) > { > vstrwq_scatter_base_wb (addr, 8, value); > } > > -/* { dg-final { scan-assembler "vstrw.u32" } } */ > +/* { dg-final { scan-assembler-times "vstrw.u32\tq\[0-9\]+, \\\[q\[0-9\]+, > #\[0-9\]+\\\]!" 2 } } */ > diff --git > a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vstrwq_scatter_base_wb_u3 > 2.c > b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vstrwq_scatter_base_wb_u3 > 2.c > index > 620dffa8391eb3685288128fdcd076672babc0d6..912a4590cf54b10a91caee8 > d4ccc24ce59ab7950 100644 > --- > a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vstrwq_scatter_base_wb_u3 > 2.c > +++ > b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vstrwq_scatter_base_wb_u3 > 2.c > @@ -10,12 +10,10 @@ foo (uint32x4_t * addr, uint32x4_t value) > vstrwq_scatter_base_wb_u32 (addr, 8, value); > } > > -/* { dg-final { scan-assembler "vstrw.u32" } } */ > - > void > foo1 (uint32x4_t * addr, uint32x4_t value) > { > vstrwq_scatter_base_wb (addr, 8, value); > } > > -/* { dg-final { scan-assembler "vstrw.u32" } } */ > +/* { dg-final { scan-assembler-times "vstrw.u32\tq\[0-9\]+, \\\[q\[0-9\]+, > #\[0-9\]+\\\]!" 2 } } */
RE: [PATCH][GCC] arm: Fix wrong code generated for mve scatter store with writeback intrinsics with -O2 (PR97271).
Kyrylo Tkachov via Gcc-patches Fri, 16 Oct 2020 03:21:24 -0700
- [PATCH][GCC] arm: Fix wrong code gene... Srinath Parvathaneni via Gcc-patches
- RE: [PATCH][GCC] arm: Fix wrong ... Kyrylo Tkachov via Gcc-patches