On 25/10/2024 19:47, Christophe Lyon wrote:
> From: Alfie Richards <alfie.richa...@arm.com>
> 
> Implement the mve vld and vst intrinsics using the MVE builtins framework.
> 
> The main part of the patch is to reimplement to vstr/vldr patterns
> such that we now have much fewer of them:
> - non-truncating stores
> - predicated non-truncating stores
> - truncating stores
> - predicated truncating stores
> - non-extending loads
> - predicated non-extending loads
> - extending loads
> - predicated extending loads
> 
> This enables us to update the implementation of vld1/vst1 and use the
> new vldr/vstr builtins.
> 
> The patch also adds support for the predicated vld1/vst1 versions.
> 
> gcc.target/arm/pr112337.c needs an update, to call the intrinsic
> instead of the builtin, which this patch deletes.
> 
> 2024-09-11  Alfie Richards  <alfie.richa...@arm.com>
>           Christophe Lyon  <christophe.l...@arm.com>
> 
>       gcc/
> 
>       * config/arm/arm-mve-builtins-base.cc (vld1q_impl): Add support
>       for predicated version.
>       (vst1q_impl): Likewise.
>       (vstrq_impl): New class.
>       (vldrq_impl): New class.
>       (vldrbq): New.
>       (vldrhq): New.
>       (vldrwq): New.
>       (vstrbq): New.
>       (vstrhq): New.
>       (vstrwq): New.
>       * config/arm/arm-mve-builtins-base.def (vld1q): Add predicated
>       version.
>       (vldrbq): New.
>       (vldrhq): New.
>       (vldrwq): New.
>       (vst1q): Add predicated version.
>       (vstrbq): New.
>       (vstrhq): New.
>       (vstrwq): New.
>       (vrev32q): Update types to float_16.
>       * config/arm/arm-mve-builtins-base.h (vldrbq): New.
>       (vldrhq): New.
>       (vldrwq): New.
>       (vstrbq): New.
>       (vstrhq): New.
>       (vstrwq): New.
>       * config/arm/arm-mve-builtins-functions.h (memory_vector_mode):
>       Remove conversion of floating point vectors to integer.
>       * config/arm/arm-mve-builtins.cc (TYPES_float16): Change to...
>       (TYPES_float_16): ...this.
>       (TYPES_float_32): New.
>       (float16): Change to...
>       (float_16): ...this.
>       (float_32): New.
>       (preds_z_or_none): New.
>       (function_resolver::check_gp_argument): Add support for _z
>       predicate.
>       * config/arm/arm_mve.h (vstrbq): Remove.
>       (vstrbq_p): Likewise.
>       (vstrhq): Likewise.
>       (vstrhq_p): Likewise.
>       (vstrwq): Likewise.
>       (vstrwq_p): Likewise.
>       (vst1q_p): Likewise.
>       (vld1q_z): Likewise.
>       (vldrbq_s8): Likewise.
>       (vldrbq_u8): Likewise.
>       (vldrbq_s16): Likewise.
>       (vldrbq_u16): Likewise.
>       (vldrbq_s32): Likewise.
>       (vldrbq_u32): Likewise.
>       (vstrbq_s8): Likewise.
>       (vstrbq_s32): Likewise.
>       (vstrbq_s16): Likewise.
>       (vstrbq_u8): Likewise.
>       (vstrbq_u32): Likewise.
>       (vstrbq_u16): Likewise.
>       (vstrbq_p_s8): Likewise.
>       (vstrbq_p_s32): Likewise.
>       (vstrbq_p_s16): Likewise.
>       (vstrbq_p_u8): Likewise.
>       (vstrbq_p_u32): Likewise.
>       (vstrbq_p_u16): Likewise.
>       (vldrbq_z_s16): Likewise.
>       (vldrbq_z_u8): Likewise.
>       (vldrbq_z_s8): Likewise.
>       (vldrbq_z_s32): Likewise.
>       (vldrbq_z_u16): Likewise.
>       (vldrbq_z_u32): Likewise.
>       (vldrhq_s32): Likewise.
>       (vldrhq_s16): Likewise.
>       (vldrhq_u32): Likewise.
>       (vldrhq_u16): Likewise.
>       (vldrhq_z_s32): Likewise.
>       (vldrhq_z_s16): Likewise.
>       (vldrhq_z_u32): Likewise.
>       (vldrhq_z_u16): Likewise.
>       (vldrwq_s32): Likewise.
>       (vldrwq_u32): Likewise.
>       (vldrwq_z_s32): Likewise.
>       (vldrwq_z_u32): Likewise.
>       (vldrhq_f16): Likewise.
>       (vldrhq_z_f16): Likewise.
>       (vldrwq_f32): Likewise.
>       (vldrwq_z_f32): Likewise.
>       (vstrhq_f16): Likewise.
>       (vstrhq_s32): Likewise.
>       (vstrhq_s16): Likewise.
>       (vstrhq_u32): Likewise.
>       (vstrhq_u16): Likewise.
>       (vstrhq_p_f16): Likewise.
>       (vstrhq_p_s32): Likewise.
>       (vstrhq_p_s16): Likewise.
>       (vstrhq_p_u32): Likewise.
>       (vstrhq_p_u16): Likewise.
>       (vstrwq_f32): Likewise.
>       (vstrwq_s32): Likewise.
>       (vstrwq_u32): Likewise.
>       (vstrwq_p_f32): Likewise.
>       (vstrwq_p_s32): Likewise.
>       (vstrwq_p_u32): Likewise.
>       (vst1q_p_u8): Likewise.
>       (vst1q_p_s8): Likewise.
>       (vld1q_z_u8): Likewise.
>       (vld1q_z_s8): Likewise.
>       (vst1q_p_u16): Likewise.
>       (vst1q_p_s16): Likewise.
>       (vld1q_z_u16): Likewise.
>       (vld1q_z_s16): Likewise.
>       (vst1q_p_u32): Likewise.
>       (vst1q_p_s32): Likewise.
>       (vld1q_z_u32): Likewise.
>       (vld1q_z_s32): Likewise.
>       (vld1q_z_f16): Likewise.
>       (vst1q_p_f16): Likewise.
>       (vld1q_z_f32): Likewise.
>       (vst1q_p_f32): Likewise.
>       (__arm_vstrbq_s8): Likewise.
>       (__arm_vstrbq_s32): Likewise.
>       (__arm_vstrbq_s16): Likewise.
>       (__arm_vstrbq_u8): Likewise.
>       (__arm_vstrbq_u32): Likewise.
>       (__arm_vstrbq_u16): Likewise.
>       (__arm_vldrbq_s8): Likewise.
>       (__arm_vldrbq_u8): Likewise.
>       (__arm_vldrbq_s16): Likewise.
>       (__arm_vldrbq_u16): Likewise.
>       (__arm_vldrbq_s32): Likewise.
>       (__arm_vldrbq_u32): Likewise.
>       (__arm_vstrbq_p_s8): Likewise.
>       (__arm_vstrbq_p_s32): Likewise.
>       (__arm_vstrbq_p_s16): Likewise.
>       (__arm_vstrbq_p_u8): Likewise.
>       (__arm_vstrbq_p_u32): Likewise.
>       (__arm_vstrbq_p_u16): Likewise.
>       (__arm_vldrbq_z_s8): Likewise.
>       (__arm_vldrbq_z_s32): Likewise.
>       (__arm_vldrbq_z_s16): Likewise.
>       (__arm_vldrbq_z_u8): Likewise.
>       (__arm_vldrbq_z_u32): Likewise.
>       (__arm_vldrbq_z_u16): Likewise.
>       (__arm_vldrhq_s32): Likewise.
>       (__arm_vldrhq_s16): Likewise.
>       (__arm_vldrhq_u32): Likewise.
>       (__arm_vldrhq_u16): Likewise.
>       (__arm_vldrhq_z_s32): Likewise.
>       (__arm_vldrhq_z_s16): Likewise.
>       (__arm_vldrhq_z_u32): Likewise.
>       (__arm_vldrhq_z_u16): Likewise.
>       (__arm_vldrwq_s32): Likewise.
>       (__arm_vldrwq_u32): Likewise.
>       (__arm_vldrwq_z_s32): Likewise.
>       (__arm_vldrwq_z_u32): Likewise.
>       (__arm_vstrhq_s32): Likewise.
>       (__arm_vstrhq_s16): Likewise.
>       (__arm_vstrhq_u32): Likewise.
>       (__arm_vstrhq_u16): Likewise.
>       (__arm_vstrhq_p_s32): Likewise.
>       (__arm_vstrhq_p_s16): Likewise.
>       (__arm_vstrhq_p_u32): Likewise.
>       (__arm_vstrhq_p_u16): Likewise.
>       (__arm_vstrwq_s32): Likewise.
>       (__arm_vstrwq_u32): Likewise.
>       (__arm_vstrwq_p_s32): Likewise.
>       (__arm_vstrwq_p_u32): Likewise.
>       (__arm_vst1q_p_u8): Likewise.
>       (__arm_vst1q_p_s8): Likewise.
>       (__arm_vld1q_z_u8): Likewise.
>       (__arm_vld1q_z_s8): Likewise.
>       (__arm_vst1q_p_u16): Likewise.
>       (__arm_vst1q_p_s16): Likewise.
>       (__arm_vld1q_z_u16): Likewise.
>       (__arm_vld1q_z_s16): Likewise.
>       (__arm_vst1q_p_u32): Likewise.
>       (__arm_vst1q_p_s32): Likewise.
>       (__arm_vld1q_z_u32): Likewise.
>       (__arm_vld1q_z_s32): Likewise.
>       (__arm_vldrwq_f32): Likewise.
>       (__arm_vldrwq_z_f32): Likewise.
>       (__arm_vldrhq_z_f16): Likewise.
>       (__arm_vldrhq_f16): Likewise.
>       (__arm_vstrwq_p_f32): Likewise.
>       (__arm_vstrwq_f32): Likewise.
>       (__arm_vstrhq_f16): Likewise.
>       (__arm_vstrhq_p_f16): Likewise.
>       (__arm_vld1q_z_f16): Likewise.
>       (__arm_vst1q_p_f16): Likewise.
>       (__arm_vld1q_z_f32): Likewise.
>       (__arm_vst2q_f32): Likewise.
>       (__arm_vst1q_p_f32): Likewise.
>       (__arm_vstrbq): Likewise.
>       (__arm_vstrbq_p): Likewise.
>       (__arm_vstrhq): Likewise.
>       (__arm_vstrhq_p): Likewise.
>       (__arm_vstrwq): Likewise.
>       (__arm_vstrwq_p): Likewise.
>       (__arm_vst1q_p): Likewise.
>       (__arm_vld1q_z): Likewise.
>       * config/arm/arm_mve_builtins.def:
>       (vstrbq_s): Delete.
>       (vstrbq_u): Likewise.
>       (vldrbq_s): Likewise.
>       (vldrbq_u): Likewise.
>       (vstrbq_p_s): Likewise.
>       (vstrbq_p_u): Likewise.
>       (vldrbq_z_s): Likewise.
>       (vldrbq_z_u): Likewise.
>       (vld1q_u): Likewise.
>       (vld1q_s): Likewise.
>       (vldrhq_z_u): Likewise.
>       (vldrhq_u): Likewise.
>       (vldrhq_z_s): Likewise.
>       (vldrhq_s): Likewise.
>       (vld1q_f): Likewise.
>       (vldrhq_f): Likewise.
>       (vldrhq_z_f): Likewise.
>       (vldrwq_f): Likewise.
>       (vldrwq_s): Likewise.
>       (vldrwq_u): Likewise.
>       (vldrwq_z_f): Likewise.
>       (vldrwq_z_s): Likewise.
>       (vldrwq_z_u): Likewise.
>       (vst1q_u): Likewise.
>       (vst1q_s): Likewise.
>       (vstrhq_p_u): Likewise.
>       (vstrhq_u): Likewise.
>       (vstrhq_p_s): Likewise.
>       (vstrhq_s): Likewise.
>       (vst1q_f): Likewise.
>       (vstrhq_f): Likewise.
>       (vstrhq_p_f): Likewise.
>       (vstrwq_f): Likewise.
>       (vstrwq_s): Likewise.
>       (vstrwq_u): Likewise.
>       (vstrwq_p_f): Likewise.
>       (vstrwq_p_s): Likewise.
>       (vstrwq_p_u): Likewise.
>       * config/arm/iterators.md (MVE_w_narrow_TYPE): New iterator.
>       (MVE_w_narrow_type): New iterator.
>       (MVE_wide_n_TYPE): New attribute.
>       (MVE_wide_n_type): New attribute.
>       (MVE_wide_n_sz_elem): New attribute.
>       (MVE_wide_n_VPRED): New attribute.
>       (MVE_elem_ch): New attribute.
>       (supf): Remove VSTRBQ_S, VSTRBQ_U, VLDRBQ_S, VLDRBQ_U, VLD1Q_S,
>       VLD1Q_U, VLDRHQ_S, VLDRHQ_U, VLDRWQ_S, VLDRWQ_U, VST1Q_S, VST1Q_U,
>       VSTRHQ_S, VSTRHQ_U, VSTRWQ_S, VSTRWQ_U.
>       (VSTRBQ, VLDRBQ, VLD1Q, VLDRHQ, VLDRWQ, VST1Q, VSTRHQ, VSTRWQ):
>       Delete.
>       * config/arm/mve.md (mve_vstrbq_<supf><mode>): Remove.
>       (mve_vldrbq_<supf><mode>): Likewise.
>       (mve_vstrbq_p_<supf><mode>): Likewise.
>       (mve_vldrbq_z_<supf><mode>): Likewise.
>       (mve_vldrhq_fv8hf): Likewise.
>       (mve_vldrhq_<supf><mode>): Likewise.
>       (mve_vldrhq_z_fv8hf): Likewise.
>       (mve_vldrhq_z_<supf><mode>): Likewise.
>       (mve_vldrwq_fv4sf): Likewise.
>       (mve_vldrwq_<supf>v4si): Likewise.
>       (mve_vldrwq_z_fv4sf): Likewise.
>       (mve_vldrwq_z_<supf>v4si): Likewise.
>       (@mve_vld1q_f<mode>): Likewise.
>       (@mve_vld1q_<supf><mode>): Likewise.
>       (mve_vstrhq_fv8hf): Likewise.
>       (mve_vstrhq_p_fv8hf): Likewise.
>       (mve_vstrhq_p_<supf><mode>): Likewise.
>       (mve_vstrhq_<supf><mode>): Likewise.
>       (mve_vstrwq_fv4sf): Likewise.
>       (mve_vstrwq_p_fv4sf): Likewise.
>       (mve_vstrwq_p_<supf>v4si): Likewise.
>       (mve_vstrwq_<supf>v4si): Likewise.
>       (@mve_vst1q_f<mode>): Likewise.
>       (@mve_vst1q_<supf><mode>): Likewise.
>       (@mve_vstrq_<mode>): New.
>       (@mve_vstrq_p_<mode>): New.
>       (@mve_vstrq_truncate_<mode>): New.
>       (@mve_vstrq_p_truncate_<mode>): New.
>       (@mve_vldrq_<mode>): New.
>       (@mve_vldrq_z_<mode>): New.
>       (@mve_vldrq_extend_<mode><US>): New.
>       (@mve_vldrq_z_extend_<mode><US>): New.
>       * config/arm/unspecs.md:
>       (VSTRBQ_S): Remove.
>       (VSTRBQ_U): Likewise.
>       (VLDRBQ_S): Likewise.
>       (VLDRBQ_U): Likewise.
>       (VLD1Q_F): Likewise.
>       (VLD1Q_S): Likewise.
>       (VLD1Q_U): Likewise.
>       (VLDRHQ_F): Likewise.
>       (VLDRHQ_U): Likewise.
>       (VLDRHQ_S): Likewise.
>       (VLDRWQ_F): Likewise.
>       (VLDRWQ_S): Likewise.
>       (VLDRWQ_U): Likewise.
>       (VSTRHQ_F): Likewise.
>       (VST1Q_S): Likewise.
>       (VST1Q_U): Likewise.
>       (VSTRHQ_U): Likewise.
>       (VSTRWQ_S): Likewise.
>       (VSTRWQ_U): Likewise.
>       (VSTRWQ_F): Likewise.
>       (VST1Q_F): Likewise.
>       (VLDRQ): New.
>       (VLDRQ_Z): Likewise.
>       (VLDRQ_EXT): Likewise.
>       (VLDRQ_EXT_Z): Likewise.
>       (VSTRQ): Likewise.
>       (VSTRQ_P): Likewise.
>       (VSTRQ_TRUNC): Likewise.
>       (VSTRQ_TRUNC_P): Likewise.
> 
>       gcc/testsuite/
>       * gcc.target/arm/pr112337.c: Call intrinsic instead of builtin.
> ---
>  gcc/config/arm/arm-mve-builtins-base.cc     |  134 ++-
>  gcc/config/arm/arm-mve-builtins-base.def    |   20 +-
>  gcc/config/arm/arm-mve-builtins-base.h      |    6 +
>  gcc/config/arm/arm-mve-builtins-functions.h |   13 -
>  gcc/config/arm/arm-mve-builtins.cc          |   15 +-
>  gcc/config/arm/arm_mve.h                    | 1010 +------------------
>  gcc/config/arm/arm_mve_builtins.def         |   38 -
>  gcc/config/arm/iterators.md                 |   37 +-
>  gcc/config/arm/mve.md                       |  649 ++++--------
>  gcc/config/arm/unspecs.md                   |   29 +-
>  gcc/testsuite/gcc.target/arm/pr112337.c     |    4 +-
>  11 files changed, 373 insertions(+), 1582 deletions(-)
> 

> diff --git a/gcc/config/arm/mve.md b/gcc/config/arm/mve.md
> index 8c69670b161..a75b4a15dc0 100644
> --- a/gcc/config/arm/mve.md
> +++ b/gcc/config/arm/mve.md
[...]
> @@ -6368,6 +6098,7 @@ (define_expand "@arm_mve_reinterpret<mode>"
>    }
>  )
>  
> +
>  ;; Originally expanded by 'predicated_doloop_end'.
>  ;; In the rare situation where the branch is too far, we do also need to
>  ;; revert FPSCR.LTPSIZE back to 0x100 after the last iteration.

You forgot to remove this.

Otherwise this, and the rest of the series are OK.

R.

Reply via email to