On Wed, 11 Dec 2024 at 17:54, Richard Earnshaw (lists) <richard.earns...@arm.com> wrote: > > On 07/11/2024 09:18, Christophe Lyon wrote: > > This patch series re-implements the store_scatter and load_gather > > intrinsincs using the new framework, similarly to previous series. > > > > A few points worth mentioning: > > > > - unlike other intrinsics, these ones have the predicate after the > > mode in their names, hence the need for patch #1 > > > > - when checking the 'offset' argument of the *_base_* intrinsics, we > > need ranges with negative lower bounds, unlike what we needed so far > > (SVE does not have such negative bounds AFAIK), hence the need for > > patch #5 and the use of 'ss64' instead of 'su64' in signatures. > > > > - because of some pecularities in ACLE expected output wrt data type > > suffix (.16 vs .u16 vs .f16 for instance), I chose to update a few > > tests in patches #12 and #13, and to introduce a dedicated iterator > > in other cases (patch#10, using and fixing an existing iterator > > would have impact on Neon tests). I chose the approach which seemed > > the less invasive, but maybe we should aim at more consistency and > > update ACLE instead? > > > > Thanks for this, it's a nice cleanup. > > This patch series is OK. In fact, I think you should consider further > changes to move things from arm_mve.h inside the pragma as pre-approved, > once stage 1 re-opens, and provided they don't need to go significantly > beyond the type of changes needed here.
Thanks, I'll push this series soon. What about v2 of the latest MVE intrinsics series I posted recently (I sent v1 before the end of the previous stage 1)? https://gcc.gnu.org/pipermail/gcc-patches/2024-December/671219.html > > One thing we should perhaps consider in future (ie not right now) is > whether we really need poly types in the Arm back-end code. Patch 8 > contains: > > > + machine_mode memory_vector_mode (const function_instance &fi) const > override > + { > + poly_uint64 nunits = GET_MODE_NUNITS (fi.vector_mode (0)); > + return arm_mve_data_mode (m_to_int_mode, nunits).require (); > + } > > But since poly types on Arm should degenerate into simple host wide > ints, this feels a bit like overkill. > Indeed, there are various cleanup / improvements to consider. Thanks, Christophe > R. > > > Thanks, > > > > Christophe > > > > Christophe Lyon (15): > > arm: [MVE intrinsics] add mode_after_pred helper in function_shape > > arm: [MVE intrinsics] add store_scatter_offset shape > > arm: [MVE intrinsics] rework vstr?q_scatter_offset > > arm: [MVE intrinsics] rework vstr_scatter_shifted_offset > > arm: [MVE intrinsics] Check immediate is a multiple in a range > > arm: [MVE intrinsics] Add store_scatter_base shape > > arm: [MVE intrinsics] rework vstr scatter_base > > arm: [MVE intrinsics] rework vstr scatter_base_wb > > arm: [MVE intrinsics] add load_ext_gather_offset shape > > arm: [MVE intrinsics] rework vldr gather_offset > > arm: [MVE intrinsics] rework vldr gather_shifted_offset > > arm: [MVE intrinsics] add load_gather_base shape > > arm: [MVE intrinsics] rework vldr gather_base > > arm: [MVE intrinsics] rework vldr gather_base_wb > > arm: [MVE intrinsics] remove useless call_properties implementations. > > > > gcc/config/arm/arm-builtins.cc | 146 - > > gcc/config/arm/arm-mve-builtins-base.cc | 279 +- > > gcc/config/arm/arm-mve-builtins-base.def | 28 + > > gcc/config/arm/arm-mve-builtins-base.h | 18 + > > gcc/config/arm/arm-mve-builtins-shapes.cc | 249 ++ > > gcc/config/arm/arm-mve-builtins-shapes.h | 4 + > > gcc/config/arm/arm-mve-builtins.cc | 99 +- > > gcc/config/arm/arm-mve-builtins.h | 5 + > > gcc/config/arm/arm_mve.h | 3096 ++--------------- > > gcc/config/arm/arm_mve_builtins.def | 122 - > > gcc/config/arm/iterators.md | 63 +- > > gcc/config/arm/mve.md | 2150 ++---------- > > gcc/config/arm/unspecs.md | 78 +- > > .../mve/intrinsics/vldrdq_gather_base_s64.c | 4 +- > > .../mve/intrinsics/vldrdq_gather_base_u64.c | 4 +- > > .../intrinsics/vldrdq_gather_base_wb_s64.c | 4 +- > > .../intrinsics/vldrdq_gather_base_wb_u64.c | 4 +- > > 17 files changed, 1348 insertions(+), 5005 deletions(-) > > >