This series adds mf8 variants of what I'll loosely call the existing "data movement" intrinsics, including the recent FEAT_LUT ones. I think this completes the FP8 intrinsic definitions.

Sorry that the series is so late. We did make a real effort to get it done by the end of stage 1, but there were some unexpected hitches. The current half-complete state of trunk means that we either need to apply this patch or disable the existing __ARM_FEATURE_FP8 definition.

Tested on aarch64-linux-gnu and aarch64_be-elf (including ILP32). I also tested advsimd-intrinsics.exp on arm-eabi to make sure that the mf8 stuff was properly protected. I'll commit this when the prerequisite x86 changes in:

  https://gcc.gnu.org/pipermail/gcc-patches/2024-December/671924.html

are approved, unless there are any comments before then.

The main patch was co-authored by Saurabh.

Thanks,
Richard

Richard Sandiford (4):
  aarch64: Macroise simd_type definitions
  aarch64: Use mf8 instead of f8 in builtin definitions
  aarch64: Add missing makefile dependency
  aarch64: Add mf8 data movement intrinsics

 gcc/config/aarch64/aarch64-builtins.cc | 934 +++++++--
 gcc/config/aarch64/aarch64-builtins.h | 2 +
 gcc/config/aarch64/aarch64-protos.h | 2 +
 .../aarch64/aarch64-simd-pragma-builtins.def | 288 ++-
 gcc/config/aarch64/aarch64-simd.md | 60 +-
 gcc/config/aarch64/aarch64.cc | 2 +-
 gcc/config/aarch64/aarch64.md | 16 +
 gcc/config/aarch64/iterators.md | 1 +
 gcc/config/aarch64/t-aarch64 | 1 +
 .../aarch64/advsimd-intrinsics/arm-neon-ref.h | 40 +
 .../advsimd-intrinsics/compute-ref-data.h | 29 +
 .../aarch64/advsimd-intrinsics/vbsl.c | 20 +
 .../aarch64/advsimd-intrinsics/vcombine.c | 10 +
 .../aarch64/advsimd-intrinsics/vcreate.c | 9 +
 .../aarch64/advsimd-intrinsics/vdup-vmov.c | 34 +
 .../aarch64/advsimd-intrinsics/vdup_lane.c | 26 +
 .../aarch64/advsimd-intrinsics/vext.c | 18 +
 .../aarch64/advsimd-intrinsics/vget_high.c | 5 +
 .../aarch64/advsimd-intrinsics/vld1.c | 14 +
 .../aarch64/advsimd-intrinsics/vld1_dup.c | 34 +
 .../aarch64/advsimd-intrinsics/vld1_lane.c | 14 +
 .../aarch64/advsimd-intrinsics/vld1x2.c | 11 +-
 .../aarch64/advsimd-intrinsics/vld1x3.c | 8 +-
 .../aarch64/advsimd-intrinsics/vld1x4.c | 6 +-
 .../aarch64/advsimd-intrinsics/vldX.c | 134 ++
 .../aarch64/advsimd-intrinsics/vldX_dup.c | 76 +
 .../aarch64/advsimd-intrinsics/vldX_lane.c | 65 +-
 .../aarch64/advsimd-intrinsics/vrev.c | 38 +
 .../aarch64/advsimd-intrinsics/vset_lane.c | 16 +
 .../aarch64/advsimd-intrinsics/vshuffle.inc | 14 +
 .../aarch64/advsimd-intrinsics/vst1_lane.c | 12 +
 .../aarch64/advsimd-intrinsics/vst1x2.c | 8 +-
 .../aarch64/advsimd-intrinsics/vst1x3.c | 8 +-
 .../aarch64/advsimd-intrinsics/vst1x4.c | 8 +-
 .../aarch64/advsimd-intrinsics/vstX_lane.c | 69 +
 .../aarch64/advsimd-intrinsics/vtbX.c | 59 +-
 .../aarch64/advsimd-intrinsics/vtrn.c | 20 +
 .../aarch64/advsimd-intrinsics/vtrn_half.c | 30 +
 .../aarch64/advsimd-intrinsics/vuzp.c | 20 +
 .../aarch64/advsimd-intrinsics/vuzp_half.c | 30 +
 .../aarch64/advsimd-intrinsics/vzip.c | 20 +
 .../aarch64/advsimd-intrinsics/vzip_half.c | 30 +
 gcc/testsuite/gcc.target/aarch64/simd/lut.c | 90 +
 .../gcc.target/aarch64/simd/mf8_data_1.c | 1822 +++++++++++++++++
 .../gcc.target/aarch64/simd/mf8_data_2.c | 98 +
 .../gcc.target/aarch64/vdup_lane_1.c | 99 +-
 .../gcc.target/aarch64/vdup_lane_2.c | 45 +-
 gcc/testsuite/gcc.target/aarch64/vdup_n_1.c | 60 +-
 .../gcc.target/aarch64/vect_copy_lane_1.c | 12 +-
 49 files changed, 4259 insertions(+), 208 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/simd/mf8_data_1.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/simd/mf8_data_2.c

--
2.25.1