Hi All,

This patch adds support for the FP16 multiply add/subtract instructions in Armv8.4-A. The new instructions are exposed through new ACLE intrinsics. A new command line feature modifier, +fp16fml, is added to enable the support. Enabling +fp16fml automatically enables +fp16.
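For readers who have not met these instructions, here is a rough scalar sketch of what the low/high multiply-add and multiply-subtract forms are understood to compute. It is not taken from the patch: fml_model and have_fp16_fml are hypothetical names, float is only a portable stand-in for half-precision inputs, and NaN or denormal handling is not modelled. The only name that comes from this submission is the __ARM_FEATURE_FP16_FML macro listed in the ChangeLog.

```c
#include <stddef.h>

/* Hypothetical helper (not part of the patch): returns 1 when the
   compiler defines the feature macro listed in the ChangeLog.  */
static int
have_fp16_fml (void)
{
#ifdef __ARM_FEATURE_FP16_FML
  return 1;
#else
  return 0;
#endif
}

/* Hypothetical scalar model of the widening multiply-add/subtract.
   ACC has N single-precision lanes; A and B each have 2 * N lanes
   holding values assumed to be exactly representable in half
   precision (float is used only as a portable stand-in type).
   HIGH selects the upper N lanes of A and B instead of the lower N;
   SUBTRACT selects acc - a * b instead of acc + a * b.  */
static void
fml_model (float *acc, const float *a, const float *b,
           size_t n, int high, int subtract)
{
  size_t off = high ? n : 0;
  for (size_t i = 0; i < n; i++)
    {
      /* The product of two half-precision values is exact in float,
         so only the accumulation step rounds.  */
      float prod = a[off + i] * b[off + i];
      acc[i] = subtract ? acc[i] - prod : acc[i] + prod;
    }
}
```

With n = 2 this mirrors the 64-bit forms (two float results from four half lanes), and with n = 4 the 128-bit "q" forms. When compiling with +fp16fml, have_fp16_fml () would be expected to return 1, assuming the macro is defined as the ChangeLog describes.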
Test cases were added to verify that the ACLE intrinsics generate the appropriate FP16 multiply add/subtract assembly instructions.

Bootstrapped on aarch64-none-elf. Tested with new binutils and verified that all instructions assemble correctly.

Okay for trunk?

2017-11-10  Michael Collison  <michael.colli...@arm.com>

	* config/aarch64/aarch64-modes.def (V2HF): New VECTOR_MODE.
	* config/aarch64/aarch64-option-extension.def: Add
	AARCH64_OPT_EXTENSION of 'fp16fml'.
	* config/aarch64/aarch64-c.c (aarch64_update_cpp_builtins):
	(__ARM_FEATURE_FP16_FML): Define if TARGET_F16FML is true.
	* config/aarch64/predicates.md (aarch64_lane_imm3): New predicate.
	* config/aarch64/constraints.md (Ui7): New constraint.
	* config/aarch64/iterators.md (VFMLA_W): New mode iterator.
	(VFMLA_SEL_W): Ditto.
	(f16quad): Ditto.
	(f16mac1): Ditto.
	(VFMLA16_LOW): New int iterator.
	(VFMLA16_HIGH): Ditto.
	(UNSPEC_FMLAL): New unspec.
	(UNSPEC_FMLSL): Ditto.
	(UNSPEC_FMLAL2): Ditto.
	(UNSPEC_FMLSL2): Ditto.
	(f16mac): New code attribute.
	* config/aarch64/aarch64-simd-builtins.def
	(aarch64_fmlal_lowv2sf): Ditto.
	(aarch64_fmlsl_lowv2sf): Ditto.
	(aarch64_fmlalq_lowv4sf): Ditto.
	(aarch64_fmlslq_lowv4sf): Ditto.
	(aarch64_fmlal_highv2sf): Ditto.
	(aarch64_fmlsl_highv2sf): Ditto.
	(aarch64_fmlalq_highv4sf): Ditto.
	(aarch64_fmlslq_highv4sf): Ditto.
	(aarch64_fmlal_lane_lowv2sf): Ditto.
	(aarch64_fmlsl_lane_lowv2sf): Ditto.
	(aarch64_fmlal_laneq_lowv2sf): Ditto.
	(aarch64_fmlsl_laneq_lowv2sf): Ditto.
	(aarch64_fmlalq_lane_lowv4sf): Ditto.
	(aarch64_fmlsl_lane_lowv4sf): Ditto.
	(aarch64_fmlalq_laneq_lowv4sf): Ditto.
	(aarch64_fmlsl_laneq_lowv4sf): Ditto.
	(aarch64_fmlal_lane_highv2sf): Ditto.
	(aarch64_fmlsl_lane_highv2sf): Ditto.
	(aarch64_fmlal_laneq_highv2sf): Ditto.
	(aarch64_fmlsl_laneq_highv2sf): Ditto.
	(aarch64_fmlalq_lane_highv4sf): Ditto.
	(aarch64_fmlsl_lane_highv4sf): Ditto.
	(aarch64_fmlalq_laneq_highv4sf): Ditto.
	(aarch64_fmlsl_laneq_highv4sf): Ditto.
	* config/aarch64/aarch64-simd.md:
	(aarch64_fml<f16mac1>l<f16quad>_low<mode>): New pattern.
	(aarch64_fml<f16mac1>l<f16quad>_high<mode>): Ditto.
	(aarch64_simd_fml<f16mac1>l<f16quad>_low<mode>): Ditto.
	(aarch64_simd_fml<f16mac1>l<f16quad>_high<mode>): Ditto.
	(aarch64_fml<f16mac1>l_lane_lowv2sf): Ditto.
	(aarch64_fml<f16mac1>l_lane_highv2sf): Ditto.
	(aarch64_simd_fml<f16mac>l_lane_lowv2sf): Ditto.
	(aarch64_simd_fml<f16mac>l_lane_highv2sf): Ditto.
	(aarch64_fml<f16mac1>lq_laneq_lowv4sf): Ditto.
	(aarch64_fml<f16mac1>lq_laneq_highv4sf): Ditto.
	(aarch64_simd_fml<f16mac>lq_laneq_lowv4sf): Ditto.
	(aarch64_simd_fml<f16mac>lq_laneq_highv4sf): Ditto.
	(aarch64_fml<f16mac1>l_laneq_lowv2sf): Ditto.
	(aarch64_fml<f16mac1>l_laneq_highv2sf): Ditto.
	(aarch64_simd_fml<f16mac>l_laneq_lowv2sf): Ditto.
	(aarch64_simd_fml<f16mac>l_laneq_highv2sf): Ditto.
	(aarch64_fml<f16mac1>lq_lane_lowv4sf): Ditto.
	(aarch64_fml<f16mac1>lq_lane_highv4sf): Ditto.
	(aarch64_simd_fml<f16mac>lq_lane_lowv4sf): Ditto.
	(aarch64_simd_fml<f16mac>lq_lane_highv4sf): Ditto.
	* config/aarch64/arm_neon.h (vfmlal_low_u32): New intrinsic.
	(vfmlsl_low_u32): Ditto.
	(vfmlalq_low_u32): Ditto.
	(vfmlslq_low_u32): Ditto.
	(vfmlal_high_u32): Ditto.
	(vfmlsl_high_u32): Ditto.
	(vfmlalq_high_u32): Ditto.
	(vfmlslq_high_u32): Ditto.
	(vfmlal_lane_low_u32): Ditto.
	(vfmlsl_lane_low_u32): Ditto.
	(vfmlal_laneq_low_u32): Ditto.
	(vfmlsl_laneq_low_u32): Ditto.
	(vfmlalq_lane_low_u32): Ditto.
	(vfmlslq_lane_low_u32): Ditto.
	(vfmlalq_laneq_low_u32): Ditto.
	(vfmlslq_laneq_low_u32): Ditto.
	(vfmlal_lane_high_u32): Ditto.
	(vfmlsl_lane_high_u32): Ditto.
	(vfmlal_laneq_high_u32): Ditto.
	(vfmlsl_laneq_high_u32): Ditto.
	(vfmlalq_lane_high_u32): Ditto.
	(vfmlslq_lane_high_u32): Ditto.
	(vfmlalq_laneq_high_u32): Ditto.
	(vfmlslq_laneq_high_u32): Ditto.
	* config/aarch64/aarch64.h (AARCH64_FL_F16SML): New flag.
	(AARCH64_FL_FOR_ARCH8_4): New.
	(AARCH64_ISA_F16FML): New ISA flag.
	(TARGET_F16FML): New feature flag for fp16fml.

	* gcc.target/aarch64/fp16_fmul_high_1.c: New testcase.
	* gcc.target/aarch64/fp16_fmul_high_2.c: New testcase.
	* gcc.target/aarch64/fp16_fmul_high_3.c: New testcase.
	* gcc.target/aarch64/fp16_fmul_high.h: New shared testcase.
	* gcc.target/aarch64/fp16_fmul_lane_high_1.c: New testcase.
	* gcc.target/aarch64/fp16_fmul_lane_high_1.c: New testcase.
	* gcc.target/aarch64/fp16_fmul_lane_high_1.c: New testcase.
	* gcc.target/aarch64/fp16_fmul_lane_high.h: New shared testcase.
	* gcc.target/aarch64/fp16_fmul_low_1.c: New testcase.
	* gcc.target/aarch64/fp16_fmul_low_2.c: New testcase.
	* gcc.target/aarch64/fp16_fmul_low_2.c: New testcase.
	* gcc.target/aarch64/fp16_fmul_low.h: New shared testcase.
	* gcc.target/aarch64/fp16_fmul_lane_low_1.c: New testcase.
	* gcc.target/aarch64/fp16_fmul_lane_low_2.c: New testcase.
	* gcc.target/aarch64/fp16_fmul_lane_low_3.c: New testcase.
	* gcc.target/aarch64/fp16_fmul_lane_low.h: New shared testcase.
	* doc/invoke.texi: Document new fp16fml option.

v8_4_fp16fml.patch