Hi All,

This patch adds support for the FP16 multiply-add/subtract instructions
(FMLAL, FMLSL and their "2" variants) in Armv8.4-A. The instructions are
exposed through new ACLE intrinsics. A new command-line feature modifier,
+fp16fml, enables them. Enabling +fp16fml automatically enables +fp16.
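
For reviewers who want a reference point for what the intrinsics compute,
here is a minimal scalar sketch in plain C. It is not code from the patch.
The function names (fp16fml_enabled, half_bits_to_float, fml_model,
fml_lane_model) are made up for illustration, half-precision values are
passed as raw 16-bit patterns so the sketch builds on any host, and the
lane selection reflects my reading of the Armv8.4-A FMLAL/FMLAL2 behaviour:
the "low" forms use the lower half of the FP16 inputs and the "high" forms
use the upper half.

```c
#include <assert.h>
#include <math.h>
#include <stdint.h>

/* Returns 1 when the compiler defines __ARM_FEATURE_FP16_FML (the macro
   named in the ChangeLog below), otherwise 0.  */
static int
fp16fml_enabled (void)
{
#ifdef __ARM_FEATURE_FP16_FML
  return 1;
#else
  return 0;
#endif
}

/* Convert an IEEE 754 binary16 bit pattern to float.  Every step is exact.  */
static float
half_bits_to_float (uint16_t h)
{
  unsigned sign = (h >> 15) & 1u;
  unsigned exp = (h >> 10) & 0x1fu;
  unsigned frac = h & 0x3ffu;
  float mag;

  if (exp == 0)
    mag = (float) frac * 0x1p-24f;               /* zero or subnormal */
  else if (exp == 31)
    mag = frac ? NAN : INFINITY;
  else                                           /* normal number */
    mag = (float) (frac | 0x400u) * 0x1p-25f * (float) (1u << exp);
  return sign ? -mag : mag;
}

/* Hypothetical model of the vector forms.  ACC has N float lanes; A and B
   have 2*N half-precision lanes.  HIGH selects the upper N input lanes,
   SUBTRACT selects the FMLSL behaviour (first operand negated).  The
   product of two binary16 values is exact in float, so only the final
   addition rounds.  */
static void
fml_model (float *acc, const uint16_t *a, const uint16_t *b,
           int n, int high, int subtract)
{
  int base = high ? n : 0;
  for (int i = 0; i < n; i++)
    {
      float x = half_bits_to_float (a[base + i]);
      float y = half_bits_to_float (b[base + i]);
      acc[i] += (subtract ? -x : x) * y;
    }
}

/* Hypothetical model of the by-lane forms: every product uses the single
   element B[LANE], where B has B_LANES half-precision elements.  */
static void
fml_lane_model (float *acc, const uint16_t *a, const uint16_t *b,
                int b_lanes, int lane, int n, int high, int subtract)
{
  assert (lane >= 0 && lane < b_lanes);
  int base = high ? n : 0;
  float y = half_bits_to_float (b[lane]);
  for (int i = 0; i < n; i++)
    {
      float x = half_bits_to_float (a[base + i]);
      acc[i] += (subtract ? -x : x) * y;
    }
}
```

With n = 2 this corresponds to the V2SF forms (four FP16 input lanes) and
with n = 4 to the V4SF "q" forms (eight FP16 input lanes). Treat it as a
reading aid only; it ignores FPCR-controlled behaviour such as
flush-to-zero and NaN propagation details.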

Test cases were added to verify that the ACLE intrinsics generate the
appropriate FP16 multiply-add/subtract assembly instructions.

Bootstrapped on aarch64-none-elf. Tested with new binutils and verified that
all instructions assemble correctly.

Okay for trunk?

2017-11-10  Michael Collison  <michael.colli...@arm.com>

        * config/aarch64/aarch64-modes.def (V2HF): New VECTOR_MODE.
        * config/aarch64/aarch64-option-extension.def: Add
        AARCH64_OPT_EXTENSION of 'fp16fml'.
        * config/aarch64/aarch64-c.c (aarch64_update_cpp_builtins): Define
        __ARM_FEATURE_FP16_FML if TARGET_F16FML is true.
        * config/aarch64/predicates.md (aarch64_lane_imm3): New predicate.
        * config/aarch64/constraints.md (Ui7): New constraint.
        * config/aarch64/iterators.md (VFMLA_W): New mode iterator.
        (VFMLA_SEL_W): Ditto.
        (f16quad): Ditto.
        (f16mac1): Ditto.
        (VFMLA16_LOW): New int iterator.
        (VFMLA16_HIGH): Ditto.
        (UNSPEC_FMLAL): New unspec.
        (UNSPEC_FMLSL): Ditto.
        (UNSPEC_FMLAL2): Ditto.
        (UNSPEC_FMLSL2): Ditto.
        (f16mac): New code attribute.
        * config/aarch64/aarch64-simd-builtins.def
        (aarch64_fmlal_lowv2sf): New builtin.
        (aarch64_fmlsl_lowv2sf): Ditto.
        (aarch64_fmlalq_lowv4sf): Ditto.
        (aarch64_fmlslq_lowv4sf): Ditto.
        (aarch64_fmlal_highv2sf): Ditto.
        (aarch64_fmlsl_highv2sf): Ditto.
        (aarch64_fmlalq_highv4sf): Ditto.
        (aarch64_fmlslq_highv4sf): Ditto.
        (aarch64_fmlal_lane_lowv2sf): Ditto.
        (aarch64_fmlsl_lane_lowv2sf): Ditto.
        (aarch64_fmlal_laneq_lowv2sf): Ditto.
        (aarch64_fmlsl_laneq_lowv2sf): Ditto.
        (aarch64_fmlalq_lane_lowv4sf): Ditto.
        (aarch64_fmlslq_lane_lowv4sf): Ditto.
        (aarch64_fmlalq_laneq_lowv4sf): Ditto.
        (aarch64_fmlslq_laneq_lowv4sf): Ditto.
        (aarch64_fmlal_lane_highv2sf): Ditto.
        (aarch64_fmlsl_lane_highv2sf): Ditto.
        (aarch64_fmlal_laneq_highv2sf): Ditto.
        (aarch64_fmlsl_laneq_highv2sf): Ditto.
        (aarch64_fmlalq_lane_highv4sf): Ditto.
        (aarch64_fmlslq_lane_highv4sf): Ditto.
        (aarch64_fmlalq_laneq_highv4sf): Ditto.
        (aarch64_fmlslq_laneq_highv4sf): Ditto.
        * config/aarch64/aarch64-simd.md
        (aarch64_fml<f16mac1>l<f16quad>_low<mode>): New pattern.
        (aarch64_fml<f16mac1>l<f16quad>_high<mode>): Ditto.
        (aarch64_simd_fml<f16mac1>l<f16quad>_low<mode>): Ditto.
        (aarch64_simd_fml<f16mac1>l<f16quad>_high<mode>): Ditto.
        (aarch64_fml<f16mac1>l_lane_lowv2sf): Ditto.
        (aarch64_fml<f16mac1>l_lane_highv2sf): Ditto.
        (aarch64_simd_fml<f16mac>l_lane_lowv2sf): Ditto.
        (aarch64_simd_fml<f16mac>l_lane_highv2sf): Ditto.
        (aarch64_fml<f16mac1>lq_laneq_lowv4sf): Ditto.
        (aarch64_fml<f16mac1>lq_laneq_highv4sf): Ditto.
        (aarch64_simd_fml<f16mac>lq_laneq_lowv4sf): Ditto.
        (aarch64_simd_fml<f16mac>lq_laneq_highv4sf): Ditto.
        (aarch64_fml<f16mac1>l_laneq_lowv2sf): Ditto.
        (aarch64_fml<f16mac1>l_laneq_highv2sf): Ditto.
        (aarch64_simd_fml<f16mac>l_laneq_lowv2sf): Ditto.
        (aarch64_simd_fml<f16mac>l_laneq_highv2sf): Ditto.
        (aarch64_fml<f16mac1>lq_lane_lowv4sf): Ditto.
        (aarch64_fml<f16mac1>lq_lane_highv4sf): Ditto.
        (aarch64_simd_fml<f16mac>lq_lane_lowv4sf): Ditto.
        (aarch64_simd_fml<f16mac>lq_lane_highv4sf): Ditto.
        * config/aarch64/arm_neon.h (vfmlal_low_u32): New intrinsic.
        (vfmlsl_low_u32): Ditto.
        (vfmlalq_low_u32): Ditto.
        (vfmlslq_low_u32): Ditto.
        (vfmlal_high_u32): Ditto.
        (vfmlsl_high_u32): Ditto.
        (vfmlalq_high_u32): Ditto.
        (vfmlslq_high_u32): Ditto.
        (vfmlal_lane_low_u32): Ditto.
        (vfmlsl_lane_low_u32): Ditto.
        (vfmlal_laneq_low_u32): Ditto.
        (vfmlsl_laneq_low_u32): Ditto.
        (vfmlalq_lane_low_u32): Ditto.
        (vfmlslq_lane_low_u32): Ditto.
        (vfmlalq_laneq_low_u32): Ditto.
        (vfmlslq_laneq_low_u32): Ditto.
        (vfmlal_lane_high_u32): Ditto.
        (vfmlsl_lane_high_u32): Ditto.
        (vfmlal_laneq_high_u32): Ditto.
        (vfmlsl_laneq_high_u32): Ditto.
        (vfmlalq_lane_high_u32): Ditto.
        (vfmlslq_lane_high_u32): Ditto.
        (vfmlalq_laneq_high_u32): Ditto.
        (vfmlslq_laneq_high_u32): Ditto.
        * config/aarch64/aarch64.h (AARCH64_FL_F16FML): New flag.
        (AARCH64_FL_FOR_ARCH8_4): New.
        (AARCH64_ISA_F16FML): New ISA flag.
        (TARGET_F16FML): New feature flag for fp16fml.
        * gcc.target/aarch64/fp16_fmul_high_1.c: New testcase.
        * gcc.target/aarch64/fp16_fmul_high_2.c: New testcase.
        * gcc.target/aarch64/fp16_fmul_high_3.c: New testcase.
        * gcc.target/aarch64/fp16_fmul_high.h: New shared testcase.
        * gcc.target/aarch64/fp16_fmul_lane_high_1.c: New testcase.
        * gcc.target/aarch64/fp16_fmul_lane_high_2.c: New testcase.
        * gcc.target/aarch64/fp16_fmul_lane_high_3.c: New testcase.
        * gcc.target/aarch64/fp16_fmul_lane_high.h: New shared testcase.
        * gcc.target/aarch64/fp16_fmul_low_1.c: New testcase.
        * gcc.target/aarch64/fp16_fmul_low_2.c: New testcase.
        * gcc.target/aarch64/fp16_fmul_low_3.c: New testcase.
        * gcc.target/aarch64/fp16_fmul_low.h: New shared testcase.
        * gcc.target/aarch64/fp16_fmul_lane_low_1.c: New testcase.
        * gcc.target/aarch64/fp16_fmul_lane_low_2.c: New testcase.
        * gcc.target/aarch64/fp16_fmul_lane_low_3.c: New testcase.
        * gcc.target/aarch64/fp16_fmul_lane_low.h: New shared testcase.
        * doc/invoke.texi: Document new fp16fml option.

Attachment: v8_4_fp16fml.patch