Ping

> -----Original Message-----
> From: gcc-patches-ow...@gcc.gnu.org <gcc-patches-ow...@gcc.gnu.org>
> On Behalf Of Tamar Christina
> Sent: Tuesday, December 11, 2018 15:47
> To: gcc-patches@gcc.gnu.org
> Cc: nd <n...@arm.com>; Ramana Radhakrishnan
> <ramana.radhakrish...@arm.com>; Richard Earnshaw
> <richard.earns...@arm.com>; ni...@redhat.com; Kyrylo Tkachov
> <kyrylo.tkac...@arm.com>
> Subject: [PATCH 9/9][GCC][Arm] Add ACLE intrinsics for complex
> multiplication and addition
> 
> Hi All,
> 
> This patch adds NEON intrinsics and tests for the Armv8.3-a complex
> multiply-accumulate and complex add instructions, each of which applies a
> rotation in the Argand plane.
> 
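To make the rotations concrete, here is a scalar sketch of what one complex
lane computes, as I understand the FCADD/FCMLA semantics.  The type and
function names are made up for illustration and are not part of arm_neon.h
or of the patch.

```c
/* Scalar model of a single complex lane.  Hypothetical names, for
   illustration only; this is not the arm_neon.h API.  */
typedef struct { float re, im; } cplx;

/* Complex add with the second operand rotated by 90 or 270 degrees.  */
static cplx cadd_rot (cplx a, cplx b, int rot)
{
  cplx r = a;
  if (rot == 90)
    {
      /* a + i*b  */
      r.re -= b.im;
      r.im += b.re;
    }
  else
    {
      /* rot == 270: a - i*b  */
      r.re += b.im;
      r.im -= b.re;
    }
  return r;
}

/* Complex multiply-accumulate.  Each rotation accumulates only part of
   the product; rot 0 followed by rot 90 gives acc + a*b.  */
static cplx cmla_rot (cplx acc, cplx a, cplx b, int rot)
{
  switch (rot)
    {
    case 0:
      acc.re += a.re * b.re;
      acc.im += a.re * b.im;
      break;
    case 90:
      acc.re -= a.im * b.im;
      acc.im += a.im * b.re;
      break;
    case 180:
      acc.re -= a.re * b.re;
      acc.im -= a.re * b.im;
      break;
    case 270:
      acc.re += a.im * b.im;
      acc.im -= a.im * b.re;
      break;
    }
  return acc;
}
```

Chaining the rot 0 and rot 90 forms on a zero accumulator yields the full
complex product; rot 180 followed by rot 270 yields its negation.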
> The instructions are documented in the Arm ARM [1] and the intrinsics
> specification will be published on the Arm website [2].
> 
> The lane versions of these instructions are special in that they always
> select a pair of lanes: index 0 selects lanes 0 and 1.  Because of this, the
> range check for the intrinsics requires special handling.
> 
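The pair-based range check can be sketched as follows.  This is only an
illustration of the arithmetic, assuming the index counts complex pairs; the
helper names are hypothetical and this is not the code in arm-builtins.c.

```c
#include <stdbool.h>

/* Hypothetical helper.  NUNITS is the number of scalar elements in the
   indexed vector, e.g. 4 for V4HF or 8 for V8HF.  A lane-pair index is
   valid if it names one of the NUNITS / 2 complex pairs.  */
static bool lane_pair_index_in_range (int nunits, int index)
{
  return index >= 0 && index < nunits / 2;
}

/* Pair INDEX covers scalar lanes 2*INDEX and 2*INDEX + 1; return the
   first of the two.  */
static int lane_pair_first_lane (int index)
{
  return 2 * index;
}
```

So for a V8HF operand the valid indices are 0 to 3 rather than 0 to 7, and
index 3 covers scalar lanes 6 and 7.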
> On Arm, we use the structure of the register file to implement some of the
> lane intrinsics.  The lane variant of these instructions always selects a D
> register, but the data itself can be stored in Q registers.  This means that
> for single precision complex numbers you are only allowed to select D[0], but
> using the register file layout you can get the range 0-1 for lane indices by
> selecting between Dn[0] and Dn+1[0].
> 
> The same reasoning applies to half float complex numbers, except that there
> the D register index can be 0 or 1, so you have a total range of 4 elements
> (for a V8HF).
> 
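The mapping from a Q-register lane-pair index to a D register and an index
within it can be sketched like this.  It only illustrates the arithmetic
described above, assuming 64-bit D registers; it is not the implementation
of neon_vcmla_lane_prepare_operands, and the names are hypothetical.

```c
/* Hypothetical result type: which D register of the pair to use (0 for
   Dn, 1 for Dn+1) and the lane-pair index within that D register.  */
typedef struct { int d_offset; int d_index; } d_lane;

/* ELEM_BITS is the scalar width (32 for single precision, 16 for half
   float).  INDEX is the lane-pair index into the Q register.  */
static d_lane map_q_lane_pair (int elem_bits, int index)
{
  /* A complex number takes two scalars, so a 64-bit D register holds
     one single precision pair or two half float pairs.  */
  int pairs_per_d = 64 / (2 * elem_bits);
  d_lane r = { index / pairs_per_d, index % pairs_per_d };
  return r;
}
```

For single precision, index 1 maps to Dn+1[0]; for half float, index 3 maps
to Dn+1[1].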
> 
> [1] https://developer.arm.com/docs/ddi0487/latest/arm-architecture-reference-manual-armv8-for-armv8-a-architecture-profile
> [2] https://developer.arm.com/docs/101028/latest
> 
> Bootstrapped and regtested on arm-none-gnueabihf with no issues.
> 
> Ok for trunk?
> 
> Thanks,
> Tamar
> 
> gcc/ChangeLog:
> 
> 2018-12-11  Tamar Christina  <tamar.christ...@arm.com>
> 
>       * config/arm/arm-builtins.c
>       (enum arm_type_qualifiers): Add qualifier_lane_pair_index.
>       (MAC_LANE_PAIR_QUALIFIERS): New.
>       (arm_expand_builtin_args): Use it.
>       (arm_expand_builtin_1): Likewise.
>       * config/arm/arm-protos.h (neon_vcmla_lane_prepare_operands): New.
>       * config/arm/arm.c (neon_vcmla_lane_prepare_operands): New.
>       * config/arm/arm-c.c (arm_cpu_builtins): Add __ARM_FEATURE_COMPLEX.
>       * config/arm/arm_neon.h:
>       (vcadd_rot90_f16): New.
>       (vcaddq_rot90_f16): New.
>       (vcadd_rot270_f16): New.
>       (vcaddq_rot270_f16): New.
>       (vcmla_f16): New.
>       (vcmlaq_f16): New.
>       (vcmla_lane_f16): New.
>       (vcmla_laneq_f16): New.
>       (vcmlaq_lane_f16): New.
>       (vcmlaq_laneq_f16): New.
>       (vcmla_rot90_f16): New.
>       (vcmlaq_rot90_f16): New.
>       (vcmla_rot90_lane_f16): New.
>       (vcmla_rot90_laneq_f16): New.
>       (vcmlaq_rot90_lane_f16): New.
>       (vcmlaq_rot90_laneq_f16): New.
>       (vcmla_rot180_f16): New.
>       (vcmlaq_rot180_f16): New.
>       (vcmla_rot180_lane_f16): New.
>       (vcmla_rot180_laneq_f16): New.
>       (vcmlaq_rot180_lane_f16): New.
>       (vcmlaq_rot180_laneq_f16): New.
>       (vcmla_rot270_f16): New.
>       (vcmlaq_rot270_f16): New.
>       (vcmla_rot270_lane_f16): New.
>       (vcmla_rot270_laneq_f16): New.
>       (vcmlaq_rot270_lane_f16): New.
>       (vcmlaq_rot270_laneq_f16): New.
>       (vcadd_rot90_f32): New.
>       (vcaddq_rot90_f32): New.
>       (vcadd_rot270_f32): New.
>       (vcaddq_rot270_f32): New.
>       (vcmla_f32): New.
>       (vcmlaq_f32): New.
>       (vcmla_lane_f32): New.
>       (vcmla_laneq_f32): New.
>       (vcmlaq_lane_f32): New.
>       (vcmlaq_laneq_f32): New.
>       (vcmla_rot90_f32): New.
>       (vcmlaq_rot90_f32): New.
>       (vcmla_rot90_lane_f32): New.
>       (vcmla_rot90_laneq_f32): New.
>       (vcmlaq_rot90_lane_f32): New.
>       (vcmlaq_rot90_laneq_f32): New.
>       (vcmla_rot180_f32): New.
>       (vcmlaq_rot180_f32): New.
>       (vcmla_rot180_lane_f32): New.
>       (vcmla_rot180_laneq_f32): New.
>       (vcmlaq_rot180_lane_f32): New.
>       (vcmlaq_rot180_laneq_f32): New.
>       (vcmla_rot270_f32): New.
>       (vcmlaq_rot270_f32): New.
>       (vcmla_rot270_lane_f32): New.
>       (vcmla_rot270_laneq_f32): New.
>       (vcmlaq_rot270_lane_f32): New.
>       (vcmlaq_rot270_laneq_f32): New.
>       * config/arm/arm_neon_builtins.def (vcadd90, vcadd270, vcmla0, vcmla90,
>       vcmla180, vcmla270, vcmla_lane0, vcmla_lane90, vcmla_lane180,
>       vcmla_lane270, vcmla_laneq0, vcmla_laneq90, vcmla_laneq180,
>       vcmla_laneq270, vcmlaq_lane0, vcmlaq_lane90, vcmlaq_lane180,
>       vcmlaq_lane270): New.
>       * config/arm/neon.md (neon_vcmla_lane<rot><mode>,
>       neon_vcmla_laneq<rot><mode>, neon_vcmlaq_lane<rot><mode>): New.
> 
> gcc/testsuite/ChangeLog:
> 
> 2018-12-11  Tamar Christina  <tamar.christ...@arm.com>
> 
>       * gcc.target/aarch64/advsimd-intrinsics/vector-complex.c: Add AArch32
>       regexpr.
>       * gcc.target/aarch64/advsimd-intrinsics/vector-complex_f16.c: Likewise.
> 
> --
