Ping
> -----Original Message-----
> From: gcc-patches-ow...@gcc.gnu.org <gcc-patches-ow...@gcc.gnu.org>
> On Behalf Of Tamar Christina
> Sent: Tuesday, December 11, 2018 15:47
> To: gcc-patches@gcc.gnu.org
> Cc: nd <n...@arm.com>; Ramana Radhakrishnan
> <ramana.radhakrish...@arm.com>; Richard Earnshaw
> <richard.earns...@arm.com>; ni...@redhat.com; Kyrylo Tkachov
> <kyrylo.tkac...@arm.com>
> Subject: [PATCH 9/9][GCC][Arm] Add ACLE intrinsics for complex
> mutliplication and addition
>
> Hi All,
>
> This patch adds NEON intrinsics and tests for the Armv8.3-a complex
> multiplication and add instructions with a rotate along the Argand plane.
>
> The instructions are documented in the ArmARM [1] and the intrinsics
> specification will be published on the Arm website [2].
>
> The lane versions of these instructions are special in that they always
> select a pair: using index 0 means selecting lanes 0 and 1. Because of
> this, the range check for the intrinsics requires special handling.
>
> On Arm, in order to implement some of the lane intrinsics we're using the
> structure of the register file. The lane variant of these instructions
> always selects a D register, but the data itself can be stored in Q
> registers. This means that for single precision complex numbers you are
> only allowed to select D[0], but using the register file layout you can
> get the range 0-1 for lane indices by selecting between Dn[0] and Dn+1[0].
>
> The same reasoning applies for half float complex numbers, except that
> there the D register index can be 0 or 1, so you have a total range of
> 4 elements (for a V8HF).
>
> [1] https://developer.arm.com/docs/ddi0487/latest/arm-architecture-reference-manual-armv8-for-armv8-a-architecture-profile
> [2] https://developer.arm.com/docs/101028/latest
>
> Bootstrapped and regtested on arm-none-gnueabihf with no issues.
>
> Ok for trunk?
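
To make the lane-pair indexing described in the quoted text concrete, here is a minimal sketch in plain C. It is only a model of the description, not the code from the patch: the names select_pair and struct pair_select are hypothetical, and it assumes a 64-bit D register holding adjacent (real, imaginary) elements.

```c
/* Hypothetical model of the lane-pair mapping described in the mail;
   this is NOT the implementation from the patch.  */
struct pair_select
{
  unsigned d_offset; /* 0 selects Dn, 1 selects Dn+1.  */
  unsigned d_index;  /* Complex-pair index inside that D register.  */
};

/* ELEM_BITS is 16 for half floats and 32 for single floats.
   PAIR is the index of the (real, imaginary) pair in the whole vector.
   Assumes a D register is 64 bits wide.  */
static struct pair_select
select_pair (unsigned elem_bits, unsigned pair)
{
  /* One complex number takes two elements: 1 pair per D register for
     f32, 2 pairs per D register for f16.  */
  unsigned pairs_per_d = 64 / (2 * elem_bits);
  struct pair_select sel = { pair / pairs_per_d, pair % pairs_per_d };
  return sel;
}
```

Under these assumptions, pairs 0 and 1 of a V4SF map to Dn[0] and Dn+1[0], and pairs 0 to 3 of a V8HF map to Dn[0], Dn[1], Dn+1[0] and Dn+1[1], which matches the ranges given in the mail.
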
>
> Thanks,
> Tamar
>
> gcc/ChangeLog:
>
> 2018-12-11  Tamar Christina  <tamar.christ...@arm.com>
>
>     * config/arm/arm-builtins.c
>     (enum arm_type_qualifiers): Add qualifier_lane_pair_index.
>     (MAC_LANE_PAIR_QUALIFIERS): New.
>     (arm_expand_builtin_args): Use it.
>     (arm_expand_builtin_1): Likewise.
>     * config/arm/arm-protos.h (neon_vcmla_lane_prepare_operands): New.
>     * config/arm/arm.c (neon_vcmla_lane_prepare_operands): New.
>     * config/arm/arm-c.c (arm_cpu_builtins): Add __ARM_FEATURE_COMPLEX.
>     * config/arm/arm_neon.h:
>     (vcadd_rot90_f16): New.
>     (vcaddq_rot90_f16): New.
>     (vcadd_rot270_f16): New.
>     (vcaddq_rot270_f16): New.
>     (vcmla_f16): New.
>     (vcmlaq_f16): New.
>     (vcmla_lane_f16): New.
>     (vcmla_laneq_f16): New.
>     (vcmlaq_lane_f16): New.
>     (vcmlaq_laneq_f16): New.
>     (vcmla_rot90_f16): New.
>     (vcmlaq_rot90_f16): New.
>     (vcmla_rot90_lane_f16): New.
>     (vcmla_rot90_laneq_f16): New.
>     (vcmlaq_rot90_lane_f16): New.
>     (vcmlaq_rot90_laneq_f16): New.
>     (vcmla_rot180_f16): New.
>     (vcmlaq_rot180_f16): New.
>     (vcmla_rot180_lane_f16): New.
>     (vcmla_rot180_laneq_f16): New.
>     (vcmlaq_rot180_lane_f16): New.
>     (vcmlaq_rot180_laneq_f16): New.
>     (vcmla_rot270_f16): New.
>     (vcmlaq_rot270_f16): New.
>     (vcmla_rot270_lane_f16): New.
>     (vcmla_rot270_laneq_f16): New.
>     (vcmlaq_rot270_lane_f16): New.
>     (vcmlaq_rot270_laneq_f16): New.
>     (vcadd_rot90_f32): New.
>     (vcaddq_rot90_f32): New.
>     (vcadd_rot270_f32): New.
>     (vcaddq_rot270_f32): New.
>     (vcmla_f32): New.
>     (vcmlaq_f32): New.
>     (vcmla_lane_f32): New.
>     (vcmla_laneq_f32): New.
>     (vcmlaq_lane_f32): New.
>     (vcmlaq_laneq_f32): New.
>     (vcmla_rot90_f32): New.
>     (vcmlaq_rot90_f32): New.
>     (vcmla_rot90_lane_f32): New.
>     (vcmla_rot90_laneq_f32): New.
>     (vcmlaq_rot90_lane_f32): New.
>     (vcmlaq_rot90_laneq_f32): New.
>     (vcmla_rot180_f32): New.
>     (vcmlaq_rot180_f32): New.
>     (vcmla_rot180_lane_f32): New.
>     (vcmla_rot180_laneq_f32): New.
>     (vcmlaq_rot180_lane_f32): New.
>     (vcmlaq_rot180_laneq_f32): New.
>     (vcmla_rot270_f32): New.
>     (vcmlaq_rot270_f32): New.
>     (vcmla_rot270_lane_f32): New.
>     (vcmla_rot270_laneq_f32): New.
>     (vcmlaq_rot270_lane_f32): New.
>     (vcmlaq_rot270_laneq_f32): New.
>     * config/arm/arm_neon_builtins.def (vcadd90, vcadd270, vcmla0,
>     vcmla90, vcmla180, vcmla270, vcmla_lane0, vcmla_lane90,
>     vcmla_lane180, vcmla_lane270, vcmla_laneq0, vcmla_laneq90,
>     vcmla_laneq180, vcmla_laneq270, vcmlaq_lane0, vcmlaq_lane90,
>     vcmlaq_lane180, vcmlaq_lane270): New.
>     * config/arm/neon.md (neon_vcmla_lane<rot><mode>,
>     neon_vcmla_laneq<rot><mode>, neon_vcmlaq_lane<rot><mode>): New.
>
> gcc/testsuite/ChangeLog:
>
> 2018-12-11  Tamar Christina  <tamar.christ...@arm.com>
>
>     * gcc.target/aarch64/advsimd-intrinsics/vector-complex.c: Add
>     AArch32 regexpr.
>     * gcc.target/aarch64/advsimd-intrinsics/vector-complex_f16.c:
>     Likewise.
>
> --
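
For readers unfamiliar with the rot0/rot90/rot180/rot270 variants listed in the ChangeLog, the following is a minimal scalar sketch of what one complex lane of the multiply-accumulate computes for each rotation, as I understand the Armv8.3-a semantics. The names struct cpx, cmla_model and cmul_acc_model are hypothetical and exist only for illustration; the real intrinsics operate on NEON vector types, not on this struct.

```c
/* One complex number, i.e. one pair of adjacent (real, imaginary) lanes.
   Hypothetical type for illustration only.  */
struct cpx
{
  float re;
  float im;
};

/* Scalar model of one complex lane of the multiply-accumulate with
   rotation ROT in {0, 90, 180, 270}.  Each rotation adds only half of
   the complex product to the accumulator ACC.  Any other ROT leaves
   ACC unchanged in this model.  */
static struct cpx
cmla_model (struct cpx acc, struct cpx a, struct cpx b, int rot)
{
  switch (rot)
    {
    case 0:
      acc.re += a.re * b.re;
      acc.im += a.re * b.im;
      break;
    case 90:
      acc.re -= a.im * b.im;
      acc.im += a.im * b.re;
      break;
    case 180:
      acc.re -= a.re * b.re;
      acc.im -= a.re * b.im;
      break;
    case 270:
      acc.re += a.im * b.im;
      acc.im -= a.im * b.re;
      break;
    }
  return acc;
}

/* A full complex multiply-accumulate, acc + a * b, is the rotation 0
   step followed by the rotation 90 step.  */
static struct cpx
cmul_acc_model (struct cpx acc, struct cpx a, struct cpx b)
{
  return cmla_model (cmla_model (acc, a, b, 0), a, b, 90);
}
```

In this model, accumulating (1 + 2i) * (3 + 4i) into zero gives -5 + 10i, and pairing rotations 180 and 270 instead yields the negated product.
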