Hi Tamar,

On 11/12/18 15:46, Tamar Christina wrote:
Hi All,

This patch adds NEON intrinsics and tests for the Armv8.3-a complex
multiplication and add instructions with a rotate along the Argand plane.
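
For readers unfamiliar with the rotations, here is a minimal scalar sketch of
what one complex lane computes.  The names cplx, cmla_model and cadd_model
are hypothetical and only for illustration; they are not part of the patch.
The real instructions work on whole vectors and may round differently from
this plain C arithmetic.

```c
#include <assert.h>

/* Hypothetical scalar model, for illustration only.  */
struct cplx { float re; float im; };

/* One complex lane of the multiply-accumulate.  ROT is 0, 90, 180 or 270.  */
static struct cplx
cmla_model (struct cplx acc, struct cplx a, struct cplx b, int rot)
{
  assert (rot == 0 || rot == 90 || rot == 180 || rot == 270);
  switch (rot)
    {
    case 0:
      acc.re += a.re * b.re;
      acc.im += a.re * b.im;
      break;
    case 90:
      acc.re -= a.im * b.im;
      acc.im += a.im * b.re;
      break;
    case 180:
      acc.re -= a.re * b.re;
      acc.im -= a.re * b.im;
      break;
    default: /* 270 */
      acc.re += a.im * b.im;
      acc.im -= a.im * b.re;
      break;
    }
  return acc;
}

/* One complex lane of the add: a + i*b for ROT 90, a - i*b for ROT 270.  */
static struct cplx
cadd_model (struct cplx a, struct cplx b, int rot)
{
  struct cplx r;
  assert (rot == 90 || rot == 270);
  if (rot == 90)
    {
      r.re = a.re - b.im;
      r.im = a.im + b.re;
    }
  else
    {
      r.re = a.re + b.im;
      r.im = a.im - b.re;
    }
  return r;
}
```

Chaining the rotation 0 and rotation 90 forms on the same accumulator gives a
full complex multiply-accumulate, which is how I would expect pairs such as
vcmlaq_f32 and vcmlaq_rot90_f32 to be used together.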

The instructions are documented in the ArmARM [1] and the intrinsics
specification will be published on the Arm website [2].

The lane versions of these instructions are special in that they always
select a pair: using index 0 means selecting lanes 0 and 1.  Because of this,
the range check for the intrinsics requires special handling.
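
As a rough sketch of the pair-based range check, assuming that a vector of
NELTS scalar elements exposes NELTS / 2 selectable pairs.  The helper names
are hypothetical and this is not the code in arm-builtins.c.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper, not the GCC implementation: a lane index names a
   complex pair, so only NELTS / 2 indices are valid.  */
static bool
lane_pair_index_in_range (int nelts, int index)
{
  assert (nelts > 0 && nelts % 2 == 0);
  return index >= 0 && index < nelts / 2;
}

/* First scalar lane covered by pair INDEX; the pair occupies lanes
   2 * INDEX and 2 * INDEX + 1.  */
static int
lane_pair_first_lane (int index)
{
  return 2 * index;
}
```

With a four-element vector this accepts indices 0 and 1 only, whereas an
ordinary per-element lane check would accept 0 to 3.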

On Arm, in order to implement some of the lane intrinsics we're using the
structure of the register file.  The lane variant of these instructions always
selects a D register, but the data itself can be stored in Q registers.  This
means that for single precision complex numbers you are only allowed to select
D[0], but using the register file layout you can get the range 0-1 for lane
indices by selecting between Dn[0] and Dn+1[0].

The same reasoning applies to half float complex numbers, except that there
your D register indexes can be 0 or 1, so you have a total range of 4 lane
indices (for a V8HF).
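
The mapping from a Q-register pair index to a D register and a lane within it
can be sketched as below.  This is an illustration under the assumptions
stated in the comments; struct d_lane and map_q_pair_index are hypothetical
names and this is not the body of neon_vcmla_lane_prepare_operands.

```c
#include <assert.h>

/* Hypothetical result type: which D register of the Q register to use
   (0 for Dn, 1 for Dn+1) and which pair lane to select inside it.  */
struct d_lane { int d_offset; int lane; };

/* ELT_BYTES is the scalar element size: 4 for single precision, 2 for
   half precision.  A D register is 8 bytes, so it holds
   8 / (2 * ELT_BYTES) complex pairs and a Q register holds twice that.  */
static struct d_lane
map_q_pair_index (int elt_bytes, int index)
{
  struct d_lane r;
  int pairs_per_d;

  assert (elt_bytes == 2 || elt_bytes == 4);
  pairs_per_d = 8 / (2 * elt_bytes);
  assert (index >= 0 && index < 2 * pairs_per_d);

  r.d_offset = index / pairs_per_d;
  r.lane = index % pairs_per_d;
  return r;
}
```

For single precision this maps indices 0 and 1 to Dn[0] and Dn+1[0]; for half
precision, indices 0 to 3 map to Dn[0], Dn[1], Dn+1[0] and Dn+1[1].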


[1] https://developer.arm.com/docs/ddi0487/latest/arm-architecture-reference-manual-armv8-for-armv8-a-architecture-profile
[2] https://developer.arm.com/docs/101028/latest

Bootstrapped and regtested on arm-none-gnueabihf with no issues.

Ok for trunk?


Ok.
Thanks,
Kyrill

Thanks,
Tamar

gcc/ChangeLog:

2018-12-11  Tamar Christina  <tamar.christ...@arm.com>

        * config/arm/arm-builtins.c
        (enum arm_type_qualifiers): Add qualifier_lane_pair_index.
        (MAC_LANE_PAIR_QUALIFIERS): New.
        (arm_expand_builtin_args): Use it.
        (arm_expand_builtin_1): Likewise.
        * config/arm/arm-protos.h (neon_vcmla_lane_prepare_operands): New.
        * config/arm/arm.c (neon_vcmla_lane_prepare_operands): New.
        * config/arm/arm-c.c (arm_cpu_builtins): Add __ARM_FEATURE_COMPLEX.
        * config/arm/arm_neon.h:
        (vcadd_rot90_f16): New.
        (vcaddq_rot90_f16): New.
        (vcadd_rot270_f16): New.
        (vcaddq_rot270_f16): New.
        (vcmla_f16): New.
        (vcmlaq_f16): New.
        (vcmla_lane_f16): New.
        (vcmla_laneq_f16): New.
        (vcmlaq_lane_f16): New.
        (vcmlaq_laneq_f16): New.
        (vcmla_rot90_f16): New.
        (vcmlaq_rot90_f16): New.
        (vcmla_rot90_lane_f16): New.
        (vcmla_rot90_laneq_f16): New.
        (vcmlaq_rot90_lane_f16): New.
        (vcmlaq_rot90_laneq_f16): New.
        (vcmla_rot180_f16): New.
        (vcmlaq_rot180_f16): New.
        (vcmla_rot180_lane_f16): New.
        (vcmla_rot180_laneq_f16): New.
        (vcmlaq_rot180_lane_f16): New.
        (vcmlaq_rot180_laneq_f16): New.
        (vcmla_rot270_f16): New.
        (vcmlaq_rot270_f16): New.
        (vcmla_rot270_lane_f16): New.
        (vcmla_rot270_laneq_f16): New.
        (vcmlaq_rot270_lane_f16): New.
        (vcmlaq_rot270_laneq_f16): New.
        (vcadd_rot90_f32): New.
        (vcaddq_rot90_f32): New.
        (vcadd_rot270_f32): New.
        (vcaddq_rot270_f32): New.
        (vcmla_f32): New.
        (vcmlaq_f32): New.
        (vcmla_lane_f32): New.
        (vcmla_laneq_f32): New.
        (vcmlaq_lane_f32): New.
        (vcmlaq_laneq_f32): New.
        (vcmla_rot90_f32): New.
        (vcmlaq_rot90_f32): New.
        (vcmla_rot90_lane_f32): New.
        (vcmla_rot90_laneq_f32): New.
        (vcmlaq_rot90_lane_f32): New.
        (vcmlaq_rot90_laneq_f32): New.
        (vcmla_rot180_f32): New.
        (vcmlaq_rot180_f32): New.
        (vcmla_rot180_lane_f32): New.
        (vcmla_rot180_laneq_f32): New.
        (vcmlaq_rot180_lane_f32): New.
        (vcmlaq_rot180_laneq_f32): New.
        (vcmla_rot270_f32): New.
        (vcmlaq_rot270_f32): New.
        (vcmla_rot270_lane_f32): New.
        (vcmla_rot270_laneq_f32): New.
        (vcmlaq_rot270_lane_f32): New.
        (vcmlaq_rot270_laneq_f32): New.
        * config/arm/arm_neon_builtins.def (vcadd90, vcadd270, vcmla0, vcmla90,
        vcmla180, vcmla270, vcmla_lane0, vcmla_lane90, vcmla_lane180,
        vcmla_lane270,
        vcmla_laneq0, vcmla_laneq90, vcmla_laneq180, vcmla_laneq270,
        vcmlaq_lane0, vcmlaq_lane90, vcmlaq_lane180, vcmlaq_lane270): New.
        * config/arm/neon.md (neon_vcmla_lane<rot><mode>,
        neon_vcmla_laneq<rot><mode>, neon_vcmlaq_lane<rot><mode>): New.

gcc/testsuite/ChangeLog:

2018-12-11  Tamar Christina  <tamar.christ...@arm.com>

        * gcc.target/aarch64/advsimd-intrinsics/vector-complex.c: Add AArch32
        regexpr.
        * gcc.target/aarch64/advsimd-intrinsics/vector-complex_f16.c: Likewise.
