On 09/11/15 11:03, Bilyan Borisov wrote:
On 03/11/15 11:16, James Greenhalgh wrote:
On Fri, Oct 30, 2015 at 09:31:08AM +0000, Bilyan Borisov wrote:
In this patch from the series, all vmulx_lane variants have been implemented as
a vdup followed by a vmulx. Existing implementations of intrinsics were
refactored to use this new approach.
Several new nameless md patterns are added that will enable the combine pass to
pick up the dup/fmulx combination and replace it with a proper fmulx[lane]
instruction.
In addition, test cases for all new intrinsics were added. Tested on targets
aarch64-none-elf and aarch64_be-none-elf.
Hi,
I have a small style comment below.
gcc/
2015-XX-XX Bilyan Borisov <bilyan.bori...@arm.com>
* config/aarch64/arm_neon.h (vmulx_lane_f32): New.
(vmulx_lane_f64): New.
(vmulxq_lane_f32): Refactored & moved.
(vmulxq_lane_f64): Refactored & moved.
(vmulx_laneq_f32): New.
(vmulx_laneq_f64): New.
(vmulxq_laneq_f32): New.
(vmulxq_laneq_f64): New.
(vmulxs_lane_f32): New.
(vmulxs_laneq_f32): New.
(vmulxd_lane_f64): New.
(vmulxd_laneq_f64): New.
* config/aarch64/aarch64-simd.md (*aarch64_combine_dupfmulx1<mode>,
VDQSF): New pattern.
(*aarch64_combine_dupfmulx2<mode>, VDQF): New pattern.
(*aarch64_combine_dupfmulx3): New pattern.
(*aarch64_combine_vgetfmulx1<mode>, VDQF_DF): New pattern.
I'm not sure I like the use of 1,2,3 for this naming scheme. Elsewhere in
the file, this convention points to the number of operands a pattern
requires (for example add<mode>3).
I think elsewhere in the file we use:
"*aarch64_mul3_elt<mode>"
"*aarch64_mul3_elt_<vswap_width_name><mode>"
"*aarch64_mul3_elt_to_128df"
"*aarch64_mul3_elt_to_64v2df"
Is there a reason not to follow that pattern?
Thanks,
James
Hi,
I've made the changes you've requested - the pattern names have been
changed to better follow the naming conventions used elsewhere in the file.
Thanks,
Bilyan
Hi,
You can find the new updated Changelog for this patch below.
Thanks,
Bilyan
---
In this patch from the series, all vmulx_lane variants have been implemented as
a vdup followed by a vmulx. Existing implementations of intrinsics were
refactored to use this new approach.
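For illustration, here is a minimal sketch of the approach (not the patch's
exact code - the function name is made up, and in arm_neon.h the lane index
is a compile-time argument of the intrinsic rather than a hard-coded
constant):

  #include <arm_neon.h>

  /* Hypothetical sketch: a lane variant of vmulx is just a dup of the
     selected lane across a vector, followed by the existing vmulx
     intrinsic.  Lane 1 is hard-coded here for simplicity.  */
  static inline float32x2_t
  sketch_vmulx_lane_f32 (float32x2_t a, float32x2_t v)
  {
    return vmulx_f32 (a, vdup_lane_f32 (v, 1));
  }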
Several new nameless md patterns are added that will enable the combine pass to
pick up the dup/fmulx combination and replace it with a proper fmulx[lane]
instruction.
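As an illustration of the intent (the function name and lane index below are
invented for the example), after combine a call like the following should be
emitted as a single lane-indexed fmulx, e.g. "fmulx v0.4s, v0.4s, v1.s[3]",
instead of a dup followed by a vector fmulx:

  #include <arm_neon.h>

  /* Hypothetical example: vmulxq_laneq_f32 expands to a dup of lane 3
     plus a vector fmulx; the new combine patterns should match that
     pair and emit one lane-indexed fmulx instruction.  */
  float32x4_t
  example (float32x4_t a, float32x4_t v)
  {
    return vmulxq_laneq_f32 (a, v, 3);
  }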
In addition, test cases for all new intrinsics were added. Tested on targets
aarch64-none-elf and aarch64_be-none-elf.
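For reference, a guess at the shape of one of the new tests, following the
usual simd testsuite conventions (the actual files may differ in options,
input values, and scan patterns):

  /* Sketch of a possible vmulxq_laneq_f32_1.c; not the actual test.  */
  /* { dg-do run } */
  /* { dg-options "-O3 --save-temps" } */

  #include <arm_neon.h>

  extern void abort (void);

  int
  main (void)
  {
    float32x4_t a = vdupq_n_f32 (2.0f);
    float32x4_t v = vsetq_lane_f32 (3.0f, vdupq_n_f32 (1.0f), 1);
    float32x4_t r = vmulxq_laneq_f32 (a, v, 1);

    if (vgetq_lane_f32 (r, 0) != 6.0f)
      abort ();
    return 0;
  }

  /* { dg-final { scan-assembler "fmulx\[ \t\]+v\[0-9\]+\.4s, v\[0-9\]+\.4s, v\[0-9\]+\.s\\\[1\\\]" } } */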
gcc/
2015-XX-XX Bilyan Borisov <bilyan.bori...@arm.com>
* config/aarch64/arm_neon.h (vmulx_lane_f32): New.
(vmulx_lane_f64): Likewise.
(vmulxq_lane_f32): Refactored & moved.
(vmulxq_lane_f64): Likewise.
(vmulx_laneq_f32): New.
(vmulx_laneq_f64): Likewise.
(vmulxq_laneq_f32): Likewise.
(vmulxq_laneq_f64): Likewise.
(vmulxs_lane_f32): Likewise.
(vmulxs_laneq_f32): Likewise.
(vmulxd_lane_f64): Likewise.
(vmulxd_laneq_f64): Likewise.
* config/aarch64/aarch64-simd.md
(*aarch64_mulx_elt_<vswap_width_name><mode>, VDQSF): New pattern.
(*aarch64_mulx_elt<mode>, VDQF): Likewise.
(*aarch64_mulx_elt_to_64v2df): Likewise.
(*aarch64_vgetfmulx<mode>, VDQF_DF): Likewise.
gcc/testsuite/
2015-XX-XX Bilyan Borisov <bilyan.bori...@arm.com>
* gcc.target/aarch64/simd/vmulx_lane_f32_1.c: New.
* gcc.target/aarch64/simd/vmulx_lane_f64_1.c: New.
* gcc.target/aarch64/simd/vmulx_laneq_f32_1.c: New.
* gcc.target/aarch64/simd/vmulx_laneq_f64_1.c: New.
* gcc.target/aarch64/simd/vmulxq_lane_f32_1.c: New.
* gcc.target/aarch64/simd/vmulxq_lane_f64_1.c: New.
* gcc.target/aarch64/simd/vmulxq_laneq_f32_1.c: New.
* gcc.target/aarch64/simd/vmulxq_laneq_f64_1.c: New.
* gcc.target/aarch64/simd/vmulxs_lane_f32_1.c: New.
* gcc.target/aarch64/simd/vmulxs_laneq_f32_1.c: New.
* gcc.target/aarch64/simd/vmulxd_lane_f64_1.c: New.
* gcc.target/aarch64/simd/vmulxd_laneq_f64_1.c: New.