On 27/05/16 14:42, James Greenhalgh wrote:
On Tue, May 24, 2016 at 09:24:03AM +0100, Jiong Wang wrote:
These intrinsics were implemented by inline assembly using the "faddp"
instruction.
There was a pattern "aarch64_addpv4sf" which supports V4SF mode only,
while we can extend this pattern to support the VDQF modes; then we can
reimplement these intrinsics through builtins.

gcc/
2016-05-23  Jiong Wang <jiong.w...@arm.com>

         * config/aarch64/aarch64-builtins.def (faddp): New builtins
for modes in VDQF.
         * config/aarch64/aarch64-simd.md (aarch64_faddp<mode>): New.
         (aarch64_addpv4sf): Delete.
         (reduc_plus_scal_v4sf): Use "gen_aarch64_faddpv4sf" instead of
         "gen_aarch64_addpv4sf".
         * gcc/config/aarch64/iterators.md (UNSPEC_FADDP): New.
         * config/aarch64/arm_neon.h (vpadd_f32): Remove inline
assembly.  Use
         builtin.
         (vpaddq_f32): Likewise.
         (vpaddq_f64): Likewise.
This ChangeLog format is incorrect.

You've missed vpaddd_f64 and vpadds_f32, could you add those?

vpaddd_f64 is already there without inline assembly.


This patch cleans up those intrinsics with symmetric vector input and output. vpadds_f32 looks to me to be doing a reduction: its return value is a scalar rather than a vector, so it doesn't fit well with the pattern touched here. I can clean it up with a separate patch. Is this OK?
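
To illustrate the distinction drawn above, here is a minimal portable C sketch of the lane semantics. It is a reference model only, not the arm_neon.h implementation and not part of the patch: the model_* types and functions are hypothetical names introduced for illustration, and they assume the usual pairwise-add behaviour of "faddp" (adjacent lanes of the first operand fill the low half of the result, adjacent lanes of the second operand fill the high half).

```c
#include <assert.h>

/* Hypothetical stand-ins for float32x2_t and float32x4_t, so the
   sketch compiles on any host without arm_neon.h.  */
typedef struct { float v[2]; } model_f32x2;
typedef struct { float v[4]; } model_f32x4;

/* Models vpadd_f32: two 2-lane vectors in, one 2-lane vector out.  */
static model_f32x2
model_vpadd_f32 (model_f32x2 a, model_f32x2 b)
{
  model_f32x2 r = { { a.v[0] + a.v[1], b.v[0] + b.v[1] } };
  return r;
}

/* Models vpaddq_f32: two 4-lane vectors in, one 4-lane vector out.  */
static model_f32x4
model_vpaddq_f32 (model_f32x4 a, model_f32x4 b)
{
  model_f32x4 r = { { a.v[0] + a.v[1], a.v[2] + a.v[3],
                      b.v[0] + b.v[1], b.v[2] + b.v[3] } };
  return r;
}

/* Models vpadds_f32: one 2-lane vector in, a scalar out.  This is the
   reduction shape that does not match the vector-in/vector-out pattern.  */
static float
model_vpadds_f32 (model_f32x2 a)
{
  return a.v[0] + a.v[1];
}

/* Small self-check of the scalar case.  */
static void
model_self_check (void)
{
  model_f32x2 a = { { 1.0f, 2.0f } };
  assert (model_vpadds_f32 (a) == 3.0f);
}
```

In this model the first two functions keep the same vector type for inputs and output, which is the shape a single mode-iterated pattern can cover, while the last one changes type from vector to scalar.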



Thanks,
James

