Hi all,

The aarch64_vmls<mode> pattern claims to perform a normal (unfused) vector floating-point multiply-subtract, but the FMLS instruction it emits is a fused multiply-subtract. This is fine when -ffp-contract=fast, but the pattern is not guarded on anything, so it will generate the FMLS instruction even when -ffp-contract=off.
The solution is just to delete the pattern. If -ffp-contract=fast, an fma operation will already have been generated, and the fnma<mode>4 pattern will be used to generate the FMLS instruction.

Bootstrapped and tested on aarch64-none-linux-gnu.

Ok for trunk and GCC 6 and 5?
GCC 4.9 needs a different -mtune option in the testcase to trigger the bug...

Thanks,
Kyrill

2016-05-17  Kyrylo Tkachov  <kyrylo.tkac...@arm.com>

    PR target/70809
    * config/aarch64/aarch64-simd.md (aarch64_vmls<mode>): Delete.

2016-05-17  Kyrylo Tkachov  <kyrylo.tkac...@arm.com>

    PR target/70809
    * gcc.target/aarch64/pr70809_1.c: New test.

diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index a66948a28e99f4437824a8640b092f7be1c917f6..90272a09f2dd925cfc01caa09e9e8963a8e6c6ed 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -1919,16 +1919,6 @@ (define_expand "vec_pack_trunc_df"
   }
 )
 
-(define_insn "aarch64_vmls<mode>"
-  [(set (match_operand:VDQF 0 "register_operand" "=w")
-       (minus:VDQF (match_operand:VDQF 1 "register_operand" "0")
-                   (mult:VDQF (match_operand:VDQF 2 "register_operand" "w")
-                     (match_operand:VDQF 3 "register_operand" "w"))))]
-  "TARGET_SIMD"
-  "fmls\\t%0.<Vtype>, %2.<Vtype>, %3.<Vtype>"
-  [(set_attr "type" "neon_fp_mla_<Vetype>_scalar<q>")]
-)
-
 ;; FP Max/Min
 ;; Max/Min are introduced by idiom recognition by GCC's mid-end.  An
 ;; expression like:
diff --git a/gcc/testsuite/gcc.target/aarch64/pr70809_1.c b/gcc/testsuite/gcc.target/aarch64/pr70809_1.c
new file mode 100644
index 0000000000000000000000000000000000000000..df88c71c42afc7fafff703f801bbfced8daafc95
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/pr70809_1.c
@@ -0,0 +1,18 @@
+/* PR target/70809.  */
+/* { dg-do compile } */
+/* { dg-options "-O2 -ftree-vectorize -ffp-contract=off -mtune=xgene1" } */
+
+/* Check that vector FMLS is not generated when contraction is disabled.  */
+
+void
+foo (float *__restrict__ __attribute__ ((aligned (16))) a,
+     float *__restrict__ __attribute__ ((aligned (16))) x,
+     float *__restrict__ __attribute__ ((aligned (16))) y,
+     float *__restrict__ __attribute__ ((aligned (16))) z)
+{
+  unsigned i = 0;
+  for (i = 0; i < 256; i++)
+    a[i] = x[i] - (y[i] * z[i]);
+}
+
+/* { dg-final { scan-assembler-not "fmls\tv.*" } } */