The testcase for PR62178 has been failing for a while with -mcmodel=tiny
because its pass conditions are too tight.  -mcmodel=tiny generates:

        ldr     q2, [x0], 124
        ld1r    {v1.4s}, [x1], 4
        cmp     x0, x2
        mla     v0.4s, v2.4s, v1.4s
        bne     .L7

-mcmodel=small generates slightly different code:

        ldr     q1, [x0], 124
        ldr     s2, [x1, 4]!
        cmp     x0, x2
        mla     v0.4s, v1.4s, v2.s[0]
        bne     .L7

This is due to Combine merging a DUP instruction with either a load
or an MLA; we can't force it to prefer one over the other.  However,
the generated vector loop is fast either way, since it uses MLA and the
DUP is folded into either the load or the MLA.  So relax the conditions
slightly and check that we still generate MLA and that there is no DUP
or FMOV.
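
As a rough illustration of the relaxed checks, here is a minimal Python
sketch that runs approximations of the patterns over the two loop bodies
quoted above.  It is not how DejaGnu evaluates scan-assembler: the
regexes are Python translations rather than the Tcl originals, the
assembly whitespace is simplified, the word-boundary match for DUP/FMOV
is an assumption of this sketch, and the helper names are hypothetical.

```python
import re

# Loop bodies quoted in this mail, with whitespace simplified.
TINY = """\
ldr q2, [x0], 124
ld1r {v1.4s}, [x1], 4
cmp x0, x2
mla v0.4s, v2.4s, v1.4s
bne .L7
"""
SMALL = """\
ldr q1, [x0], 124
ldr s2, [x1, 4]!
cmp x0, x2
mla v0.4s, v1.4s, v2.s[0]
bne .L7
"""

# Python approximations of the scan patterns (not the exact Tcl regexes).
LDR_Q = re.compile(r"ldr\s+q[0-9]+, \[x[0-9]+\], [0-9]+")
OLD_MLA = re.compile(r"mla\s+v[0-9]+\.4s, v[0-9]+\.4s, v[0-9]+\.s\[0\]")
NEW_MLA = re.compile(r"mla\s+v[0-9]+\.4s, v[0-9]+\.4s, v[0-9]+")
# Assumption: treat "no DUP or FMOV" as a whole-word mnemonic check.
FORBIDDEN = re.compile(r"\b(dup|fmov)\b")


def passes_old_mla(asm):
    """Old check: MLA must use the by-element form (v<n>.s[0])."""
    return bool(OLD_MLA.search(asm))


def passes_relaxed(asm):
    """Relaxed checks: post-indexed LDR q, any MLA form, no DUP or FMOV."""
    return (bool(LDR_Q.search(asm))
            and bool(NEW_MLA.search(asm))
            and not FORBIDDEN.search(asm))


for name, asm in (("tiny", TINY), ("small", SMALL)):
    print(name, "old mla:", passes_old_mla(asm),
          "relaxed:", passes_relaxed(asm))
```

In this sketch the old by-element MLA pattern only matches the
-mcmodel=small loop, while the relaxed set accepts both loops.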

The testcase now passes - committed as obvious.

ChangeLog
2018-11-14  Wilco Dijkstra  <wdijk...@arm.com>  

    testsuite/
        * gcc.target/aarch64/pr62178.c: Relax scan-assembler checks.

--

diff --git a/gcc/testsuite/gcc.target/aarch64/pr62178.c b/gcc/testsuite/gcc.target/aarch64/pr62178.c
index ccb400fc9aee7a419287dc006918de3fb9d7da73..f50567ee61272e90b7b50bf8fa0962eecd6bb468 100644
--- a/gcc/testsuite/gcc.target/aarch64/pr62178.c
+++ b/gcc/testsuite/gcc.target/aarch64/pr62178.c
@@ -16,6 +16,7 @@ void foo (void) {
     }
 }
 
-/* { dg-final { scan-assembler "ldr\\ts\[0-9\]+, \\\[x\[0-9\]+, \[0-9\]+\\\]!" } } */
 /* { dg-final { scan-assembler "ldr\\tq\[0-9\]+, \\\[x\[0-9\]+\\\], \[0-9\]+" } } */
-/* { dg-final { scan-assembler "mla\\tv\[0-9\]+\.4s, v\[0-9\]+\.4s, v\[0-9\]+\.s\\\[0\\\]" } } */
+/* { dg-final { scan-assembler "mla\\tv\[0-9\]+\.4s, v\[0-9\]+\.4s, v\[0-9\]+" } } */
+/* { dg-final { scan-assembler-not { dup } } } */
+/* { dg-final { scan-assembler-not { fmov } } } */
