The testcase for PR62178 has been failing for a while due to the pass conditions being too tight, resulting in failures with -mcmodel=tiny:

	ldr	q2, [x0], 124
	ld1r	{v1.4s}, [x1], 4
	cmp	x0, x2
	mla	v0.4s, v2.4s, v1.4s
	bne	.L7

-mcmodel=small generates the slightly different:

	ldr	q1, [x0], 124
	ldr	s2, [x1, 4]!
	cmp	x0, x2
	mla	v0.4s, v1.4s, v2.s[0]
	bne	.L7

This is due to Combine merging a DUP instruction with either a load or MLA - we
can't force it to prefer one over the other.  However the generated vector loop
is fast either way since it generates MLA and merges the DUP either with a load
or MLA.  So relax the conditions slightly and check we still generate MLA and
there is no DUP or FMOV.

The testcase now passes - committed as obvious.

ChangeLog
2018-11-14  Wilco Dijkstra  <wdijk...@arm.com>

testsuite/
	* gcc.target/aarch64/pr62178.c: Relax scan-assembler checks.

--
diff --git a/gcc/testsuite/gcc.target/aarch64/pr62178.c b/gcc/testsuite/gcc.target/aarch64/pr62178.c
index ccb400fc9aee7a419287dc006918de3fb9d7da73..f50567ee61272e90b7b50bf8fa0962eecd6bb468 100644
--- a/gcc/testsuite/gcc.target/aarch64/pr62178.c
+++ b/gcc/testsuite/gcc.target/aarch64/pr62178.c
@@ -16,6 +16,7 @@ void foo (void) {
     }
 }
 
-/* { dg-final { scan-assembler "ldr\\ts\[0-9\]+, \\\[x\[0-9\]+, \[0-9\]+\\\]!" } } */
 /* { dg-final { scan-assembler "ldr\\tq\[0-9\]+, \\\[x\[0-9\]+\\\], \[0-9\]+" } } */
-/* { dg-final { scan-assembler "mla\\tv\[0-9\]+\.4s, v\[0-9\]+\.4s, v\[0-9\]+\.s\\\[0\\\]" } } */
+/* { dg-final { scan-assembler "mla\\tv\[0-9\]+\.4s, v\[0-9\]+\.4s, v\[0-9\]+" } } */
+/* { dg-final { scan-assembler-not { dup } } } */
+/* { dg-final { scan-assembler-not { fmov } } } */
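
For illustration only (not part of the patch): a rough Python sketch of why the old MLA check rejected the -mcmodel=tiny loop while the relaxed checks accept both sequences quoted above. The patterns are hand-translated from the DejaGnu directives. Two things here are assumptions: that the assembler output separates mnemonic and operands with a tab, and that the dup/fmov checks can be approximated as a whitespace-delimited mnemonic. The helper name passes_relaxed is hypothetical.

```python
import re

# Hand-translated from the dg-final directives (assumption: tab after mnemonic).
OLD_MLA = re.compile(r"mla\tv[0-9]+\.4s, v[0-9]+\.4s, v[0-9]+\.s\[0\]")
NEW_MLA = re.compile(r"mla\tv[0-9]+\.4s, v[0-9]+\.4s, v[0-9]+")
LDR_Q = re.compile(r"ldr\tq[0-9]+, \[x[0-9]+\], [0-9]+")
# Approximation of the scan-assembler-not checks for dup and fmov.
FORBIDDEN = re.compile(r"\s(dup|fmov)\s")

# The two loops quoted in the message, with tabs restored (assumption).
TINY_LOOP = "\n".join([
    "\tldr\tq2, [x0], 124",
    "\tld1r\t{v1.4s}, [x1], 4",
    "\tcmp\tx0, x2",
    "\tmla\tv0.4s, v2.4s, v1.4s",
    "\tbne\t.L7",
])
SMALL_LOOP = "\n".join([
    "\tldr\tq1, [x0], 124",
    "\tldr\ts2, [x1, 4]!",
    "\tcmp\tx0, x2",
    "\tmla\tv0.4s, v1.4s, v2.s[0]",
    "\tbne\t.L7",
])


def passes_relaxed(asm: str) -> bool:
    """Return True if asm satisfies the relaxed set of checks."""
    return (
        NEW_MLA.search(asm) is not None
        and LDR_Q.search(asm) is not None
        and FORBIDDEN.search(asm) is None
    )


if __name__ == "__main__":
    for name, asm in (("tiny", TINY_LOOP), ("small", SMALL_LOOP)):
        old_ok = OLD_MLA.search(asm) is not None
        print(f"{name}: old MLA check {old_ok}, relaxed checks {passes_relaxed(asm)}")
```

The old MLA pattern requires the by-element form (v2.s[0]), which only appears when Combine merges the DUP into the MLA; the relaxed pattern stops after the third register name and so accepts either merge.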