This test has been failing since r15-1619-g3b9b8d6cfdf593, which made
IRA prefer a call-clobbered register over a call-preserved register
for mem1 (the second load).  In this particular case, that just
forces the variable p3 to be allocated to a call-preserved register
instead, leading to an extra predicate move from register p3 to that
register.

However, it was really pot luck that this worked before.  Each argument
is used exactly once, so there isn't an obvious colouring order.
And mem0 and mem1 are passed by indirect reference, so they are not
REG_EQUIV to a stack slot in the way that some memory arguments are.

IIRC, the test was the result of some experimentation, so I think the
best fix is to rework it to be less sensitive to register allocation
(RA) decisions.  This patch does that by enabling scheduling for the
function and using both memory arguments in the same instruction.
This gets rid of the distracting prologue and epilogue code and
restricts the test to the PCS parts.

Tested on aarch64-linux-gnu.  I'll leave a day or so for comments
before pushing.

Richard


gcc/testsuite/
        PR testsuite/116604
        * gcc.target/aarch64/sve/pcs/args_1.c (callee_pred): Enable scheduling
        and use both memory arguments in the same instruction.  Expect no
        prologue and epilogue code.
---
 .../gcc.target/aarch64/sve/pcs/args_1.c       | 31 ++++++++++---------
 1 file changed, 16 insertions(+), 15 deletions(-)

diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_1.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_1.c
index 6deca329599..b020a043523 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_1.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_1.c
@@ -1,31 +1,32 @@
 /* { dg-do compile } */
-/* { dg-options "-O -g" } */
+/* { dg-options "-O -mtune=generic -g" } */
 /* { dg-final { check-function-bodies "**" "" "" { target lp64 } } } */
 
 #include <arm_sve.h>
 
 /*
 ** callee_pred:
-**     addvl   sp, sp, #-1
-**     str     p[0-9]+, \[sp\]
-**     str     p[0-9]+, \[sp, #1, mul vl\]
-**     ldr     (p[0-9]+), \[x0\]
-**     ldr     (p[0-9]+), \[x1\]
-**     brkpa   (p[0-7])\.b, p0/z, p1\.b, p2\.b
-**     brkpb   (p[0-7])\.b, \3/z, p3\.b, \1\.b
-**     brka    p0\.b, \4/z, \2\.b
-**     ldr     p[0-9]+, \[sp\]
-**     ldr     p[0-9]+, \[sp, #1, mul vl\]
-**     addvl   sp, sp, #1
+**     brkpa   (p[0-3])\.b, p0/z, p1\.b, p2\.b
+** (
+**     ldr     (p[0-3]), \[x0\]
+**     ldr     (p[0-3]), \[x1\]
+**     brkpb   (p[0-3])\.b, \1/z, \2\.b, \3\.b
+**     brka    p0\.b, \4/z, p3\.b
+** |
+**     ldr     (p[0-3]), \[x1\]
+**     ldr     (p[0-3]), \[x0\]
+**     brkpb   (p[0-3])\.b, \1/z, \6\.b, \5\.b
+**     brka    p0\.b, \7/z, p3\.b
+** )
 **     ret
 */
-__SVBool_t __attribute__((noipa))
+__SVBool_t __attribute__((noipa, optimize("schedule-insns")))
 callee_pred (__SVBool_t p0, __SVBool_t p1, __SVBool_t p2, __SVBool_t p3,
             __SVBool_t mem0, __SVBool_t mem1)
 {
   p0 = svbrkpa_z (p0, p1, p2);
-  p0 = svbrkpb_z (p0, p3, mem0);
-  return svbrka_z (p0, mem1);
+  p0 = svbrkpb_z (p0, mem0, mem1);
+  return svbrka_z (p0, p3);
 }
 
 /*
-- 
2.25.1