It seems the new expander triggers a latent issue in sched1, causing extraneous spills in a different SAD variant. Given how close we are to the gcc-15 release, disable it for now.
Since we do want to retain and eventually re-enable this capability, manually disable the expander rather than reverting the original patch, which would take away the test case too. Fix the original test case to expect the old codegen idiom (although vneg is no longer emitted, in favor of vrsub). Also add a new test case which flags any future spills in the affected routine.

	PR target/119224

gcc/ChangeLog:

	* config/riscv/autovec.md: Disable abd splitter.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/autovec/pr117722.c: Adjust output insn.
	* gcc.target/riscv/rvv/autovec/pr119224.c: Add new test.

Signed-off-by: Vineet Gupta <vine...@rivosinc.com>
---
 gcc/config/riscv/autovec.md                   |  3 ++-
 .../gcc.target/riscv/rvv/autovec/pr117722.c   |  6 ++---
 .../gcc.target/riscv/rvv/autovec/pr119224.c   | 27 +++++++++++++++++++
 3 files changed, 32 insertions(+), 4 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/pr119224.c

diff --git a/gcc/config/riscv/autovec.md b/gcc/config/riscv/autovec.md
index c7f12f9e36f5..f53ed3a5e3fd 100644
--- a/gcc/config/riscv/autovec.md
+++ b/gcc/config/riscv/autovec.md
@@ -2975,7 +2975,8 @@ (define_expand "uabd<mode>3"
   [(match_operand:V_VLSI 0 "register_operand")
    (match_operand:V_VLSI 1 "register_operand")
    (match_operand:V_VLSI 2 "register_operand")]
-  "TARGET_VECTOR"
+  ;; Disabled until PR119224 is resolved
+  "TARGET_VECTOR && 0"
   {
     rtx max = gen_reg_rtx (<MODE>mode);
     insn_code icode = code_for_pred (UMAX, <MODE>mode);
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr117722.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr117722.c
index f255ceb2cee6..493dab056212 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr117722.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr117722.c
@@ -18,6 +18,6 @@ int pixel_sad_n(unsigned char *pix1, unsigned char *pix2, int n)
   return sum;
 }
 
-/* { dg-final { scan-assembler {vminu\.v} } } */
-/* { dg-final { scan-assembler {vmaxu\.v} } } */
-/* { dg-final { scan-assembler {vsub\.v} } } */
+/* { dg-final { scan-assembler {vrsub\.v} } } */
+/* { dg-final { scan-assembler {vmax\.v} } } */
+/* { dg-final { scan-assembler {vwsubu\.v} } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr119224.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr119224.c
new file mode 100644
index 000000000000..fa3386c345b8
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr119224.c
@@ -0,0 +1,27 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -ffast-math -march=rv64gcv_zvl256b -mabi=lp64d -mtune=generic-ooo -mrvv-vector-bits=zvl" } */
+/* { dg-skip-if "" { *-*-* } { "-O0" "-O1" "-O2" "-Og" "-Os" "-Oz" } } */
+
+/* A core routine of x264 which should not spill for OoO VLS build.  */
+
+inline int abs(int i)
+{
+  return (i < 0 ? -i : i);
+}
+
+int x264_sad_16x16(unsigned char *p1, int st1, unsigned char *p2, int st2)
+{
+  int sum = 0;
+
+  for(int y = 0; y < 16; y++)
+    {
+      for(int x = 0; x < 16; x++)
+        sum += abs (p1[x] - p2[x]);
+      p1 += st1; p2 += st2;
+    }
+
+  return sum;
+}
+
+/* { dg-final { scan-assembler-not {addi\t[a-x0-9]+,sp} } } */
+/* { dg-final { scan-assembler-not {addi\tsp,sp} } } */
-- 
2.43.0
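
For readers unfamiliar with the two codegen idioms the adjusted scan-assembler patterns distinguish, here is a minimal scalar sketch in C. It is not part of the patch: the function names are hypothetical, and the mapping to instructions is only inferred from the patterns above (widen, subtract, then negate and max for the old idiom; unsigned max minus unsigned min for the disabled uabd expander). It shows that both forms yield the same absolute difference for 8-bit unsigned inputs.

```c
#include <stdlib.h>

/* Hypothetical helper: widening form, roughly what the old idiom does.
   Promote to int, subtract, then take the absolute value.  */
static unsigned char abd_widen (unsigned char a, unsigned char b)
{
  return (unsigned char) abs ((int) a - (int) b);
}

/* Hypothetical helper: max/min form, roughly what the disabled expander
   does.  Stays in the element width, so no widening is needed.  */
static unsigned char abd_minmax (unsigned char a, unsigned char b)
{
  unsigned char hi = a > b ? a : b;
  unsigned char lo = a < b ? a : b;
  return (unsigned char) (hi - lo);
}

/* Returns 1 if both forms agree for every pair of 8-bit inputs.  */
int abd_forms_agree (void)
{
  for (int a = 0; a < 256; a++)
    for (int b = 0; b < 256; b++)
      if (abd_widen ((unsigned char) a, (unsigned char) b)
          != abd_minmax ((unsigned char) a, (unsigned char) b))
        return 0;
  return 1;
}
```

Since the two forms are equivalent in value, disabling the expander should change only the instruction selection (and, per the PR, the resulting register pressure), not the computed SAD.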