https://gcc.gnu.org/g:eeb5c6acf7198f057419b4d4bce34b58b12c2287

commit r15-5186-geeb5c6acf7198f057419b4d4bce34b58b12c2287
Author: Xianmiao Qu <cooper...@linux.alibaba.com>
Date:   Tue Nov 12 21:03:24 2024 -0700

    [RISC-V] Fix costing of LO_SUM expressions

    This is a rewrite of a patch originally from Xianmiao Qu.

    Xianmiao noticed that the costs we compute for LO_SUM expressions were
    incorrect.  Essentially we costed based solely on the first input to the
    LO_SUM.  In a LO_SUM, the first input is almost always going to be a REG
    and thus isn't interesting.  The second argument is almost always going to
    be some kind of symbolic operand, which is much more interesting from a
    costing standpoint.

    The right way to fix this is to sum the cost of the two operands.  I've
    verified this produces the same code as Xianmiao Qu's original patch.

    This has been tested on rv32 and rv64 in my tester.  It missed today's
    bootstrap of riscv64 though :(  Naturally I'll wait on the pre-commit CI
    tester to render a verdict, but I don't expect any problems.

    -- From Xianmiao Qu's original submission --

    Currently, the cost of the LO_SUM expression is based on the cost of
    calculating the first subexpression.  When the first subexpression is a
    register, the resulting cost is zero.  It seems unreasonable for a SET
    expression to have a zero cost when its source is a LO_SUM.  Moreover, a
    cost of zero leads the loop invariant pass to compute a benefit of zero
    for moving the expression out of the loop, which prevents the loop
    invariant from being placed outside the loop.

    As an example, consider the following test case:

      long a;
      long b[];
      long *c;
      foo () {
        for (;;)
          *c = b[a];
      }

    When compiling with -march=rv64gc -mabi=lp64d -Os, the following code is
    generated:

            .cfi_startproc
            lui     a5,%hi(c)
            ld      a4,%lo(c)(a5)
            lui     a2,%hi(b)
            lui     a1,%hi(a)
      .L2:
            ld      a5,%lo(a)(a1)
            addi    a3,a2,%lo(b)
            slli    a5,a5,3
            add     a5,a5,a3
            ld      a5,0(a5)
            sd      a5,0(a4)
            j       .L2

    After adjusting the cost of the LO_SUM expression, the addi instruction
    is moved outside the loop:

            .cfi_startproc
            lui     a5,%hi(c)
            ld      a3,%lo(c)(a5)
            lui     a4,%hi(b)
            lui     a2,%hi(a)
            addi    a4,a4,%lo(b)
      .L2:
            ld      a5,%lo(a)(a2)
            slli    a5,a5,3
            add     a5,a5,a4
            ld      a5,0(a5)
            sd      a5,0(a3)
            j       .L2

    gcc/
            * config/riscv/riscv.cc (riscv_rtx_costs): Correct costing of LO_SUM
            expressions.

    Co-authored-by: Jeff Law <j...@ventanamicro.com>

Diff:
---
 gcc/config/riscv/riscv.cc | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 2e1e3a97eff0..6d64e4039577 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -4023,7 +4023,8 @@ riscv_rtx_costs (rtx x, machine_mode mode, int outer_code, int opno ATTRIBUTE_UN
       return false;

     case LO_SUM:
-      *total = set_src_cost (XEXP (x, 0), mode, speed);
+      *total = (set_src_cost (XEXP (x, 0), mode, speed)
+               + set_src_cost (XEXP (x, 1), mode, speed));
       return true;

     case LT:
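
The effect of the change can be sketched with a small standalone model in C. This is not GCC code: the toy_* types and functions, the cost values (0 for a register, 1 for a symbolic operand), and the "hoist only if cost is positive" rule are assumptions made for illustration, and the real loop invariant pass uses a more involved heuristic. The sketch only shows why costing the first operand alone yields zero for a typical LO_SUM, while summing both operands does not.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for RTL codes; not GCC's real types. */
enum toy_code { TOY_REG, TOY_SYMBOL_REF, TOY_LO_SUM };

struct toy_rtx {
  enum toy_code code;
  const struct toy_rtx *op0; /* first operand, NULL for leaves */
  const struct toy_rtx *op1; /* second operand, NULL for leaves */
};

/* Assumed leaf costs: a register is free, a symbolic operand costs 1. */
static int toy_operand_cost(const struct toy_rtx *x) {
  switch (x->code) {
    case TOY_REG:
      return 0;
    case TOY_SYMBOL_REF:
      return 1;
    default:
      return 0;
  }
}

/* Old behaviour in this model: only the first operand is costed. */
int toy_lo_sum_cost_old(const struct toy_rtx *x) {
  assert(x->code == TOY_LO_SUM);
  return toy_operand_cost(x->op0);
}

/* New behaviour in this model: the costs of both operands are summed. */
int toy_lo_sum_cost_new(const struct toy_rtx *x) {
  assert(x->code == TOY_LO_SUM);
  return toy_operand_cost(x->op0) + toy_operand_cost(x->op1);
}

/* Assumed, heavily simplified hoisting rule: a zero-cost expression
   offers no benefit from being moved out of a loop. */
int toy_worth_hoisting(int cost) {
  return cost > 0;
}
```

For a LO_SUM of a register and a symbol, which is the shape behind addi a3,a2,%lo(b) in the example above, the old function returns 0 in this model and the new one returns 1, so only the new cost passes the toy hoisting check.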