Issue: 124932
Summary: [IPRA][RISCV] ra save/restore unexpectedly optimised out of function
Labels: new issue
Assignees: (none)
Reporter: mikhailramalho
This was found as part of an effort to evaluate whether enabling IPRA for RISC-V would be beneficial. The issue can be seen in SPEC (built with a hack that disables the MachineOutliner to work around #119556), but this bug report uses a reduced example from the GCC torture suite.
Tested IPRA on top of `99bd2e3f123baf9a14acc9b31ee0f557288118a6` (28th Jan).
The reduced example below is based on 20090113-3.c in the GCC torture suite:
```
target datalayout = "e-m:e-p:64:64-i64:64-i128:128-n32:64-S128"
target triple = "riscv64-unknown-linux-gnu"
%struct.bitmap_iterator = type { ptr, ptr, i32, i64 }
define i32 @main() nounwind {
entry:
call void @foobar(ptr null)
ret i32 0
}
define internal void @foobar(ptr %live_throughout.0.val) norecurse nounwind {
entry:
%rsi = alloca %struct.bitmap_iterator, align 8
%regno = alloca i32, i32 0, align 4
call void @bmp_iter_set_init(ptr %rsi, ptr %live_throughout.0.val, ptr %regno)
ret void
}
declare void @bmp_iter_set_init(ptr, ptr, ptr)
```
Removing the `norecurse` attribute from `foobar` "fixes" the bad compile.
Reproduce with `llc -enable-ipra < tc.ll`.
In the bad compile, `foobar` ends up with liveins `$x10` only, whereas the good compile has liveins `$x10, $x1`. In addition, the regmask for the call to `foobar` doesn't have `$x1` in it for the bad compile.
The generated assembly for `-enable-ipra` (note that `foobar` calls another function but doesn't save/restore `ra` around it):
```
.attribute 4, 16
.attribute 5, "rv64i2p1"
.file "<stdin>"
.text
.globl main # -- Begin function main
.p2align 2
.type main,@function
main: # @main
# %bb.0: # %entry
addi sp, sp, -16
sd ra, 8(sp) # 8-byte Folded Spill
li a0, 0
call foobar
li a0, 0
ld ra, 8(sp) # 8-byte Folded Reload
addi sp, sp, 16
ret
.Lfunc_end0:
.size main, .Lfunc_end0-main
# -- End function
.p2align 2 # -- Begin function foobar
.type foobar,@function
foobar: # @foobar
# %bb.0: # %entry
addi sp, sp, -48
mv a1, a0
addi a0, sp, 16
addi a2, sp, 12
call bmp_iter_set_init
addi sp, sp, 48
ret
.Lfunc_end1:
.size foobar, .Lfunc_end1-foobar
# -- End function
.section ".note.GNU-stack","",@progbits
```
The generated assembly without `-enable-ipra` (note the save/restore of `ra` in `foobar`):
```
.attribute 4, 16
.attribute 5, "rv64i2p1"
.file "<stdin>"
.text
.globl main # -- Begin function main
.p2align 2
.type main,@function
main: # @main
# %bb.0: # %entry
addi sp, sp, -16
sd ra, 8(sp) # 8-byte Folded Spill
li a0, 0
call foobar
li a0, 0
ld ra, 8(sp) # 8-byte Folded Reload
addi sp, sp, 16
ret
.Lfunc_end0:
.size main, .Lfunc_end0-main
# -- End function
.p2align 2 # -- Begin function foobar
.type foobar,@function
foobar: # @foobar
# %bb.0: # %entry
addi sp, sp, -48
sd ra, 40(sp) # 8-byte Folded Spill
mv a1, a0
addi a0, sp, 8
addi a2, sp, 4
call bmp_iter_set_init
ld ra, 40(sp) # 8-byte Folded Reload
addi sp, sp, 48
ret
.Lfunc_end1:
.size foobar, .Lfunc_end1-foobar
# -- End function
.section ".note.GNU-stack","",@progbits
```
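The difference between the two listings can also be checked mechanically. The following is a minimal Python sketch, not part of LLVM or of the original report: the helper name `functions_clobbering_ra` and the trimmed-down `BAD` listing are hypothetical, and the check assumes that calls appear as the `call` pseudo-instruction and that `ra` is spilled and reloaded with `sd`/`ld` (or `sw`/`lw` on RV32), as in the output above. It flags any function that contains a `call` but no matching spill and reload of `ra`.

```python
import re

# A function label starts with a letter or underscore; local labels such as
# .Lfunc_end0 start with '.' and are deliberately not matched.
LABEL_RE = re.compile(r"^([A-Za-z_][\w.$]*):")


def functions_clobbering_ra(asm: str) -> list[str]:
    """Return functions that execute `call` without spilling and reloading ra."""
    stats = {}
    current = None
    for raw in asm.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if not line:
            continue
        match = LABEL_RE.match(line)
        if match:
            current = match.group(1)
            stats[current] = {"call": False, "save": False, "restore": False}
            continue
        if current is None or line.startswith("."):
            continue  # directive or local label, not an instruction
        tokens = line.replace(",", " ").split()
        if tokens[0] == "call":
            stats[current]["call"] = True
        elif len(tokens) > 1 and tokens[1] == "ra":
            if tokens[0] in ("sd", "sw"):
                stats[current]["save"] = True
            elif tokens[0] in ("ld", "lw"):
                stats[current]["restore"] = True
    return [
        name
        for name, s in stats.items()
        if s["call"] and not (s["save"] and s["restore"])
    ]


# Trimmed-down version of the -enable-ipra output shown earlier.
BAD = """
main:
    addi sp, sp, -16
    sd ra, 8(sp)
    call foobar
    ld ra, 8(sp)
    addi sp, sp, 16
    ret
foobar:
    addi sp, sp, -48
    call bmp_iter_set_init
    addi sp, sp, 48
    ret
"""

print(functions_clobbering_ra(BAD))  # ['foobar']
```

This is only a textual heuristic: it does not model tail calls, indirect calls via `jalr`, or save/restore through libcalls, so it is meant for quickly scanning `llc` output rather than as a proof of correctness.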
A similar problem is found in SPEC's `557.xz_r`: in the `match` function, no registers are callee-saved, including the return address, even though the return address is used internally.
Filing this as a bug at this point because I know @topperc was looking in this sort of area, though I am happy to dig further if nothing immediately jumps to mind.