| Field | Value |
| --- | --- |
| Issue | 129750 |
| Summary | Missed optimization: eager spills mess up hot path |
| Labels | new issue |
| Assignees | |
| Reporter | travisdowns |

Consider the following function:
```
[[noreturn]] [[gnu::cold]]
void cold_function(const int& x, const int& y);

int hot_function(int x, int y) {
    if (x < y) [[unlikely]] {
        cold_function(x, y);
    }
    return x + y;
}
```
With clang++ at -O3, this generates the following x86-64 code:
```
hot_function(int, int):
        push    rax
        mov     dword ptr [rsp + 4], edi
        mov     dword ptr [rsp], esi
        cmp     edi, esi
        jl      .LBB0_2
        add     esi, edi
        mov     eax, esi
        pop     rcx
        ret
.LBB0_2:
        lea     rdi, [rsp + 4]
        mov     rsi, rsp
        call    cold_function(int const&, int const&)@PLT
```
However, both the spilling of the in-register arguments and the alignment of the stack frame (`push rax`) could be deferred to the cold branch instead:
```
hot_function(int, int):
        cmp     edi, esi
        jl      .LBB0_2
        add     esi, edi
        mov     eax, esi
        ret
.LBB0_2:
        push    rax
        mov     dword ptr [rsp + 4], edi
        mov     dword ptr [rsp], esi
        lea     rdi, [rsp + 4]
        mov     rsi, rsp
        call    cold_function(int const&, int const&)@PLT
```
This cuts the hot path almost in half (from 9 instructions to 5) and avoids a store-forwarding stall: in the current output, `pop rcx` reads the qword at `[rsp]`, which was written immediately beforehand as two dword halves during the spill. That causes an expensive stall (roughly 10 cycles) on all modern big cores I'm aware of.
https://godbolt.org/z/nTvnj4r1r
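
Until the compiler does this itself, a possible source-level workaround is to route the cold call through a by-value helper, so that the stack slots whose addresses are taken belong to the helper's frame rather than to `hot_function`. This is only a sketch and has not been checked against any particular clang version: `cold_trampoline` is a hypothetical name that does not appear in the issue, and the body of `cold_function` is a stand-in (the issue only declares it) so that the example links and runs.

```cpp
#include <stdexcept>
#include <string>

// Stand-in definition (assumption): the issue only declares cold_function.
// Throwing is one way to satisfy [[noreturn]] while staying testable.
[[noreturn]] [[gnu::cold]]
void cold_function(const int& x, const int& y) {
    throw std::runtime_error("cold path: " + std::to_string(x) + " < " +
                             std::to_string(y));
}

// Hypothetical helper, not part of the issue. It takes x and y by value, so
// the references passed on to cold_function refer to this function's own
// parameters. noinline keeps it from being merged back into the caller.
[[noreturn]] [[gnu::cold]] [[gnu::noinline]]
static void cold_trampoline(int x, int y) {
    cold_function(x, y);
}

int hot_function(int x, int y) {
    if (x < y) [[unlikely]] {
        // No address of x or y escapes from hot_function any more.
        cold_trampoline(x, y);
    }
    return x + y;
}
```

With this shape `hot_function` no longer has a reason to store `x` and `y` to memory. Whether the stack adjustment for the call also moves out of the hot path is still up to the compiler, so the generated code should be inspected rather than assumed.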
_______________________________________________
llvm-bugs mailing list
llvm-bugs@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-bugs