Issue 129750
Summary Missed optimization: eager spills mess up hot path
Labels new issue
Assignees
Reporter travisdowns
Consider the following function:

```
[[noreturn]] [[gnu::cold]]
void cold_function(const int& x, const int& y);
int hot_function(int x, int y) {
    if (x < y) [[unlikely]] {
        cold_function(x, y);
    }
    return x + y;
}
```

In clang++ this generates the following code at -O3:

```
hot_function(int, int):
  push rax
  mov dword ptr [rsp + 4], edi
  mov dword ptr [rsp], esi
  cmp edi, esi
  jl .LBB0_2
  add esi, edi
  mov eax, esi
  pop rcx
  ret
.LBB0_2:
  lea rdi, [rsp + 4]
  mov rsi, rsp
  call cold_function(int const&, int const&)@PLT
```

However, both the spilling of the in-register arguments and the stack-frame alignment (`push rax`) could be deferred to the cold branch instead:

```
hot_function(int, int):
  cmp edi, esi
  jl .LBB0_2
  add esi, edi
  mov eax, esi
  ret
.LBB0_2:
  push rax
  mov dword ptr [rsp + 4], edi
  mov dword ptr [rsp], esi
  lea rdi, [rsp + 4]
  mov rsi, rsp
  call cold_function(int const&, int const&)@PLT
```

This would cut the hot path almost in half and avoid a store-forwarding stall: in the current output, `pop rcx` reads the qword at `[rsp]`, which was written immediately beforehand as two dword halves during the spill. That causes an expensive stall (roughly 10 cycles) on all modern big cores I'm aware of.

https://godbolt.org/z/nTvnj4r1r
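
Until the compiler defers the spill itself, a possible source-level workaround is to take the addresses inside a separate out-of-line cold function, so that `hot_function` never needs `x` and `y` in memory. The sketch below is untested against the generated assembly: `cold_function_by_value` is a hypothetical helper that is not part of the original report, and the body of `cold_function` is a stand-in added only so the example links (the report only declares it).

```cpp
#include <cstdlib>

// Stand-in definition: the report only declares cold_function.
// Aborting here is an assumption made so the example is self-contained.
[[noreturn]] [[gnu::cold]]
void cold_function(const int& x, const int& y) {
    (void)x;
    (void)y;
    std::abort();
}

// Hypothetical helper: receives the arguments by value, so taking their
// addresses (and spilling them to the stack) happens in this out-of-line
// function rather than in the caller.
[[noreturn]] [[gnu::cold]] [[gnu::noinline]]
static void cold_function_by_value(int x, int y) {
    cold_function(x, y);
}

int hot_function(int x, int y) {
    if (x < y) [[unlikely]] {
        // No address of x or y is taken in this function.
        cold_function_by_value(x, y);
    }
    return x + y;
}
```

The intent is that the hot path reduces to the compare, branch, add and return, but whether the frame setup (`push rax`) also moves out of the entry block depends on the compiler version, so the output should be checked before relying on this.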