| Field | Value |
| --- | --- |
| Issue | 172172 |
| Summary | [missed-opt] [x86_64] Suboptimal movzx after inline assembly returning a byte |
| Labels | new issue |
| Assignees | |
| Reporter | purplesyringa |

Found while trying to implement a fast black-box primitive.
[Godbolt](https://godbolt.org/z/K3j7PKPK8)
```cpp
char f(char x) {
    asm("nop" : "+r"(x));
    return x * 3;
}

short g(short x) {
    asm("nop" : "+r"(x));
    return x * 3;
}

char h(char x) {
    return x * 3;
}
```
```asm
f(char):
        nop
        movzx   eax, dil
        lea     eax, [rax + 2*rax]
        ret
g(short):
        nop
        lea     eax, [rdi + 2*rdi]
        ret
h(char):
        lea     eax, [rdi + 2*rdi]
        ret
```
The `movzx eax, dil` in `f` is unnecessary and can be omitted (GCC does omit it). I initially thought it was some kind of dependency-breaking optimization, but I'm no longer sure. For one thing, it isn't done for 16-bit values (`g`), which would seemingly suffer from the same issue. It also isn't done in `h`, where the input to `lea` is the function argument, whose upper bits are undefined under the psABI. If this is an attempt at an optimization, it looks more like a pessimization when it follows inline assembly, which the author has presumably already made as efficient as possible, and there is no way to opt out of the zero-extension.
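
To illustrate why the zero-extension is redundant for correctness: the result is truncated back to `char`, and the low 8 bits of a product depend only on the low 8 bits of its operands. Below is a minimal sketch of that argument. The helper names `low_byte_times_three` and `check_upper_bits_irrelevant` are hypothetical and not part of the report, and the register is modelled as a plain 32-bit value with arbitrary upper bits.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of `lea eax, [rdi + 2*rdi]` followed by reading `al`:
// multiply the whole 32-bit register value by 3, then keep only the low byte.
std::uint8_t low_byte_times_three(std::uint32_t reg) {
    return static_cast<std::uint8_t>(reg * 3u);
}

// Checks that garbage in bits 8..31 never changes the low byte of the result,
// for every possible low byte and a handful of upper-bit patterns.
void check_upper_bits_irrelevant() {
    const std::uint32_t garbage[] = {0x00000000u, 0xFFFFFF00u, 0xABCD1200u, 0x80000100u};
    for (std::uint32_t low = 0; low < 256; ++low) {
        for (std::uint32_t high : garbage) {
            assert(low_byte_times_three(high | low) == low_byte_times_three(low));
        }
    }
}
```

The same reasoning applies to `g` with 16 bits, which is consistent with no extension being emitted there.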