| Issue |
174912
|
| Summary |
Redundant `mov` instructions with `mulx` on `x86-64-v4`
|
| Labels |
new issue
|
| Assignees |
|
| Reporter |
spencer3035
|
There seems to be a missed optimization related to the `mulx` instruction for `mcpu=x86-64-v4`.
The following Rust code is similar to the [`carrying_mul_add`](https://doc.rust-lang.org/std/primitive.u64.html#method.carrying_mul_add) intrinsic from the Rust compiler.
```rust
/// Calculates `lhs * rhs + add + carry` and returns the result as (low 64 bits, high 64 bits).
pub fn carrying_mul_add(add: u64, carry: u64, lhs: u64, rhs: u64) -> (u64, u64) {
    // Widen to 128 bits so the product and both additions cannot overflow.
    let sum = (lhs as u128) * (rhs as u128) + add as u128 + carry as u128;
    (sum as u64, (sum >> 64) as u64)
}
```
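As a quick sanity check on why two 64-bit result words are enough: with every input at `u64::MAX`, the sum is `(2^64 - 1)^2 + 2 * (2^64 - 1) = 2^128 - 1`, which still fits in 128 bits. The sketch below repeats the function so it stands alone; `worst_case` is a hypothetical helper name used only for illustration.

```rust
/// Same computation as in the report, repeated so this sketch is self-contained.
pub fn carrying_mul_add(add: u64, carry: u64, lhs: u64, rhs: u64) -> (u64, u64) {
    let sum = (lhs as u128) * (rhs as u128) + add as u128 + carry as u128;
    (sum as u64, (sum >> 64) as u64)
}

/// Hypothetical helper: evaluates the function with every input at `u64::MAX`.
/// (2^64 - 1)^2 + 2 * (2^64 - 1) = 2^128 - 1, so both result words are all ones.
pub fn worst_case() -> (u64, u64) {
    carrying_mul_add(u64::MAX, u64::MAX, u64::MAX, u64::MAX)
}
```

Because the worst case lands exactly on `2^128 - 1`, the high word can never carry out, whichever way the additions are ordered.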
Current LLVM assembly output:
```asm
carrying_mul_add_u64:
mov rax, rdx
mov rdx, rcx
mulx rcx, rax, rax
xor edx, edx
add rdi, rsi
setb dl
add rax, rdi
adc rdx, rcx
ret
```
The first two `mov` instructions are redundant, and the rest will probably simplify once they are eliminated. A more optimized output (5 instructions instead of 8, not counting `ret`):
```asm
; Computes rcx * rdx + rsi + rdi
; Result is returned as
; RDX high 64 bits
; RAX low 64 bits
carrying_mul_add_u64:
mulx rdx, rax, rcx
add rax, rdi
adc rdx, 0
add rax, rsi
adc rdx, 0
ret
```
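To check that the shorter sequence computes the same value, its steps can be modelled in plain Rust. This is only a sketch of the arithmetic, not of what LLVM would emit; `reference` and `mul_then_two_adds` are hypothetical names, and the register comments assume the System V argument order (`rdi` = `add`, `rsi` = `carry`, `rdx` = `lhs`, `rcx` = `rhs`).

```rust
/// Reference result: widen everything to u128, as in the original function.
pub fn reference(add: u64, carry: u64, lhs: u64, rhs: u64) -> (u64, u64) {
    let sum = (lhs as u128) * (rhs as u128) + add as u128 + carry as u128;
    (sum as u64, (sum >> 64) as u64)
}

/// Hypothetical model of the proposed sequence:
/// one widening multiply followed by two add/adc pairs.
pub fn mul_then_two_adds(add: u64, carry: u64, lhs: u64, rhs: u64) -> (u64, u64) {
    // mulx rdx, rax, rcx  ->  hi:lo = lhs * rhs
    let prod = (lhs as u128) * (rhs as u128);
    let (lo, hi) = (prod as u64, (prod >> 64) as u64);

    // add rax, rdi ; adc rdx, 0
    let (lo, c1) = lo.overflowing_add(add);
    let hi = hi.wrapping_add(c1 as u64);

    // add rax, rsi ; adc rdx, 0
    let (lo, c2) = lo.overflowing_add(carry);
    let hi = hi.wrapping_add(c2 as u64);

    (lo, hi)
}
```

The two functions should agree for every input. The `wrapping_add` calls on the high word never actually wrap, because the full result is at most `2^128 - 1`.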
Example in [godbolt](https://rust.godbolt.org/z/Yx37TEbrf).
I assume this has something to do with `rdx` being used implicitly by `mulx` instead of explicitly.