| Field | Value |
|---|---|
| Issue | 174912 |
| Summary | Redundant `mov` instructions with `mulx` on `x86-64-v4` |
| Labels | new issue |
| Assignees | |
| Reporter | spencer3035 |

There seems to be a missed optimization involving the `mulx` instruction when targeting `-mcpu=x86-64-v4`.

The following Rust code is similar to [`carrying_mul_add`](https://doc.rust-lang.org/std/primitive.u64.html#method.carrying_mul_add) from the Rust standard library.

```rust
/// Calculates `lhs * rhs + add + carry` and returns the result as (low 64 bits, high 64 bits).
pub fn carrying_mul_add(add: u64, carry: u64, lhs: u64, rhs: u64) -> (u64, u64) {
    // The sum cannot overflow u128: (2^64 - 1)^2 + 2 * (2^64 - 1) = 2^128 - 1.
    let sum = (lhs as u128) * (rhs as u128) + add as u128 + carry as u128;
    (sum as u64, (sum >> 64) as u64)
}
}
```
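
As a quick sanity check of the arithmetic, the sketch below repeats the function so the block is self-contained and adds a hypothetical helper, `max_case` (not part of the original report), that evaluates the largest possible input. Because the full result always fits in 128 bits, the high half can never overflow, which is why a plain `adc rdx, 0` is safe in the hand-written sequence further down.

```rust
/// Same computation as in the report, repeated so this block stands alone.
pub fn carrying_mul_add(add: u64, carry: u64, lhs: u64, rhs: u64) -> (u64, u64) {
    let sum = (lhs as u128) * (rhs as u128) + add as u128 + carry as u128;
    (sum as u64, (sum >> 64) as u64)
}

/// Hypothetical helper (illustration only): the result for the largest inputs.
/// (2^64 - 1)^2 + 2 * (2^64 - 1) = 2^128 - 1, so both halves are all ones.
pub fn max_case() -> (u64, u64) {
    let m = u64::MAX;
    carrying_mul_add(m, m, m, m)
}
```

For small inputs the result is easy to follow by hand: `carrying_mul_add(1, 2, 3, 4)` computes `3 * 4 + 1 + 2 = 15`, so the low half is 15 and the high half is 0.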

Current LLVM assembly output: 

```asm
carrying_mul_add_u64:
    mov     rax, rdx
    mov     rdx, rcx
    mulx    rcx, rax, rax
    xor     edx, edx
    add     rdi, rsi
    setb    dl
    add     rax, rdi
    adc     rdx, rcx
    ret
```
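
To make the data flow of this listing easier to follow, here is a rough Rust model of what each group of instructions computes. It assumes the register assignment implied by the listing (`rdi` = `add`, `rsi` = `carry`, `rdx` = `lhs`, `rcx` = `rhs`); the names `reference` and `current_model` are hypothetical and only serve the illustration.

```rust
/// The u128 formulation from the report, used as the reference result.
pub fn reference(add: u64, carry: u64, lhs: u64, rhs: u64) -> (u64, u64) {
    let sum = (lhs as u128) * (rhs as u128) + add as u128 + carry as u128;
    (sum as u64, (sum >> 64) as u64)
}

/// Hypothetical model of the currently generated sequence.
pub fn current_model(add: u64, carry: u64, lhs: u64, rhs: u64) -> (u64, u64) {
    // mov rax, rdx / mov rdx, rcx / mulx rcx, rax, rax
    // Full 64x64 -> 128 bit product: high half in rcx, low half in rax.
    let product = (lhs as u128) * (rhs as u128);
    let (lo, hi) = (product as u64, (product >> 64) as u64);
    // xor edx, edx / add rdi, rsi / setb dl
    // Sum the two addends first and keep the carry-out in rdx.
    let (addends, c1) = add.overflowing_add(carry);
    // add rax, rdi
    let (lo, c2) = lo.overflowing_add(addends);
    // adc rdx, rcx  (rdx = c1 + hi + c2)
    let hi = hi.wrapping_add(c1 as u64).wrapping_add(c2 as u64);
    (lo, hi)
}
```

In this model the two leading `mov` instructions only shuffle the multiplicands between registers before `mulx`; they do not contribute to the arithmetic.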

The first two `mov` instructions are redundant, and the rest can probably be simplified once they are eliminated. A more optimized output (5 instructions instead of 8, not counting `ret`):

```asm
; Computes rcx * rdx + rsi + rdi
; Result is returned as:
;   rdx: high 64 bits
;   rax: low 64 bits
carrying_mul_add_u64:
    mulx    rdx, rax, rcx
    add     rax, rdi
    adc     rdx, 0
    add     rax, rsi
    adc     rdx, 0
    ret
```
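
The hand-written sequence can be checked against the original function with a small Rust model. This is only a sketch: `reference` and `proposed_model` are hypothetical names, and the mapping of arguments to registers (`rdi` = `add`, `rsi` = `carry`, `rdx` = `lhs`, `rcx` = `rhs`) is inferred from the listing above rather than stated in it.

```rust
/// The u128 formulation from the report, used as the reference result.
pub fn reference(add: u64, carry: u64, lhs: u64, rhs: u64) -> (u64, u64) {
    let sum = (lhs as u128) * (rhs as u128) + add as u128 + carry as u128;
    (sum as u64, (sum >> 64) as u64)
}

/// Hypothetical model of the proposed five-instruction sequence.
pub fn proposed_model(add: u64, carry: u64, lhs: u64, rhs: u64) -> (u64, u64) {
    // mulx rdx, rax, rcx  (rdx is the implicit source; flags are untouched)
    let product = (lhs as u128) * (rhs as u128);
    let (lo, hi) = (product as u64, (product >> 64) as u64);
    // add rax, rdi
    let (lo, c1) = lo.overflowing_add(add);
    // adc rdx, 0
    let hi = hi.wrapping_add(c1 as u64);
    // add rax, rsi
    let (lo, c2) = lo.overflowing_add(carry);
    // adc rdx, 0
    let hi = hi.wrapping_add(c2 as u64);
    (lo, hi)
}
```

The `wrapping_add` calls on the high half never actually wrap, since the full result is at most 2^128 - 1; they are used here only to mirror the behavior of `adc`.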

Example in [godbolt](https://rust.godbolt.org/z/Yx37TEbrf).

I assume this is related to `mulx` reading `rdx` as an implicit source operand rather than taking it as an explicit one.
