Issue 130376
Summary Missed use of movk on aarch64 when performing disjoint OR with shifted 16 bit immediate
Labels new issue
Assignees
Reporter neildhar
I've observed cases where clang emits a suboptimal `mov` + `orr` sequence when a single `movk` would suffice.

The most straightforward case is as follows:
```
uint64_t foo(uint32_t* raw){
    // load
    uint64_t res = *raw;
    // movk
    res |= (uint64_t)0xfffd << 48;
    // ret
    return res;
}
```

For which clang emits:
```
        ldr     w8, [x0]
        mov     x9, #-844424930131968
        orr     x0, x8, x9
        ret
```

The `mov` + `orr` could instead be a single `movk`.
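To make the reasoning explicit: the load zero-extends a 32-bit value, so bits 48..63 of `res` are known to be zero and the OR is disjoint. Under that condition, ORing in `0xfffd << 48` gives the same result as overwriting that 16-bit field, which is what `movk` does. The following is a minimal sketch of that equivalence; `movk_model`, `foo_or`, `foo_movk`, and `check_foo` are hypothetical helper names introduced for illustration, not part of LLVM.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical C model of AArch64 `movk xd, #imm16, lsl #shift`:
   overwrite one 16-bit field of `reg` and keep the other 48 bits.
   `shift` is assumed to be 0, 16, 32, or 48. */
static uint64_t movk_model(uint64_t reg, uint16_t imm16, unsigned shift) {
    uint64_t field = (uint64_t)0xffff << shift;
    return (reg & ~field) | ((uint64_t)imm16 << shift);
}

/* What the source asks for: OR the constant into the zero-extended load. */
static uint64_t foo_or(uint32_t raw) {
    uint64_t res = raw;
    return res | ((uint64_t)0xfffd << 48);
}

/* What a single movk would compute on the same zero-extended value. */
static uint64_t foo_movk(uint32_t raw) {
    return movk_model(raw, 0xfffd, 48);
}

/* Spot-check a few inputs, including the extremes of the 32-bit range. */
static void check_foo(void) {
    const uint32_t samples[] = {0u, 1u, 0x12345678u, 0x80000000u, 0xffffffffu};
    for (unsigned i = 0; i < sizeof samples / sizeof samples[0]; ++i)
        assert(foo_or(samples[i]) == foo_movk(samples[i]));
}
```

Calling `check_foo()` from a test driver exercises the comparison. The two functions agree only because the top 16 bits of the input are zero; for an arbitrary 64-bit input the OR and the field overwrite would differ.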

A related (but as far as I can tell, distinct) case:
```
uint64_t bar(uint32_t* raw){
    // load
    uint64_t res = *raw;
    // movk
    res |= (uint64_t)0xfffd << 48;
    // asr
    res = (int64_t)res >> 3;
    // ret
    return res;
}
```

For which clang emits:
```
        ldr     w8, [x0]
        mov     x9, #175921860444160
        movk    x9, #65535, lsl #48
        orr     x0, x9, x8, lsr #3
        ret
```

Based on a quick scan of the resulting IR, the additional consideration here is that the arithmetic shift is turned into a logical shift (and the constant is widened correspondingly, with the vacated top bits filled with ones) before getting to ISel.
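For reference, the constant in the emitted code is `0xffffa00000000000`, i.e. `0xfffd << 48` arithmetically shifted right by 3. The sketch below checks that the source form and the emitted form agree on a few inputs. The function names are hypothetical, and the code assumes that a signed right shift is arithmetic and that the unsigned-to-signed conversion wraps, as is the case with clang and gcc.

```c
#include <assert.h>
#include <stdint.h>

/* bar() as written in the source: OR first, then arithmetic shift by 3.
   Assumes `>>` on a negative int64_t is an arithmetic shift. */
static uint64_t bar_source(uint32_t raw) {
    uint64_t res = raw;
    res |= (uint64_t)0xfffd << 48;
    return (uint64_t)((int64_t)res >> 3);
}

/* The emitted form: logical shift of the loaded value, then OR with the
   pre-shifted constant built by `mov` + `movk` (0xffffa00000000000). */
static uint64_t bar_emitted(uint32_t raw) {
    const uint64_t k = ((uint64_t)0xffff << 48) | UINT64_C(175921860444160);
    return ((uint64_t)raw >> 3) | k;
}

/* Spot-check a few inputs, including the extremes of the 32-bit range. */
static void check_bar(void) {
    const uint32_t samples[] = {0u, 7u, 8u, 0x12345678u, 0xffffffffu};
    for (unsigned i = 0; i < sizeof samples / sizeof samples[0]; ++i)
        assert(bar_source(samples[i]) == bar_emitted(samples[i]));
}
```

After the rewrite the constant spans two 16-bit fields (`0xa000` at bits 32..47 and `0xffff` at bits 48..63), so a single `movk` no longer covers it unless the OR is performed before the shift.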

https://godbolt.org/z/h95E3zhhd