|   |   |
|---|---|
|Issue|137086|
|Summary|AArch64: uses store from GP reg where vectorized reg would be better|
|Labels|backend:AArch64|
|Assignees||
|Reporter|MatzeB|
Repro:
```
#include <stdint.h>
#include <arm_neon.h>
#include <arm_sve.h>
void foobar(uint16_t *p, uint8_t *p2, uint64x2_t vec, int x) {
vec += (uint64x2_t){3, 4};
if (x) {
asm volatile(""); // side effect, so a conditional move cannot be used.
vec |= (uint64x2_t){0x12, 0x34};
*p2 = vec[1];
*p = vec[0];
} else {
// Commenting out the next two statements improves the code in the "then" branch!
vec ^= (uint64x2_t){0x34, 0x12};
*p = vec[0];
}
}
// clang++ -target aarch64-redhat-linux-gnu -O3 -S -o - test.cpp -march=armv9-a+sve2+fp16
```
The compiler ends up compiling the "then" branch as:
```
...
umov w8, v0.h[0]
st1 { v0.b }[8], [x1]
strh w8, [x0]
```
while both values could be stored directly from the SIMD register, avoiding the cross-register-file `umov` transfer through the `w8` GP register:
```
st1 { v0.b }[8], [x1]
str h0, [x0]
```
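For reference, a minimal sketch expressing the two stores as explicit lane stores with NEON intrinsics (the helper name `store_lanes` is made up for illustration, and the exact instruction selection is an assumption; the lane-0 halfword store may come out as either `str h0` or the equivalent `st1 { v0.h }[0]`):
```
#include <arm_neon.h>
#include <stdint.h>

// Both stores read lanes of the same vector register, so no round trip
// through a GP register is required.
void store_lanes(uint16_t *p, uint8_t *p2, uint64x2_t vec) {
  // Byte 8 of the 128-bit register (the low byte of 64-bit lane 1 on
  // little-endian AArch64) corresponds to "st1 { v0.b }[8], [x1]".
  vst1q_lane_u8(p2, vreinterpretq_u8_u64(vec), 8);
  // Halfword 0 (the low 16 bits of 64-bit lane 0); a lane-0 store can be
  // encoded as the plain FP store "str h0, [x0]".
  vst1q_lane_u16(p, vreinterpretq_u16_u64(vec), 0);
}
```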