Issue 137086
Summary AArch64: uses store from GP reg where a store from the vector reg would be better
Labels backend:AArch64
Assignees
Reporter MatzeB
Repro:
```
#include <string.h>
#include <arm_neon.h>
#include <arm_sve.h>

void foobar(uint16_t *p, uint8_t *p2, uint64x2_t vec, int x) {
  vec += (uint64x2_t){3, 4};
  if (x) {
    asm volatile(""); // side-effect so conditional move cannot be used.
    vec |= (uint64x2_t){0x12, 0x34};
    *p2 = vec[1];
    *p = vec[0];
  } else {
    // Commenting out the next two statements will improve the code in the "then" branch!
    vec ^= (uint64x2_t){0x34, 0x12};
    *p = vec[0];
  }
}

// clang++ -target aarch64-redhat-linux-gnu -O3 -S -o - test.cpp -march=armv9-a+sve2+fp16
```

The compiler ends up compiling the "then" branch as:
```
...
       umov    w8, v0.h[0]
       st1 { v0.b }[8], [x1]
       strh    w8, [x0]
```
while this could be done without going through the w8 GP register:
```
       st1     { v0.b }[8], [x1]
       str     h0, [x0]
```