Issue 137040
Summary AArch64 SVE: Multiple ptrue instructions not merged
Labels backend:AArch64
Assignees
Reporter MatzeB
SVE code that operates on vectors with different element types tends to produce multiple `ptrue` instructions.

For example, I got this from an internal user:
```
#include <arm_sve.h>
svuint32_t getSveVec(const uint32_t* inputPtr) {
    svuint64_t vec = svld1uw_u64(svptrue_b64(), inputPtr);
    svuint32_t clzV1 = svclz_u32_x(svptrue_b32(), svreinterpret_u32_u64(vec));
    return clzV1;
}
```
This produces something like:
```
$ clang++ -target aarch64-redhat-linux-gnu -march=armv9-a+sve2+fp16 -O3 -S -o - dup2.cpp
...
        ptrue   p0.d
        ld1w    { z0.d }, p0/z, [x0]
        ptrue   p0.s
        clz     z0.s, p0/m, z0.s
        ret
```
My understanding is that a single `ptrue p0.b` would suffice for both instructions here, and in fact GCC produces that code.
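To spell out the reasoning, here is a rough, portable model of the predicate semantics as I understand them. It is not LLVM or ACLE code. It assumes a 128-bit vector length, so a predicate register has 16 bits (one per vector byte), and it assumes an instruction on N-byte elements only looks at the lowest predicate bit of each N-byte group. The names `kPredBits`, `ptrue` and `allLanesActive` are made up for this sketch.

```cpp
#include <bitset>
#include <cstddef>

// Hypothetical model: one predicate bit per vector byte, VL assumed to be 128 bits.
constexpr std::size_t kPredBits = 16;

// Models `ptrue pN.<T>` for an element size of `esize` bytes:
// sets the lowest bit of every esize-byte group and leaves the rest clear.
std::bitset<kPredBits> ptrue(std::size_t esize) {
    std::bitset<kPredBits> p;
    for (std::size_t i = 0; i < kPredBits; i += esize) {
        p.set(i);
    }
    return p;
}

// Models how an instruction on esize-byte elements reads a predicate:
// only the lowest bit of each element's group is consulted.
bool allLanesActive(const std::bitset<kPredBits>& p, std::size_t esize) {
    for (std::size_t i = 0; i < kPredBits; i += esize) {
        if (!p.test(i)) {
            return false;
        }
    }
    return true;
}
```

Under this model, `ptrue(1)` (the `.b` form) is all-active for both the `.d` load and the `.s` `clz`, while `ptrue(8)` (the `.d` form) is not all-active for `.s` elements, which would explain why the second `ptrue` is emitted when the first one is `p0.d`.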