| Issue | 129899 |
| --- | --- |
| Summary | s390x: vector cast using shuffle does not optimize well |
| Labels | new issue |
| Assignees | |
| Reporter | folkertdev |

https://godbolt.org/z/6sodYY3fW
This LLVM IR
```llvm
define range(i64 -128, 128) <2 x i64> @manual_vec_extend_s64(<16 x i8> %a) unnamed_addr {
start:
%0 = shufflevector <16 x i8> %a, <16 x i8> poison, <2 x i32> <i32 7, i32 15>
%1 = sext <2 x i8> %0 to <2 x i64>
ret <2 x i64> %1
}
```
does not optimize to a single instruction.
The C code uses a slightly different (more manual) lowering to LLVM IR:
https://godbolt.org/z/aencTa3nq
```llvm
define dso_local <2 x i64> @a(<16 x i8> noundef %a) local_unnamed_addr {
entry:
%vecext.i = extractelement <16 x i8> %a, i64 7
%conv.i = sext i8 %vecext.i to i64
%vecinit.i = insertelement <2 x i64> poison, i64 %conv.i, i64 0
%vecext1.i = extractelement <16 x i8> %a, i64 15
%conv2.i = sext i8 %vecext1.i to i64
%vecinit3.i = insertelement <2 x i64> %vecinit.i, i64 %conv2.i, i64 1
ret <2 x i64> %vecinit3.i
}
```
but Rust can't replicate that at the moment. That's a bug we'll fix on the Rust side, but I still think the `shufflevector` version should also optimize to a single instruction. The shuffle form also seems to be smaller in LLVM IR, so it might be the more efficient lowering for clang to emit as well?
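
For reference, here is a minimal C sketch of what both IR functions compute: take byte lanes 7 and 15 of the input and sign-extend each to 64 bits. It uses the generic GCC/Clang vector extensions rather than the s390x intrinsic behind the godbolt link, whose C source is not reproduced here. The type names `v16i8` and `v2i64` and the function name `extend_lanes_7_15` are hypothetical, chosen only for this illustration.

```c
#include <stdint.h>

/* GCC/Clang generic vector types; hypothetical names for this sketch. */
typedef int8_t  v16i8 __attribute__((vector_size(16)));
typedef int64_t v2i64 __attribute__((vector_size(16)));

/* Pick lanes 7 and 15 of `a` and sign-extend each from i8 to i64.
 * Written element by element, which mirrors the
 * extractelement / sext / insertelement sequence in the clang IR. */
static inline v2i64 extend_lanes_7_15(v16i8 a)
{
    v2i64 r = { (int64_t)a[7], (int64_t)a[15] };
    return r;
}
```

With an input whose lane 7 is `-1` and lane 15 is `-128`, the result lanes are `-1` and `-128` as 64-bit integers. Ideally the `shufflevector` + `sext` form would be recognized as the same operation by the backend.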