| Field | Value |
| --- | --- |
| Issue | 162812 |
| Summary | [x86] Using `select` instead of `pblendvb` leads to very poor codegen on AVX2 |
| Labels | new issue |
| Assignees | |
| Reporter | Sp00ph |

I tried this code:
```ll
define <32 x i8> @cond_double_blendv(<32 x i8> %a, <32 x i8> %mask) {
  %aa = add <32 x i8> %a, %a
  %ret = call <32 x i8> @llvm.x86.avx2.pblendvb(<32 x i8> %a, <32 x i8> %aa, <32 x i8> %mask)
  ret <32 x i8> %ret
}

define <32 x i8> @cond_double_select(<32 x i8> %a, <32 x i8> %mask) {
  %aa = add <32 x i8> %a, %a
  %bitmask = icmp slt <32 x i8> %mask, splat (i8 0)
  %ret = select <32 x i1> %bitmask, <32 x i8> %aa, <32 x i8> %a
  ret <32 x i8> %ret
}
```
Both functions have the same behavior: they double each lane of `%a` whose corresponding lane in `%mask` has its MSB set. However, they generate wildly different assembly (using clang 21.1.0 with `-O3 -march=x86-64-v3`):
```asm
cond_double_blendv:
        vpaddb ymm2, ymm0, ymm0
        vpblendvb ymm0, ymm0, ymm2, ymm1
        ret
.LCPI1_0:
        .zero 32,252
.LCPI1_1:
        .zero 32,32
cond_double_select:
        vpsllw ymm2, ymm0, 2
        vpand ymm2, ymm2, ymmword ptr [rip + .LCPI1_0]
        vpsrlw ymm1, ymm1, 2
        vpand ymm1, ymm1, ymmword ptr [rip + .LCPI1_1]
        vpaddb ymm1, ymm1, ymm1
        vpblendvb ymm0, ymm0, ymm2, ymm1
        vpaddb ymm2, ymm0, ymm0
        vpaddb ymm1, ymm1, ymm1
        vpblendvb ymm0, ymm0, ymm2, ymm1
        ret
```
The version using the `@llvm.x86.avx2.pblendvb` intrinsic emits the expected assembly. After staring at the `select` version for a while, I can say that all instructions except `vpaddb ymm2, ymm0, ymm0` and the last `vpblendvb` form an elaborate no-op. However, I have no idea what the code generator's intent was with these instructions. I don't see any reason why the two functions should not emit exactly the same assembly.
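To make the no-op claim easier to check, here is a rough scalar sketch that models a single byte lane of the `cond_double_select` assembly and compares it against the intended semantics. The function names are hypothetical, and the model assumes that each word-wide shift followed by its byte mask (`vpsllw`/`vpsrlw` plus `vpand`) behaves like a per-byte shift, which holds for the two constants shown above.

```rust
/// Intended per-lane semantics of both IR functions: double the lane when
/// the MSB of the mask lane is set, otherwise leave it unchanged.
fn cond_double_ref(a: u8, mask: u8) -> u8 {
    if mask & 0x80 != 0 { a.wrapping_add(a) } else { a }
}

/// Per-lane model of `vpblendvb`: pick `src2` when the MSB of `sel` is set,
/// otherwise pick `src1`.
fn blendv(src1: u8, src2: u8, sel: u8) -> u8 {
    if sel & 0x80 != 0 { src2 } else { src1 }
}

/// Per-lane model of the instruction sequence emitted for
/// `cond_double_select` (hypothetical name, scalar approximation).
fn cond_double_select_asm(a: u8, mask: u8) -> u8 {
    let x4 = (a << 2) & 0xFC;      // vpsllw 2, vpand with .LCPI1_0 (252)
    let m = (mask >> 2) & 0x20;    // vpsrlw 2, vpand with .LCPI1_1 (32)
    let m = m.wrapping_add(m);     // vpaddb: 0x00 or 0x40, MSB always clear
    let x = blendv(a, x4, m);      // first vpblendvb: always keeps `a`
    let x2 = x.wrapping_add(x);    // vpaddb ymm2, ymm0, ymm0
    let m = m.wrapping_add(m);     // vpaddb: 0x00 or 0x80
    blendv(x, x2, m)               // last vpblendvb: the only real select
}

/// Exhaustively compares the model against the reference for one lane.
fn models_agree() -> bool {
    (0..=255u8).all(|a| {
        (0..=255u8).all(|mask| cond_double_select_asm(a, mask) == cond_double_ref(a, mask))
    })
}
```

In this model the selector of the first blend is only ever `0x00` or `0x40`, so that blend always returns `a`, and the shifted value `x4` is never used.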
Note: I originally encountered this while using AVX2 intrinsics in Rust, where the output from rustc was much worse than the output from clang for an equivalent function. The difference is that rustc lowers `_mm256_blendv_epi8` to `icmp slt + select`, whereas clang lowers it to `call @llvm.x86.avx2.pblendvb`.
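For reference, a minimal Rust sketch of the pattern in question might look like the following. This is not the original code from the project where the problem showed up; `cond_double` and `cond_double_bytes` are hypothetical names, and the wrapper assumes an x86-64 target with runtime AVX2 detection, returning `None` elsewhere.

```rust
#[cfg(target_arch = "x86_64")]
use std::arch::x86_64::*;

/// Doubles each byte lane of `a` whose mask lane has its MSB set.
/// Hypothetical example function mirroring the IR in this report.
#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "avx2")]
unsafe fn cond_double(a: __m256i, mask: __m256i) -> __m256i {
    let aa = _mm256_add_epi8(a, a);
    _mm256_blendv_epi8(a, aa, mask)
}

/// Safe wrapper over byte arrays. Returns `None` when AVX2 is unavailable
/// or the target is not x86-64.
fn cond_double_bytes(a: [u8; 32], mask: [u8; 32]) -> Option<[u8; 32]> {
    #[cfg(target_arch = "x86_64")]
    {
        if is_x86_feature_detected!("avx2") {
            // SAFETY: AVX2 support was checked above, and the unaligned
            // load/store intrinsics accept any 32-byte buffer.
            unsafe {
                let va = _mm256_loadu_si256(a.as_ptr() as *const __m256i);
                let vm = _mm256_loadu_si256(mask.as_ptr() as *const __m256i);
                let r = cond_double(va, vm);
                let mut out = [0u8; 32];
                _mm256_storeu_si256(out.as_mut_ptr() as *mut __m256i, r);
                return Some(out);
            }
        }
    }
    None
}
```

Inspecting the optimized assembly of `cond_double` (for example with `--emit asm` and optimizations enabled) should show whether the build in use produces the short or the long sequence.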