| Field | Value |
| --- | --- |
| Issue | 136690 |
| Summary | [X86][AVX2][ISEL] Unnecessary masking and comparison instructions generated for movmsk |
| Labels | new issue |
| Assignees | |
| Reporter | nurmukhametov |

The following LLVM IR:
```llvm
%v1_l = load <4 x float>, ptr %Source
%v1 = shufflevector <4 x float> %v1_l, <4 x float> poison, <8 x i32> <i32 0, i32 1, i32 2, i32 3, i32 undef, i32 undef, i32 undef, i32 undef>
%4 = bitcast <8 x float> %v1 to <8 x i32>
%5 = and <8 x i32> %4, <i32 -2147483648, i32 -2147483648, i32 -2147483648, i32 -2147483648, i32 poison, i32 poison, i32 poison, i32 poison>
%bitop.i.i = shufflevector <8 x i32> %5, <8 x i32> poison, <8 x i32> <i32 0, i32 1, i32 2, i32 3, i32 poison, i32 poison, i32 poison, i32 poison>
%calltmp_to_bool.i = icmp ne <8 x i32> %bitop.i.i, zeroinitializer
%6 = bitcast <8 x i1> %calltmp_to_bool.i to i8
%calltmp4_to_int32.i = zext i8 %6 to i32
store i32 %calltmp4_to_int32.i, ptr %Result, align 4
```
is compiled to the following assembly (--mcpu=haswell):
```asm
vmovdqa xmm0, xmmword ptr [rsi]
vpbroadcastd xmm1, dword ptr [rip + .LCPI0_0] # xmm1 = [2147483648,2147483648,2147483648,2147483648]
vpand ymm0, ymm0, ymm1
vpcmpeqd ymm0, ymm0, ymm1
vmovmskps eax, ymm0
mov dword ptr [rdi], eax
```
although it seems to me that it could be just:
```asm
vmovaps xmm0, xmmword ptr [rsi]
vmovmskps eax, xmm0
mov dword ptr [rdi], eax
```
For comparison, when the AND mask is `<i32 -2147483648, i32 -2147483648, i32 -2147483648, i32 -2147483648, i32 0, i32 0, i32 0, i32 0>` (zeros instead of poisons in the upper four lanes), ISEL manages to pick just vmovups and vmovmskps.
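
To make the reasoning behind the shorter sequence explicit, here is a minimal Python sketch that models the four defined lanes. It is only an illustration, not LLVM or hardware code: the helper names (`float_bits`, `movmskps`, `ir_semantics`) are hypothetical, and the model assumes that the poison upper lanes may legally produce zero bits, which is what a 128-bit vmovmskps yields.

```python
import struct

SIGN_MASK = 0x80000000  # i32 -2147483648 viewed as an unsigned 32-bit value


def float_bits(x: float) -> int:
    """Return the IEEE-754 binary32 bit pattern of x as an unsigned int."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def movmskps(lanes):
    """Model of vmovmskps: bit i of the result is the sign bit of lane i."""
    result = 0
    for i, bits in enumerate(lanes):
        result |= (bits >> 31) << i
    return result


def ir_semantics(lanes):
    """Model of the IR: AND with the sign mask, icmp ne 0, pack the i1 lanes."""
    result = 0
    for i, bits in enumerate(lanes):
        if (bits & SIGN_MASK) != 0:
            result |= 1 << i
    return result


# Four defined lanes; the upper four lanes are poison in the IR, so this
# model simply leaves result bits 4..7 as zero (an assumption, see above).
values = [-1.5, 0.0, -0.0, 2.0]
lanes = [float_bits(v) for v in values]
print(bin(movmskps(lanes)), bin(ir_semantics(lanes)))  # prints: 0b101 0b101
```

Under these assumptions both functions agree for every lane value, because testing `(x & 0x80000000) != 0` is the same as reading bit 31, so the explicit vpand and vpcmpeqd do no extra work.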
Compiler explorer link: https://godbolt.org/z/8qnfnbTsr
Am I missing something here?
_______________________________________________
llvm-bugs mailing list
llvm-bugs@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-bugs