| Field | Value |
|-------|-------|
| Issue | 174871 |
| Summary | [X86] X86CompressEVEX: Incorrect VPMOVB2M + KMOV -> VPMOVMSKB transformation causes incorrect results |
| Labels | new issue |
| Assignees | |
| Reporter | aneshlya |

Commit 1caf2704dd6791baa4b958d6a666ea64ec24795d ("[X86] Allow EVEX compression for mask registers (#171980)") introduces a regression that causes incorrect code generation for AVX-512 vectorized code.
The transformation attempts to compress the following pattern:
```asm
vpmov*2m %xmm0, %k0 -> (erase)
kmov* %k0, %eax -> vmovmsk* %xmm0, %eax
```
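For reference, both sequences are meant to produce the same value: each gathers the most significant bit of every byte of the source into an integer. A minimal Python sketch of that per-instruction behavior for a 128-bit source follows. The helper names are hypothetical, and the model covers only the instructions in isolation; it says nothing about register liveness or any other context the pass has to check, so it does not by itself explain the miscompile.

```python
def vpmovb2m_xmm(xmm_bytes):
    """Model of vpmovb2m with a 128-bit source: mask bit i = MSB of byte i."""
    assert len(xmm_bytes) == 16
    k = 0
    for i, b in enumerate(xmm_bytes):
        k |= ((b >> 7) & 1) << i
    return k


def kmovd_to_gpr(k):
    """Model of kmovd %k, %r32: copy the low 32 bits of the mask register."""
    return k & 0xFFFFFFFF


def vpmovmskb_xmm(xmm_bytes):
    """Model of vpmovmskb with a 128-bit source: result bit i = MSB of byte i."""
    assert len(xmm_bytes) == 16
    result = 0
    for i, b in enumerate(xmm_bytes):
        result |= ((b >> 7) & 1) << i
    return result


# Hypothetical input: bytes 0 and 2 have their top bit set.
sample = bytes([0x80, 0x00, 0xFF, 0x7F] + [0x00] * 12)
two_step = kmovd_to_gpr(vpmovb2m_xmm(sample))
one_step = vpmovmskb_xmm(sample)
```

In this model `two_step` and `one_step` are always equal, which is why the rewrite looks safe when only the two instructions themselves are considered.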
However, the transformation produces incorrect results for some code patterns that use masked operations inside loops on AVX-512 targets (SKX and newer).
## Assembly Difference
**Before (correct) - commit b8f5cbba2abb:**
```asm
switchit___vyi:
# %bb.0:
vpsllw $7, %xmm1, %xmm1
vpmovb2m %xmm1, %k0 # <-- AVX-512 instruction
kmovd %k0, %eax # <-- separate move from mask reg
andl $65534, %eax
je .LBB3_1
...
```
**After (incorrect) - commit 1caf2704dd67:**
```asm
switchit___vyi:
# %bb.0:
vpsllw $7, %xmm1, %xmm1
vpmovmskb %xmm1, %eax # <-- Compressed to single instruction
andl $65534, %eax
je .LBB3_1
...
```
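The arithmetic surrounding the mask extraction in both listings can be sketched in Python as well. `vpsllw $7` shifts each 16-bit word left by 7, which moves bit 0 of every byte into that byte's top bit, and `andl $65534` (0xFFFE) then clears bit 0 of the extracted mask. The sketch below assumes the input register holds one 0/1 boolean per byte, which the listing does not state; the helper names and lane values are hypothetical.

```python
def vpsllw_xmm(xmm_bytes, count):
    """Model of vpsllw $count on a 128-bit register (eight 16-bit words)."""
    assert len(xmm_bytes) == 16
    out = bytearray(16)
    for w in range(8):
        # Assemble the little-endian word, shift, and truncate to 16 bits.
        word = xmm_bytes[2 * w] | (xmm_bytes[2 * w + 1] << 8)
        word = (word << count) & 0xFFFF
        out[2 * w] = word & 0xFF
        out[2 * w + 1] = word >> 8
    return bytes(out)


def msb_mask(xmm_bytes):
    """Gather the top bit of each byte into an integer (bit i = byte i)."""
    return sum(((b >> 7) & 1) << i for i, b in enumerate(xmm_bytes))


# Assumed input: lanes 0, 1 and 3 are "on" (value 1), all others are 0.
lanes = bytes([1, 1, 0, 1] + [0] * 12)
mask = msb_mask(vpsllw_xmm(lanes, 7))  # one bit per active lane
rest = mask & 65534                    # same mask with lane 0 cleared
```

Under that assumption, the `je` in the listings is taken only when no lane other than lane 0 is active.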
Compiler Explorer link: https://ispc.godbolt.org/z/59Mnnvavf
The test in the reproducer produces incorrect results at runtime:
| Lane | Expected | Actual |
|------|----------|--------|
| 2 | 4.0 | 2.0 |
| 3 | 9.0 | 3.0 |
| 4 | 24.0 | 6.0 |
| 5 | 35.0 | 7.0 |
| 6 | 48.0 | 8.0 |
| 7 | 63.0 | 9.0 |
| 8 | 144.0 | 18.0 |
| ... | ... | ... |