| Issue | 140729 |
|---|---|
| Summary | [X86] Use GFNI for LZCNT vXi8 ops |
| Labels | good first issue, backend:X86 |
| Assignees | |
| Reporter | RKSimon |

We can perform LZCNT for v16i8/v32i8/v64i8 using GFNI instructions, as detailed here: https://gist.github.com/animetosho/6cb732ccb5ecd86675ca0a442b3c0622
Even though it's based on the TZCNT implementation, it's easier to start with LZCNT: x86 vector TZCNT operations currently all expand to CTPOP patterns via TargetLowering::expandCTTZ, and changing that might need some messy legality cleanups.
For example:
```c
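#include <immintrin.h>

// Requires GFNI (e.g. build with -mgfni).
__m128i _mm_lzcnt_epi8(__m128i x) {
  // Reverse the bits of each byte, so LZCNT becomes TZCNT.
  __m128i a = _mm_gf2p8affine_epi64_epi8(x, _mm_set_epi32(0x80402010, 0x08040201, 0x80402010, 0x08040201), 0);
  // Isolate the lowest set bit: a & ~(a - 1).
  a = _mm_andnot_si128(_mm_add_epi8(a, _mm_set1_epi8(0xff)), a);
  // Map the isolated bit to its index; the constant 8 is the result for a zero byte.
  return _mm_gf2p8affine_epi64_epi8(a, _mm_set_epi32(0xaaccf0ff, 0, 0xaaccf0ff, 0), 8);
}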
```
```asm
.LCPI0_2:
        .byte   1                               # 0x1
        .byte   2                               # 0x2
        .byte   4                               # 0x4
        .byte   8                               # 0x8
        .byte   16                              # 0x10
        .byte   32                              # 0x20
        .byte   64                              # 0x40
        .byte   128                             # 0x80
.LCPI0_3:
        .byte   0                               # 0x0
        .byte   0                               # 0x0
        .byte   0                               # 0x0
        .byte   0                               # 0x0
        .byte   255                             # 0xff
        .byte   240                             # 0xf0
        .byte   204                             # 0xcc
        .byte   170                             # 0xaa
_mm_lzcnt_epi8(long long vector[2]):            # @_mm_lzcnt_epi8(long long vector[2])
# %bb.0:                                # %entry
        vgf2p8affineqb  $0, .LCPI0_2(%rip){1to2}, %xmm0, %xmm0
        vpxor   %xmm1, %xmm1, %xmm1
        vpsubb  %xmm0, %xmm1, %xmm1
        vpand   %xmm1, %xmm0, %xmm0
        vgf2p8affineqb  $8, .LCPI0_3(%rip){1to2}, %xmm0, %xmm0
        retq
```
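To see why the two matrix constants above work, and to check the sequence without GFNI hardware, the per-byte behaviour can be modelled in scalar C. This is only a sketch: the helper names (`parity8`, `gf2p8affine_byte`, `lzcnt8_gfni_model`, `lzcnt8_ref`, `verify_all_bytes`) are hypothetical and not part of LLVM or the gist. It assumes the documented GF2P8AFFINEQB semantics, where result bit i is the parity of matrix byte (7 - i) ANDed with the source byte, XORed with bit i of the immediate.

```c
#include <assert.h>
#include <stdint.h>

/* Parity (XOR of all bits) of a byte. */
static unsigned parity8(uint8_t v) {
    v ^= v >> 4;
    v ^= v >> 2;
    v ^= v >> 1;
    return v & 1u;
}

/* Scalar model of one byte lane of GF2P8AFFINEQB (assumed semantics):
   result bit i = parity(matrix byte (7 - i) AND x) XOR bit i of imm8. */
static uint8_t gf2p8affine_byte(uint8_t x, uint64_t matrix, uint8_t imm8) {
    uint8_t r = 0;
    for (int i = 0; i < 8; ++i) {
        uint8_t row = (uint8_t)(matrix >> (8 * (7 - i)));
        r |= (uint8_t)(parity8(row & x) << i);
    }
    return r ^ imm8;
}

/* Same three steps as the intrinsic version, applied to a single byte. */
static uint8_t lzcnt8_gfni_model(uint8_t x) {
    /* Step 1: bit-reverse, so leading zeros become trailing zeros. */
    uint8_t a = gf2p8affine_byte(x, 0x8040201008040201ULL, 0);
    /* Step 2: isolate the lowest set bit, a & ~(a - 1). */
    a = (uint8_t)(a & ~(uint8_t)(a + 0xFF));
    /* Step 3: rows 0xAA, 0xCC, 0xF0 extract index bits 0..2 of the isolated
       bit; row 0xFF XOR imm bit 3 yields 8 only when the byte is zero. */
    return gf2p8affine_byte(a, 0xAACCF0FF00000000ULL, 8);
}

/* Naive reference: count zero bits from the MSB down. */
static uint8_t lzcnt8_ref(uint8_t x) {
    uint8_t n = 0;
    for (uint8_t mask = 0x80; mask != 0 && !(x & mask); mask >>= 1)
        ++n;
    return n;
}

/* Exhaustively compare the model against the reference. */
static void verify_all_bytes(void) {
    for (unsigned v = 0; v < 256; ++v)
        assert(lzcnt8_gfni_model((uint8_t)v) == lzcnt8_ref((uint8_t)v));
}
```

Calling `verify_all_bytes()` checks all 256 byte values against the naive count. A byte-level model like this could also be used to cross-check constants when extending the lowering to v32i8/v64i8, since each byte lane is processed independently.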
_______________________________________________
llvm-bugs mailing list
llvm-bugs@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-bugs