Issue 140729
Summary [X86] Use GFNI for LZCNT vXi8 ops
Labels good first issue, backend:X86
Assignees
Reporter RKSimon
We can perform LZCNT on v16i8/v32i8/v64i8 using GFNI instructions, as detailed here: https://gist.github.com/animetosho/6cb732ccb5ecd86675ca0a442b3c0622

Even though it's based on the TZCNT implementation, it's easier to start with LZCNT: x86 vector TZCNT ops currently all expand to CTPOP patterns via TargetLowering::expandCTTZ, and changing that might need some messy legality cleanups.

e.g.
```c
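#include <immintrin.h> // build with GFNI enabled, e.g. -mgfni

__m128i _mm_lzcnt_epi8(__m128i x) {
	// Reverse the bits of each byte, so that LZCNT becomes TZCNT.
	__m128i a = _mm_gf2p8affine_epi64_epi8(x, _mm_set_epi32(0x80402010, 0x08040201, 0x80402010, 0x08040201), 0);
	// Isolate the lowest set bit: a & ~(a - 1).
	a = _mm_andnot_si128(_mm_add_epi8(a, _mm_set1_epi8(0xff)), a);
	// Map the one-hot byte to its bit index; the immediate 8 makes a zero byte return 8.
	return _mm_gf2p8affine_epi64_epi8(a, _mm_set_epi32(0xaaccf0ff, 0, 0xaaccf0ff, 0), 8);
}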
```
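The three steps above can be checked without GFNI hardware using a scalar model of one byte lane. This is only a sketch: `gf2p8affine_byte`, `lzcnt8_model`, `lzcnt8_ref` and `lzcnt8_model_matches_ref` are hypothetical helper names, and the model assumes the documented GF2P8AFFINEQB semantics (result bit i is the parity of matrix byte 7 - i ANDed with the source byte, XORed with bit i of the immediate).

```c
#include <assert.h>
#include <stdint.h>

/* Scalar model of one byte lane of GF2P8AFFINEQB (assumed semantics):
   result bit i = parity(matrix byte [7 - i] & x) ^ bit i of imm. */
static uint8_t gf2p8affine_byte(uint8_t x, uint64_t matrix, uint8_t imm) {
    uint8_t out = 0;
    for (int i = 0; i < 8; ++i) {
        uint8_t row = (uint8_t)(matrix >> (8 * (7 - i)));
        uint8_t v = row & x;
        v ^= v >> 4;
        v ^= v >> 2;
        v ^= v >> 1;                      /* bit 0 of v is now the parity */
        out |= (uint8_t)((v & 1u) << i);
    }
    return out ^ imm;
}

/* The same three steps as the intrinsic version, one byte at a time. */
static uint8_t lzcnt8_model(uint8_t x) {
    /* 1. Bit-reverse the byte. */
    uint8_t a = gf2p8affine_byte(x, 0x8040201008040201ULL, 0);
    /* 2. Isolate the lowest set bit: a & ~(a - 1). */
    a = (uint8_t)(a & ~(uint8_t)(a + 0xff));
    /* 3. One-hot byte -> bit index; a zero byte yields 8 via the immediate. */
    return gf2p8affine_byte(a, 0xaaccf0ff00000000ULL, 8);
}

/* Naive reference: leading zero bits of a byte, 8 for a zero input. */
static uint8_t lzcnt8_ref(uint8_t x) {
    uint8_t n = 0;
    for (uint8_t m = 0x80; m != 0 && !(x & m); m >>= 1)
        ++n;
    return n;
}

/* Returns 1 if the model agrees with the reference for all 256 inputs. */
static int lzcnt8_model_matches_ref(void) {
    for (int x = 0; x < 256; ++x)
        if (lzcnt8_model((uint8_t)x) != lzcnt8_ref((uint8_t)x))
            return 0;
    return 1;
}
```

Calling `lzcnt8_model_matches_ref()` from a test covers the whole i8 input space, which may be a handy sanity check when picking the matrix constants for the lowering.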

```asm
.LCPI0_2:
  .byte 1 # 0x1
  .byte 2 # 0x2
  .byte 4 # 0x4
  .byte 8 # 0x8
  .byte 16 # 0x10
  .byte 32 # 0x20
  .byte 64 # 0x40
  .byte 128 # 0x80
.LCPI0_3:
  .byte 0 # 0x0
  .byte 0 # 0x0
  .byte 0 # 0x0
  .byte 0 # 0x0
  .byte 255 # 0xff
  .byte 240 # 0xf0
  .byte 204 # 0xcc
  .byte 170 # 0xaa
_mm_lzcnt_epi8(long long vector[2]): # @_mm_lzcnt_epi8(long long vector[2])
# %bb.0: # %entry
  vgf2p8affineqb $0, .LCPI0_2(%rip){1to2}, %xmm0, %xmm0
  vpxor %xmm1, %xmm1, %xmm1
  vpsubb %xmm0, %xmm1, %xmm1
  vpand %xmm1, %xmm0, %xmm0
  vgf2p8affineqb $8, .LCPI0_3(%rip){1to2}, %xmm0, %xmm0
  retq
```
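Note that the generated code has no add/and-not pair: it computes `0 - a` with `vpxor`/`vpsubb` and then ANDs with `vpand`. Both forms isolate the lowest set bit, because `-a` equals `~(a - 1)` in two's complement. A small sketch of that equivalence for bytes (the helper names are hypothetical):

```c
#include <stdint.h>

/* Lowest-set-bit isolation as written in the intrinsic source:
   andnot(a + 0xff, a), i.e. a & ~(a - 1). */
static uint8_t lowbit_andnot(uint8_t a) {
    return (uint8_t)(a & ~(uint8_t)(a + 0xff));
}

/* The form in the generated code: 0 - a (vpsubb), then AND (vpand). */
static uint8_t lowbit_neg(uint8_t a) {
    return (uint8_t)(a & (uint8_t)(0 - a));
}

/* Returns 1 if both forms agree for every byte value. */
static int lowbit_forms_agree(void) {
    for (int a = 0; a < 256; ++a)
        if (lowbit_andnot((uint8_t)a) != lowbit_neg((uint8_t)a))
            return 0;
    return 1;
}
```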
