| Issue | 160920 |
| --- | --- |
| Summary | Wrong code for avx512 intrinsic |
| Labels | new issue |
| Assignees | |
| Reporter | rockeet |

```c++
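#include <immintrin.h>
#include <cstddef>  // size_t
using byte_t = unsigned char;
size_t avx512_search_byte_max32(const byte_t* data, size_t len, byte_t key) {
	// Compare only the low `len` bytes (mask built by bzhi) and return the
	// index of the first byte >= key (signed compare), or 32 if none matches.
	return _tzcnt_u32(_mm256_mask_cmpge_epi8_mask(_bzhi_u32(-1, len),
		*(__m256i_u*)data, _mm256_set1_epi8(key)));
}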
```
Clang generates this code:
```nasm
avx512_search_byte_max32(unsigned char const*, unsigned long, unsigned char):
        vpbroadcastb    ymm0, edx
        mov     eax, -1
        bzhi    eax, eax, esi
        vpcmpleb        k0, ymm0, ymmword ptr [rdi]
        kmovd   ecx, k0
        and     ecx, eax
        tzcnt   eax, ecx
        vzeroupper
        ret
```
Intel ICC generates this code:
```nasm
avx512_search_byte_max32(unsigned char const*, unsigned long, unsigned char):
..B3.1:                         # Preds ..B3.0
        mov       eax, -1                                       #20.20
        vpbroadcastb ymm0, edx                                  #20.20
        bzhi      ecx, eax, esi                                 #20.20
        kmovd     k1, ecx                                       #20.20
        vpcmpb    k0{k1}, ymm0, YMMWORD PTR [rdi], 2            #20.20
        kmovd     r8d, k0                                       #20.20
        tzcnt     eax, r8d                                      #20.9
        vzeroupper                                              #20.9
        ret                                                     #20.9
```

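The issue: clang performs an unmasked 32-byte load and compare and only applies the mask afterwards with `and`, whereas ICC applies the mask to the compare itself (`k0{k1}`). If [data, data+32) spans a page boundary and the bytes whose mask bits are 0 lie at an invalid address, the code generated by clang still reads them and crashes (segfault/bus error).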
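
As a possible workaround, the masking can be expressed on the memory access itself instead of on the compare. The sketch below is not part of the original report: the function name `avx512_search_byte_max32_maskload`, the scalar reference `scalar_search_byte_max32`, and the helper `cpu_has_required_features` are hypothetical names introduced for illustration. It assumes GCC or clang (for the `target` attribute and `__builtin_cpu_supports`) and `len <= 32`. Whether a given compiler version keeps the masked load as a fault-suppressing instruction should be confirmed by inspecting the generated assembly.

```cpp
#include <immintrin.h>
#include <cstddef>

using byte_t = unsigned char;

// Hypothetical variant of the reported function: the load is masked, so the
// bytes at positions >= len are requested as zero rather than read from memory.
// Assumes len <= 32.
__attribute__((target("avx512bw,avx512vl,bmi,bmi2")))
size_t avx512_search_byte_max32_maskload(const byte_t* data, size_t len, byte_t key) {
    // Low `len` bits set; for len == 32 bzhi leaves all 32 bits set.
    __mmask32 valid = _bzhi_u32(~0u, static_cast<unsigned>(len));
    // Masked load: lanes whose mask bit is 0 are zeroed.
    __m256i v = _mm256_maskz_loadu_epi8(valid, data);
    // Signed "data >= key" compare, restricted to the valid lanes.
    __mmask32 hit = _mm256_mask_cmpge_epi8_mask(
        valid, v, _mm256_set1_epi8(static_cast<char>(key)));
    // Index of the first matching byte; _tzcnt_u32(0) yields 32.
    return _tzcnt_u32(hit);
}

// Portable scalar statement of the same result, useful for cross-checking.
size_t scalar_search_byte_max32(const byte_t* data, size_t len, byte_t key) {
    for (size_t i = 0; i < len && i < 32; ++i) {
        if (static_cast<signed char>(data[i]) >= static_cast<signed char>(key)) {
            return i;
        }
    }
    return 32;
}

// Runtime check so the AVX-512 path is only called on CPUs that support it.
bool cpu_has_required_features() {
    return __builtin_cpu_supports("avx512bw") && __builtin_cpu_supports("avx512vl") &&
           __builtin_cpu_supports("bmi") && __builtin_cpu_supports("bmi2");
}
```

A caller would check `cpu_has_required_features()` once and fall back to `scalar_search_byte_max32` otherwise. The sketch does not change the original complaint, which is about the code clang emits for the masked-compare form.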

