https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66369
Uroš Bizjak <ubizjak at gmail dot com> changed:

           What    |Removed                       |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                   |ASSIGNED
   Last reconfirmed|                              |2015-06-03
           Assignee|unassigned at gcc dot gnu.org |ubizjak at gmail dot com
     Ever confirmed|0                             |1

--- Comment #5 from Uroš Bizjak <ubizjak at gmail dot com> ---
Created attachment 35693
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=35693&action=edit
Patch to add zero-extended MOVMSK patterns

This patch adds zero-extended MOVMSK patterns. However, one more cast from
(int) to (unsigned int) is needed in the source, due to the definition of the
intrinsic:

    long v;

    regchx256 = _mm256_set1_epi8( ch );
    regset256 = _mm256_loadu_si256( (__m256i const *) set );
    v = (unsigned int)
        _mm256_movemask_epi8 ( _mm256_cmpeq_epi8(regchx256,regset256) );

Using patched gcc, the code compiles to:

lookup32:
        vmovdqu      charset32(%rip), %ymm0      # 10  *avx_loaddquv32qi
        vmovd        %edi, %xmm1                 # 54  vec_setv4si_0/4
        movl         $11141307, %eax             # 5   *movdi_internal/3
        vpbroadcastb %xmm1, %ymm1                # 55  avx2_pbroadcastv32qi
        vpcmpeqb     %ymm0, %ymm1, %ymm0         # 13  *avx2_eqv32qi3
        vpmovmskb    %ymm0, %edx                 # 16  *avx2_pmovmskb_zext
        testl        %edx, %edx                  # 19  *cmpsi_ccno_1/1
        je           .L5                         # 20  *jcc_1
        tzcntq       %rdx, %rdx                  # 53  *ctzdi2_falsedep
        movq         mytable+32(,%rdx,8), %rax   # 28  *movdi_internal/4
.L5:
        vzeroupper                               # 51  avx_vzeroupper
        ret                                      # 58  simple_return_internal