https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94962

--- Comment #6 from Hongtao.liu <crazylht at gmail dot com> ---
(In reply to Nemo from comment #5)
> (In reply to Jakub Jelinek from comment #2)
> 
> I would be happy if GCC could just emit optimal code (single vcmpeqd
> instruction) for this useful constant:
> 
>     _mm256_set_m128i(_mm_setzero_si128(), _mm_set1_epi8(-1))
> 
> aka.
> 
>     _mm256_inserti128_si256(_mm256_setzero_si256(), _mm_set1_epi8(-1), 0)
> 
> 
> (The latter is just what GCC uses to implement _mm256_zextsi128_si256, if I
> am reading the headers correctly.)
> 
> It's a minor thing, but I was a little surprised to find that none of the
> compilers I know of are able to do this. At least, not with any input I
> tried.

`vmovdqa xmm0, xmm0` is not redundant here: being VEX-encoded, it clears bits
128-255 of the register, which is exactly what `zext` means.
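
To make the zero-extension guarantee concrete, here is a minimal sketch that
builds the constant from comment #5 and stores it to memory, so the upper half
can be inspected. The helper name `low_ones_high_zero` is made up for this
example, and the code assumes GCC or Clang (for the `target` attribute) and a
CPU with AVX2.

```c
#include <immintrin.h>
#include <stdint.h>

/* Hypothetical helper, for illustration only: writes the 32 bytes of the
 * "low half all-ones, high half zero" constant into out. */
__attribute__((target("avx2")))
void low_ones_high_zero(uint8_t out[32])
{
    /* Low 128 bits: every byte is 0xFF. */
    __m128i ones = _mm_set1_epi8(-1);

    /* Insert into lane 0 of an all-zero 256-bit vector, so bits 128-255
     * are guaranteed to be zero (the zero-extension being discussed). */
    __m256i v = _mm256_inserti128_si256(_mm256_setzero_si256(), ones, 0);

    /* Unaligned store, so out needs no particular alignment. */
    _mm256_storeu_si256((__m256i *)out, v);
}
```

Whatever instruction sequence the compiler picks, bytes 0-15 of `out` must be
0xFF and bytes 16-31 must be 0x00. That requirement on the upper half is why
the extra move cannot simply be dropped.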
