https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93594
--- Comment #3 from andysem at mail dot ru ---
...and probably other permute variants involving zeroed input registers, e.g.:

__m256i cvt_permute_zero_v1(__m128i low)
{
    return _mm256_permute2x128_si256(_mm256_setzero_si256(),
                                     _mm256_castsi128_si256(low), 0x02);
}

__m256i cvt_permute_zero_v2(__m128i low)
{
    return _mm256_permute2x128_si256(_mm256_castsi128_si256(low),
                                     _mm256_setzero_si256(), 0x20);
}

https://gcc.godbolt.org/z/Zpo9K7
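
For reference, here is a rough scalar model of what the two functions above compute. It assumes the usual documented selector semantics of _mm256_permute2x128_si256: each 4-bit field of the immediate picks one 128-bit half (0 = a.low, 1 = a.high, 2 = b.low, 3 = b.high), and bit 3 of the field zeroes that half. The v128/v256 structs and all model_* names are hypothetical stand-ins for illustration, not real intrinsics, and the model says nothing about what code GCC actually emits.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar stand-ins for __m128i / __m256i (64-bit lanes). */
typedef struct { uint64_t q[2]; } v128;
typedef struct { uint64_t q[4]; } v256;

/* Pick one 128-bit half of a or b according to a 4-bit selector field. */
static v128 model_select_half(v256 a, v256 b, unsigned ctl)
{
    v128 r = {{0, 0}};
    if (ctl & 0x8)                      /* bit 3 set: result half is zero */
        return r;
    const v256 *src = (ctl & 0x2) ? &b : &a;   /* bit 1: choose a or b */
    unsigned off = (ctl & 0x1) ? 2 : 0;        /* bit 0: low or high half */
    r.q[0] = src->q[off];
    r.q[1] = src->q[off + 1];
    return r;
}

/* Model of _mm256_permute2x128_si256(a, b, imm8). */
static v256 model_permute2x128(v256 a, v256 b, unsigned imm8)
{
    v128 lo = model_select_half(a, b, imm8 & 0xF);
    v128 hi = model_select_half(a, b, (imm8 >> 4) & 0xF);
    v256 r = {{lo.q[0], lo.q[1], hi.q[0], hi.q[1]}};
    return r;
}

/* Model of _mm256_castsi128_si256: the upper half is undefined, so it is
   filled with junk here to show that neither variant depends on it. */
static v256 model_cast128_to_256(v128 low)
{
    v256 r = {{low.q[0], low.q[1],
               0xDEADBEEFDEADBEEFull, 0xDEADBEEFDEADBEEFull}};
    return r;
}

static v256 model_setzero(void)
{
    v256 r = {{0, 0, 0, 0}};
    return r;
}

/* Scalar counterparts of cvt_permute_zero_v1 / cvt_permute_zero_v2. */
static v256 model_v1(v128 low)
{
    return model_permute2x128(model_setzero(), model_cast128_to_256(low), 0x02);
}

static v256 model_v2(v128 low)
{
    return model_permute2x128(model_cast128_to_256(low), model_setzero(), 0x20);
}

/* Assert that r is the zero-extension of low to 256 bits. */
static void model_check_zero_extended(v256 r, v128 low)
{
    assert(r.q[0] == low.q[0] && r.q[1] == low.q[1]);
    assert(r.q[2] == 0 && r.q[3] == 0);
}
```

Under this model, both variants select only the low half of the cast value plus a zeroed half, so each returns low zero-extended to 256 bits, which is presumably why they are listed here alongside the other zero-extension patterns.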