https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88698
--- Comment #9 from Marc Glisse <glisse at gcc dot gnu.org> ---
(In reply to Devin Hussey from comment #2)
> What I am saying is that I think -flax-vector-conversions should be default,
> or we should only have minimal warnings instead of errors.
>
> That will make generic vectors much easier to use.

And more confusing / error-prone, there is a compromise.

> typedef uint32_t u32x4 __attribute__((vector_size(16)));
>
> u32x4 shift(u32x4 val)
> {
>     return _mm_srli_epi32(val, 15);
> }

Indeed, when calling an intrinsic, it could make sense to allow other vector
types of the same size. Or would you expect the same behavior if you were
calling your own function instead of _mm_srli_epi32?

> 3. Cast. Good lord, if you thought intrinsics were ugly, this will change
> your mind:
>
>     return (u32x4)_mm_srli_epi32((__m128i)val, 15);

It isn't that bad. First, if you only use intrinsics, you shouldn't define
u32x4, then you only have __m128i, __m128 and __m128d, fewer conversions are
needed. Then, if you do define u32x4, you can rewrite that as

    return val >> 15;

> This is the second issue: unsigned long and unsigned int are the same size
> and should have no issues converting between each other.

We could special case this. But note that in C/C++, we don't consider int and
long as the same type just because they have the same size, and reinterpreting
int* as long* violates strict aliasing.

> typedef unsigned u32x4 __attribute__((vector_size(16)));
> typedef unsigned long long u64x2 __attribute__((vector_size(16)));
>
> u64x2 cast(u32x4 val)
> {
>     return val;
> }
>
>
> This should emit a warning without a cast. I would recommend an error, but
> Clang without -Wvector-conversion accepts this without any complaining.

At some point it isn't easy to have a different behavior for an implicit
conversion in different contexts. Should the intrinsics be marked with some
magic flag that asks to be lax about their arguments?
(In reply to Devin Hussey from comment #5)
> Clang even allows this:
>
> #include <arm_neon.h>
>
> uint32x4_t mult(uint16x8_t top, uint32x4_t bot)
> {
>     return top * bot;
> }

We clearly don't want that...
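
For reference, here is a minimal sketch of the two forms that already work without -flax-vector-conversions: the generic `val >> 15` rewrite suggested above, and an explicit cast between same-size vector types. It assumes a compiler with GNU vector extensions (GCC or Clang). The typedefs follow the ones quoted from comment #2, with uint64_t standing in for unsigned long long, and it deliberately avoids intrinsics so it is not tied to x86.

```c
#include <stdint.h>

/* Generic vector types, as in the quoted examples (uint64_t is an assumption
   standing in for unsigned long long). */
typedef uint32_t u32x4 __attribute__((vector_size(16)));
typedef uint64_t u64x2 __attribute__((vector_size(16)));

/* Element-wise logical right shift; no intrinsic, so no conversion needed. */
static u32x4 shift(u32x4 val)
{
    return val >> 15;
}

/* Explicit cast between same-size vector types: the bits are reinterpreted,
   not converted element by element. */
static u64x2 cast(u32x4 val)
{
    return (u64x2)val;
}
```

With the explicit cast the reinterpretation is visible at the call site, which is the point of rejecting the implicit `return val;` form by default.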