http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53907
--- Comment #2 from Richard Guenther <rguenth at gcc dot gnu.org> 2012-07-10 09:16:29 UTC ---

  s.0_2 = (sizetype) s_1(D);
  D.4303_3 = s.0_2 & 15;
  D.4304_4 = -D.4303_3;
  s_5 = s_1(D) + D.4304_4;
  sz_6 = MEM[(const __m128i * {ref-all})s_5];
  return sz_6;

It's hard for GCC to see that s_5 is aligned because of the way it is
computed: the low four bits of the address are subtracted from the pointer
rather than masked off. If I change the source to use

  #include <emmintrin.h>

  __m128i x(char *s)
  {
    __m128i sz,z,mvec;
    s=(char *)(((unsigned long) s)& -16);
    sz=_mm_load_si128((__m128i *) s);
    return sz;
  }

then GCC sees that s is aligned. The code generated for the re-alignment
is also simpler, so we should do this transform somewhere. I'll prepare a
patch.
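For reference, the two ways of rounding a pointer down to a 16-byte boundary can be compared in plain C. This is only a minimal sketch, not the planned patch. The names align_down_sub, align_down_mask and check_align_forms are hypothetical, and the code assumes a flat address space where converting a pointer to uintptr_t yields its numeric address.

```c
#include <assert.h>
#include <stdint.h>

/* Form seen in the GIMPLE dump: subtract the low four address bits. */
static char *align_down_sub(char *s)
{
    uintptr_t low = (uintptr_t)s & 15;
    return s - low;
}

/* Form from the rewritten source: mask the low four address bits away. */
static char *align_down_mask(char *s)
{
    return (char *)((uintptr_t)s & ~(uintptr_t)15);
}

/* Hypothetical self-check: both forms agree for 32 consecutive addresses. */
static void check_align_forms(void)
{
    static char buf[64];
    /* Start at the first 16-byte boundary inside buf so that rounding
       down never leaves the array. */
    char *base = align_down_mask(buf + 15);
    for (int i = 0; i < 32; i++) {
        char *p = base + i;
        assert(align_down_sub(p) == align_down_mask(p));
        assert(((uintptr_t)align_down_mask(p) & 15) == 0);
    }
}
```

Both functions return the same address. The difference is that the mask form makes the zeroed low bits explicit in a single operation, which is presumably why the alignment is easier to derive from it.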