https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71264
--- Comment #5 from Richard Biener <rguenth at gcc dot gnu.org> ---
So the issue is that the code is already "vectorized", but the footype vector
type has SImode and thus get_vectype_for_scalar_type happily returns V4SI as
the vector type. But the rest of the vectorizer isn't really prepared to handle
vector types in the to-be-vectorized IL.

I have a patch that ends up producing

test:
.LFB0:
        .cfi_startproc
        movd    (%rsi), %xmm1
        movdqu  (%rdi), %xmm2
        pshufd  $0, %xmm1, %xmm0
        pxor    %xmm2, %xmm0
        movups  %xmm0, (%rdi)
        ret
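
For readers without the testcase at hand, the kind of source that runs into this might look roughly like the sketch below. It is reconstructed from the description and the assembly in this comment, not copied from the PR attachment: only the names footype and test appear in the comment, while the uint8_t element type, the 4-byte vector size, the memcpy-based loads/stores and the 16-byte trip count are assumptions. The point is that a 4-byte generic vector has no matching vector mode on a typical x86_64 configuration and so ends up with an integer mode (SImode), which is what lets it be mistaken for a scalar.

```c
#include <stdint.h>
#include <string.h>

/* Assumed shape: a GNU C generic vector of four bytes.  With no
   4-byte vector mode available, such a type gets an integer mode.  */
typedef uint8_t footype __attribute__ ((vector_size (4)));

/* XOR a 4-byte mask over 16 bytes at PTR, four bytes at a time.
   The loop body already operates on a vector type, so the loop
   vectorizer sees vector-typed statements in the IL it is asked
   to vectorize.  */
void
test (uint8_t *ptr, const uint8_t *mask)
{
  footype mv;
  memcpy (&mv, mask, sizeof (mv));        /* load the 4-byte mask */

  for (unsigned i = 0; i < 16; i += 4)
    {
      footype temp;
      memcpy (&temp, &ptr[i], sizeof (temp));   /* load 4 bytes */
      temp ^= mv;                               /* element-wise XOR */
      memcpy (&ptr[i], &temp, sizeof (temp));   /* store 4 bytes */
    }
}
```

Under these assumptions the assembly above is the natural result: movd loads the 4-byte mask, pshufd $0 broadcasts it to all four 32-bit lanes, and a single pxor covers the whole 16-byte block.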