https://gcc.gnu.org/bugzilla/show_bug.cgi?id=31485
--- Comment #14 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Joel Yliluoma from comment #13)
> GCC 4.1.2 is indicated in the bug report headers.
> Luckily, Compiler Explorer has a copy of that exact version, and it indeed
> vectorizes the second function: https://godbolt.org/z/DC_SSb
>
> On my own system, the earliest I have is 4.6. The Compiler Explorer has 4.4,
> and it, or anything newer than that, no longer vectorizes either function.

Ah, OK - that's before GCC learned vectorization and is code-generated
by RTL expanding

  return {BIT_FIELD_REF <a, 128, 0> + BIT_FIELD_REF <b, 128, 0>};

so the only vector support was GCC's generic vectors (and intrinsics).
The generated code is far from perfect though.

I also think LLVM's code generation is bogus since it appears the ABI
does not guarantee zeroed upper elements of the xmm0 argument,
which means they could contain sNaNs:

typedef float ss2 __attribute__((vector_size(8)));
typedef float ss4 __attribute__((vector_size(16)));

ss2 add2(ss2 a, ss2 b);

void bar(ss4 a)
{
  volatile ss2 x;
  x = add2 ((ss2){a[0], a[1]}, (ss2){a[0], a[1]});
}

produces

bar:
.LFB1:
        .cfi_startproc
        subq    $56, %rsp
        .cfi_def_cfa_offset 64
        movdqa  %xmm0, %xmm1
        call    add2
        movq    %xmm0, 24(%rsp)
        addq    $56, %rsp

which means we pass through 'a' unchanged, upper elements included.
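
To illustrate the concern about non-zeroed upper elements, here is a minimal sketch using generic vectors. It is not what GCC or LLVM emit; it only shows one way a two-element add could be done with a full-width four-element addition without touching unknown upper lanes, by zero-padding the inputs explicitly first. The helper names widen_zero and add2_padded are hypothetical; the typedefs follow the test case above.

```c
typedef float ss2 __attribute__((vector_size(8)));
typedef float ss4 __attribute__((vector_size(16)));

/* Hypothetical helper: widen a 2-element vector to 4 elements,
   setting the upper two lanes to zero explicitly instead of relying
   on whatever the caller left in the register. */
static ss4 widen_zero(ss2 v)
{
  ss4 r = { v[0], v[1], 0.0f, 0.0f };
  return r;
}

/* Hypothetical helper: add two 2-element vectors via a 4-element add.
   Because both inputs are zero-padded, lanes 2 and 3 compute 0 + 0
   and cannot see a stray sNaN. */
static ss2 add2_padded(ss2 a, ss2 b)
{
  ss4 s = widen_zero(a) + widen_zero(b);
  ss2 r = { s[0], s[1] };
  return r;
}
```

Whether a compiler actually needs such padding depends on what the ABI promises about the upper elements, which is the open question above.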