https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116274
Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |crazylht at gmail dot com

--- Comment #3 from Richard Biener <rguenth at gcc dot gnu.org> ---
struct a { int x,y,z,w; };
int test(struct a a) { return a.x+a.y+a.z+a.w; }

behaves similarly.  I do have a patch for the vectorizer costing that avoids
vectorizing in these cases.  We will still vectorize

struct a { short a0,a1,a2,a3,a4,a5,a6,a7; };
short test(struct a a) { return a.a0+a.a1+a.a2+a.a3+a.a4+a.a5+a.a6+a.a7; }

generating

test:
.LFB0:
        .cfi_startproc
        movaps  %xmm1, -24(%rsp)
        movq    -16(%rsp), %rdx
        movq    %rdi, %xmm1
        movq    %rsi, %xmm3
        pinsrq  $1, %rdx, %xmm1
        punpcklqdq      %xmm3, %xmm1
        movaps  %xmm1, -24(%rsp)
        movdqa  %xmm1, %xmm2
        pinsrq  $1, -16(%rsp), %xmm2
        movdqa  %xmm2, %xmm0
        psrldq  $8, %xmm0
        paddw   %xmm1, %xmm0
        movdqa  %xmm0, %xmm1
        psrldq  $4, %xmm1
        paddw   %xmm1, %xmm0
        movdqa  %xmm0, %xmm1
        psrldq  $2, %xmm1
        paddw   %xmm1, %xmm0
        pextrw  $0, %xmm0, %eax
        ret

as opposed to

test:
.LFB0:
        .cfi_startproc
        movl    %edi, %eax
        movq    %rdi, %rdx
        sarl    $16, %eax
        salq    $16, %rdx
        addl    %edi, %eax
        sarq    $48, %rdx
        addl    %edx, %eax
        sarq    $48, %rdi
        movl    %esi, %edx
        addl    %edi, %eax
        sarl    $16, %edx
        addl    %esi, %eax
        addl    %edx, %eax
        movq    %rsi, %rdx
        sarq    $48, %rsi
        salq    $16, %rdx
        sarq    $48, %rdx
        addl    %edx, %eax
        addl    %esi, %eax
        ret

it still has the odd (dead)

        movaps  %xmm1, -24(%rsp)
        movq    -16(%rsp), %rdx

The

        movaps  %xmm1, -24(%rsp)
        movdqa  %xmm1, %xmm2
        pinsrq  $1, -16(%rsp), %xmm2

codegen is probably an RA/LRA artifact caused by bad instruction constraints
and the refusal to reload to a gpr.  Not sure if a move high to gpr is a
thing, pextrq would work for sure.  But an unpck looks like a better match
anyway.
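
For anyone wanting to look at the codegen themselves, the two testcases from this comment can be put into one translation unit, roughly as sketched below. The names a4/a8 and test_int/test_short are hypothetical renames so both cases can coexist (the comment uses struct a and test for each); the function bodies are unchanged. The compiler options are not stated in this comment, so compiling with something like -O2 -S on x86-64 is only an assumption.

```c
/* Sketch: both testcases from the comment in a single file.
   Struct and function names are renamed here (hypothetical);
   the bodies match the comment.  */

/* First testcase: four ints passed by value and summed. */
struct a4 { int x, y, z, w; };

int test_int(struct a4 a)
{
    return a.x + a.y + a.z + a.w;
}

/* Second testcase: eight shorts passed by value and summed.
   The operands are promoted to int for the additions and the
   result is converted back to short on return. */
struct a8 { short a0, a1, a2, a3, a4, a5, a6, a7; };

short test_short(struct a8 a)
{
    return a.a0 + a.a1 + a.a2 + a.a3 + a.a4 + a.a5 + a.a6 + a.a7;
}
```

The file has no main on purpose: it is meant to be compiled to assembly and inspected, not run.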