https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108141
Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |crazylht at gmail dot com
             Target|                            |i?86-*-*

--- Comment #2 from Richard Biener <rguenth at gcc dot gnu.org> ---
it looks better overall but the key difference is:

-       movzwl  8(%ebp), %edi
+       movzwl  8(%ebp), %eax
+       movw    %ax, 30(%esp)
...
-       vmovd   %edi, %xmm1
+       vpbroadcastw    30(%esp), %xmm2
+       vpbroadcastw    30(%esp), %ymm0
...
-       vmovd   %edi, %xmm0
...
-       vpbroadcastw    %xmm1, %xmm1
-       vpbroadcastw    %xmm0, %ymm0

I wonder whether optimal would be

        vpbroadcastw    8(%ebp), %xmm2
        vpbroadcastw    8(%ebp), %ymm0

though.
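
For readers who want to look at this kind of code generation themselves, here is a minimal sketch of a function whose 16-bit argument is likely to be broadcast into a vector register when the loop is vectorized. It is not the testcase from this bug: the name fill16 and its signature are hypothetical, and the flags -m32 -O3 -mavx2 are an assumption about a setup where the argument arrives on the stack. Whether the compiler broadcasts from a register, from a spill slot, or directly from the incoming argument slot depends on the GCC version and options.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical example, not taken from the bug report.
   Stores the same 16-bit value into every element of dst.
   On ia32 the argument `value` is passed on the stack, so a
   vectorized version of this loop needs to broadcast a 16-bit
   stack value into a vector register before the stores.  */
void fill16(int16_t *dst, size_t n, int16_t value)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = value;
}
```

Compiling with something like `gcc -m32 -O3 -mavx2 -S` and searching the output for vpbroadcastw shows which operand form the compiler picked.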