http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50567
--- Comment #2 from Siddhesh Poyarekar <siddhesh.poyarekar at gmail dot com> 2011-09-29 15:24:52 UTC ---
Thanks, that eliminated the spill to stack. The extra xmm1 to xmm0 move still
remains:

process:
.LFB0:
        .cfi_startproc
        movq    (%rdi), %rax
        cmpq    %rsi, %rdi
        movq    %rax, %rdx
        jae     .L2
        movq    (%rsi), %rdx
.L2:
        movd    %rdx, %xmm1
        pinsrq  $1, %rax, %xmm1
        movdqa  %xmm1, %xmm0
        ret