https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62080
--- Comment #3 from Benjamin Schindler <beschindler at gmail dot com> ---
I just looked at what gcc-4.9.1 does, and it does vary.

First version:

movdqu (%rsi), %xmm1
movdqu (%rdi), %xmm0 <--
pminsd %xmm1, %xmm0 <--
pxor %xmm1, %xmm1
pmaxsd %xmm1, %xmm0
movaps %xmm0, (%rsi)

So the first version still has a needless movdqu (I don't know how much it hurts).

Second version:

movdqa (%rsi), %xmm0
pminsd (%rdi), %xmm0 <-- good
pxor %xmm1, %xmm1
movdqu %xmm0, %xmm0 <-- bad?
pmaxsd %xmm1, %xmm0
movaps %xmm0, (%rsi)

So gcc-4.9 fares better in that it does not go to memory, but it emits an odd mov instruction. Maybe this is a separate issue?
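For readers following along, here is a minimal scalar sketch in C of what both instruction sequences above compute on four packed signed 32-bit lanes: an element-wise minimum (pminsd) followed by a maximum against zero (pxor + pmaxsd), stored back through %rsi. This is only a model of the assembly, not the test case from the bug report. The function name `min_then_clamp_zero` is hypothetical, and the mapping of %rdi/%rsi to the first and second arguments assumes the System V x86-64 calling convention.

```c
#include <stdint.h>

/* Hypothetical scalar model of the SSE4.1 sequence quoted above.
 * rdi_src plays the role of (%rdi), rsi_dst the role of (%rsi);
 * that mapping is an assumption based on the System V x86-64 ABI. */
void min_then_clamp_zero(const int32_t *rdi_src, int32_t *rsi_dst)
{
    for (int i = 0; i < 4; ++i) {
        /* pminsd: signed 32-bit minimum of the two inputs */
        int32_t v = rsi_dst[i] < rdi_src[i] ? rsi_dst[i] : rdi_src[i];
        /* pxor %xmm1,%xmm1 + pmaxsd: signed maximum against zero */
        v = v > 0 ? v : 0;
        /* movaps %xmm0,(%rsi): write the result back in place */
        rsi_dst[i] = v;
    }
}
```

Both listings implement this same computation. They differ only in whether the (%rdi) operand is loaded into a register first or folded into pminsd as a memory operand, and in the extra register-to-register movdqu in the second listing.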