http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57954
--- Comment #11 from Evgeniy Dushistov <dushistov at mail dot ru> ---
(In reply to Yuri Rumyantsev from comment #9)
> I checked that zeroing of xmm register before conversion leads to
> performance slowdown on SLM (-5%) for provided test-case. I assume that
>
> with H.J patch we got the following assembly (I compiled it for slm but it
> does not matter):
>
> .L3:
>         xorps     %xmm0, %xmm0
>         cvtsi2ss  %eax, %xmm0
>         movss     %xmm0, (%ecx,%eax,4)
>         addl      $1, %eax
>         cmpl      %edx, %eax
>         jne       .L3
>

By the way, I tried compiling my sample
(http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57988) for atom, and
icc (13.1.3 20130607) produces:

        xorps     %xmm2,%xmm2
        cvtsi2sd  %rax,%xmm2

Could the 5% be measurement error?
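For context, a loop of roughly the following shape is the kind of C source that would give the quoted .L3 loop (one int-to-float conversion and one store per iteration). This is only a sketch: the name fill_floats and its signature are hypothetical, and it is not the test case attached to the PR.

```c
/* Hypothetical reconstruction, not the actual test case from the PR.
 * Each iteration converts the loop counter to float and stores it.
 * With SSE math on x86 the conversion is typically a cvtsi2ss, which
 * writes only the low 32 bits of its destination xmm register; the
 * xorps discussed above zeroes the register first so the conversion
 * does not depend on the register's previous contents. */
void fill_floats(float *a, int n)
{
    for (int i = 0; i < n; i++)
        a[i] = (float)i;
}
```

Comparing the generated code with and without the patch (for example via gcc -O2 -S with the -march value of interest) should show whether the xorps is emitted.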