http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48037
Marc Glisse <glisse at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |glisse at gcc dot gnu.org
--- Comment #9 from Marc Glisse <glisse at gcc dot gnu.org> 2012-10-23 18:42:28 UTC ---
(In reply to comment #8)
> The useless reg-reg moves prevail:
>
> _Z6vsqrt2U8__vectord:
> .LFB520:
>     .cfi_startproc
>     sqrtsd %xmm0, %xmm1
>     unpckhpd %xmm0, %xmm0
>     movapd %xmm1, %xmm2
>     sqrtsd %xmm0, %xmm0
>     unpcklpd %xmm0, %xmm2
>     movapd %xmm2, %xmm0
>     ret
>
> both movapds can be avoided by better register allocation.
With LRA we get one movapd fewer:
_Z6vsqrt2U8__vectord:
.LFB521:
    .cfi_startproc
    sqrtsd %xmm0, %xmm2
    unpckhpd %xmm0, %xmm0
    sqrtsd %xmm0, %xmm1
    movapd %xmm2, %xmm0
    unpcklpd %xmm1, %xmm0
    ret
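
For readers without the testcase at hand: this comment does not show the source, so the following is only a guess at what kind of function produces such code. The mangled name suggests a function vsqrt2 taking a 2 x double vector, and the listing takes a scalar square root of each lane (sqrtsd on the low lane, unpckhpd to bring the high lane down, sqrtsd again, unpcklpd to recombine). The typedef name v2df and the function body are assumptions, not the reporter's code.

```cpp
#include <cmath>

// Two doubles packed into one 16-byte vector (GCC/Clang vector extension).
// The name v2df is a hypothetical choice for this sketch.
typedef double v2df __attribute__((vector_size(16)));

// Assumed shape of the testcase: square root of each lane, computed
// separately and then recombined into a single vector.
v2df vsqrt2(v2df x)
{
    v2df r = { std::sqrt(x[0]), std::sqrt(x[1]) };
    return r;
}
```

Since the argument arrives in %xmm0 and the result must also be returned in %xmm0, the listing above still needs one movapd to put the low-lane result back into %xmm0 before the final unpcklpd.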