With -march=pentium4 -mfpmath=sse -O2, we get an extra move for code like

    double d = atof(foo);
    int i = d;

        call    atof
        fstpl   -8(%ebp)
        movsd   -8(%ebp), %xmm0
        cvttsd2si       %xmm0, %eax

(This is Linux; Darwin is similar.)  I think the difficulty is that for

(set (reg/v:DF 58 [ d ]) (reg:DF 8 st)) 64 {*movdf_nointeger}

regclass decides SSE_REGS is a zero-cost choice for 58.  That looks
wrong, since getting the value from st into an SSE register requires a
store to memory and a load back.  In fact, memory is the cheapest
overall choice for 58 (taking its use into account as well), and gcc
figures that out correctly if SSE_REGS is given a more reasonable cost.
The immediate cause is the #Y's in the constraint:

"=f#Y,m ,f#Y,*r ,o ,Y*x#f,Y*x#f,Y*x#f ,m "

and there's probably a simple fix, but it eludes me.  Advice?  Thanks.
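
For anyone who wants to reproduce this, a minimal self-contained test
case might look like the sketch below.  The function name and signature
are made up for illustration; only the two statements in the body come
from the code above.  Compiling it with something like
gcc -m32 -march=pentium4 -mfpmath=sse -O2 -S (the -m32 is only an
assumption for 64-bit hosts) should show the fstpl/movsd pair ahead of
the cvttsd2si.

```c
#include <stdlib.h>

/* Hypothetical wrapper; name and signature are illustrative only. */
int convert(const char *foo)
{
    double d = atof(foo);   /* on 32-bit x86 the result is returned in st(0) */
    int i = d;              /* truncating double-to-int conversion */
    return i;
}
```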
