On Thu, 22 May 2014, Richard Sandiford wrote:
> I can't hope to match Maciej's reply on the details, but like he says,
> my understanding was that this:
>
> > ADD.D $f20, $f10, $f10
> > MOV.D $f18, $f20
> > SWC1 $f20, 0($sp)
> > MTC1 $2, $f20
> > LWC1 $f20, 0($sp)
> > ADD.D $f16, $f18, $f20
> > ($f16 should be 4*$f10)
>
> really is required to work for -modd-spreg. Specifically I thought the
> LWC1 would force $f20 to become "uninterpreted" and then the ADD.D
> would need to (re)interpret the register pair as D-format.

But as I wrote, it has always worked, even with the MIPS I R2010 FPU. All
the CP1 transfer instructions are non-arithmetic and operate on FPRs in a
raw manner, moving bits without interpreting them in any format (note that
MOV.S and MOV.D do not fall in this class). And they actually have to, or
otherwise a sequence like this (typical for a MIPS I function epilogue):

	lwc1	$f20,0(sp)
	# <- hw interrupt taken here
	lwc1	$f21,4(sp)
	jr	ra
	addiu	sp,sp,8

would break on a system with a proper OS such as Linux. Suppose a hardware
interrupt taken right after the first LWC1 causes a context switch, and the
other process also (lazily) switches the FP context. Once back in this
process, the second LWC1 faults with a Coprocessor Unusable exception,
whose handler pulls the whole FP context back in. Once that has completed,
the restarted LWC1 replaces only one half of the double value stored in
$f20, so the half written by the first LWC1 must have survived the context
save and restore intact (note that the FP context switch handler may well
use LDC1 instructions if supported by hardware; Linux does where possible).

Yes, it happens that we restore each even-numbered FPR before its
odd-numbered counterpart, but neither the ABI nor the hardware mandates
that; we could just as well swap the order and that would have to work too.
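
To make the argument concrete, here is a rough toy model in Python. It is
only a sketch, not real hardware or kernel code: the FPU class, its methods
and the epilogue_with_interrupt helper are all made up for illustration. It
assumes a register file of 32 raw 32-bit FPRs where the even register of a
pair holds the low word of a double. It shows that if LWC1 is a raw transfer,
a context save/restore between the two loads is harmless, in either load
order.

```python
import struct


class FPU:
    """Hypothetical toy model: 32 raw 32-bit FPRs, even/odd pairs hold a double."""

    def __init__(self):
        self.regs = [0] * 32

    def lwc1(self, ft, word):
        # Raw transfer: replace the 32 bits of one register, no interpretation.
        self.regs[ft] = word & 0xFFFFFFFF

    def read_double(self, fs):
        # Arithmetic instructions interpret the pair as D format.
        # Assumption of this model: the even register holds the low word.
        lo, hi = self.regs[fs], self.regs[fs + 1]
        return struct.unpack("<d", struct.pack("<II", lo, hi))[0]

    def save_context(self):
        return list(self.regs)

    def restore_context(self, saved):
        self.regs = list(saved)


def split_double(value):
    """Return the (low, high) 32-bit words of a double."""
    lo, hi = struct.unpack("<II", struct.pack("<d", value))
    return lo, hi


def epilogue_with_interrupt(value, odd_first=False):
    """Reload $f20/$f21 with a context switch between the two LWC1s."""
    lo, hi = split_double(value)
    fpu = FPU()
    # Stale contents of the pair before the epilogue runs.
    fpu.regs[20], fpu.regs[21] = split_double(-1.5)

    loads = [(20, lo), (21, hi)]
    if odd_first:
        loads.reverse()

    fpu.lwc1(*loads[0])             # first LWC1
    saved = fpu.save_context()      # interrupt: FP context saved
    fpu.regs = [0xDEADBEEF] * 32    # another process uses the FPU
    fpu.restore_context(saved)      # Coprocessor Unusable handler restores it
    fpu.lwc1(*loads[1])             # second LWC1 is restarted
    return fpu.read_double(20)


if __name__ == "__main__":
    print(epilogue_with_interrupt(4.0))
    print(epilogue_with_interrupt(4.0, odd_first=True))
```

In this model the double reads back correctly whichever half is loaded
first, because neither the loads nor the save/restore attach any format to
the register contents.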
Maciej