http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56788
--- Comment #8 from Uroš Bizjak <ubizjak at gmail dot com> ---
(In reply to Marc Glisse from comment #7)
> Ah no, I was wrong, sorry about that:
>
> The VFRCZSS and VFRCZSD instructions extract the fractional portion of the
> single-/double-precision scalar floating-point value in an XMM register or
> 32- or 64-bit memory location and writes the result in the lower element of
> the destination register. The upper elements of the destination XMM register
> are unaffected by the operation, while the upper 128 bits of the
> corresponding YMM register are cleared to zeros
>
> http://support.amd.com/TechDocs/43479.pdf

Hm from the same document, I read (v3.04, page 122) for vfrczsd:

When the result is written to the destination XMM register, the upper quadword
of the destination register and the upper 128-bits of the corresponding YMM
register are cleared to zeros.

Page 126, vfrczss:

When the result is written to the destination XMM register, the upper three
doublewords of the destination register and the upper 128-bits of the
corresponding YMM register are cleared to zeros. The upper 224 bits of the YMM
destination register are cleared to zeros.

So, the instruction itself *does* clear upper bits to zero.
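
As a rough illustration of the page 122 wording for vfrczsd, here is a minimal scalar model in plain C. It is only a sketch and is not taken from the AMD document or from GCC: the type xmm_pd and the functions frac_part and model_vfrczsd are hypothetical names, and "fractional portion" is assumed to mean x - trunc(x) for values that fit in a long long. It does not execute the real instruction.

```c
/* Hypothetical model of a 128-bit XMM register viewed as two doubles.
   lane[0] is the low quadword, lane[1] is the upper quadword. */
typedef struct {
    double lane[2];
} xmm_pd;

/* Assumption: the fractional portion is x - trunc(x).
   The cast truncates toward zero and is only valid when x fits in long long. */
static double frac_part(double x)
{
    return x - (double)(long long)x;
}

/* Models the reading quoted from page 122: the low element receives the
   fractional part of the source, and the upper quadword of the destination
   is cleared to zero rather than preserved. */
static xmm_pd model_vfrczsd(xmm_pd src)
{
    xmm_pd dst;
    dst.lane[0] = frac_part(src.lane[0]);
    dst.lane[1] = 0.0; /* upper quadword cleared */
    return dst;
}
```

Under this model, a source of { 2.75, 42.0 } yields 0.75 in the low element and 0.0 in the upper element, which is the behaviour that differs from the "upper elements are unaffected" description quoted in comment #7.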