https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84508
--- Comment #23 from Jeffrey Walton <noloader at gmail dot com> ---
(In reply to Peter Cordes from comment #22)
> [...]
> That instruction is useless and should never be used in asm except for
> code-alignment reasons (1 byte longer than MOVLPS, same length as MOVSD, all
> three doing the same thing for the memory-destination form). But easy to
> imagine some code using that intrinsic to store an unaligned double into a
> byte buffer.

Reading from and writing to an unaligned byte stream in 4- or 8-byte chunks
is our use case. Eventually, we need to perform traditional SIMD processing,
but the loads and stores have to use these old intrinsics because of the word
types, the data stream format, and the ISAs we support.

I believe the other option is to memcpy the byte stream into a properly
aligned intermediate buffer. But that could incur a performance hit if the
optimizer fails to elide the memcpy.
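For illustration only, here is a minimal sketch of the memcpy alternative mentioned above, written in portable C with no intrinsics. The helper names (load_u64, store_u64, xor_stream_u64) are hypothetical and are not taken from any real code base. Compilers commonly lower a fixed-size memcpy like this to a single unaligned move at -O2, but that is an optimization and not a guarantee, which is exactly the concern raised above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: read one 8-byte word from any byte address.
 * memcpy avoids dereferencing a misaligned uint64_t pointer, which
 * would be undefined behavior. */
static inline uint64_t load_u64(const uint8_t *p)
{
    uint64_t v;
    memcpy(&v, p, sizeof v);
    return v;
}

/* Hypothetical helper: write one 8-byte word to any byte address. */
static inline void store_u64(uint8_t *p, uint64_t v)
{
    memcpy(p, &v, sizeof v);
}

/* Process a byte stream in place, 8 bytes at a time. The XOR stands in
 * for whatever real per-word processing is needed; len is assumed to be
 * a multiple of 8. */
static void xor_stream_u64(uint8_t *buf, size_t len, uint64_t key)
{
    assert(len % 8 == 0);
    for (size_t i = 0; i < len; i += 8)
        store_u64(buf + i, load_u64(buf + i) ^ key);
}
```

Whether the memcpy calls are really elided depends on the compiler, target, and optimization level, so checking the generated assembly for the platforms of interest is the only way to be sure.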