https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80647

--- Comment #2 from Yale Zhang <yzhang1985 at gmail dot com> ---
Very interesting case. First, I didn't know unaligned loads were undefined
behavior on x86.
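
For reference, the undefined behavior is at the C language level: dereferencing
a pointer that is not suitably aligned for its type is undefined, even though
x86 hardware usually tolerates the access. A minimal sketch of a well-defined
unaligned load, assuming a 32-bit value (the helper name load_u32_unaligned is
hypothetical and not taken from the code in this bug):

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: read a 32-bit value from an address of any alignment.
 * memcpy() has no alignment requirement, so this is well-defined C. */
static uint32_t load_u32_unaligned(const void *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return v;
}
```

Compilers typically turn a fixed-size memcpy() like this into a single load
instruction on x86, so the portable form should not cost anything.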

ICC 17 doesn't vectorize the loop, probably because the destination and source
of the memmove() alias.

But apparently GCC knows how to vectorize memmove(). In this function, the
destination always comes before the source, so it's trivial to vectorize.
Vectorizing the case where destination > source is harder, and I wonder if GCC
can do that.
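
To illustrate why the direction matters, here is a minimal sketch of an
overlap-safe byte copy. It is not the code from this bug, and the name
overlap_copy is hypothetical. When the destination comes before the source, a
forward loop reads every source byte before it can be overwritten, and the same
holds if each iteration loads a whole vector before storing it. When the
destination comes after the source, the copy has to run backwards.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: copy n bytes between possibly overlapping regions. */
static void overlap_copy(unsigned char *dst, const unsigned char *src, size_t n)
{
    if ((uintptr_t)dst <= (uintptr_t)src) {
        /* Destination before source: copy forwards. Each store lands at or
         * below the address just read, so no unread byte is clobbered. */
        for (size_t i = 0; i < n; i++)
            dst[i] = src[i];
    } else {
        /* Destination after source: copy backwards, for the same reason
         * mirrored. */
        for (size_t i = n; i > 0; i--)
            dst[i - 1] = src[i - 1];
    }
}
```

For example, overlap_copy(buf, buf + 2, 6) takes the forward branch and
overlap_copy(buf + 2, buf, 6) takes the backward one.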


This is legacy code from more than 10 years ago. The manually vectorized
memmove() was too clever for modern compilers.

But the solution is simple. I'll just use the simple fallback implementation
that is used on unknown platforms. The compiler can still vectorize it.

Thanks, Andrew.
