https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70055
--- Comment #5 from Wilco <wdijkstr at arm dot com> ---
(In reply to Jakub Jelinek from comment #3)
> If some arch in glibc implements memcpy.S and does not implement mempcpy.S,
> then obviously the right fix is to add mempcpy.S for that arch, usually it
> is just a matter of #include memcpy.S with some define USE_AS_MEMPCPY, and
> change a couple of instructions in the assembly. You don't need to remember
> the original value of dest, usually you have the address to which you store
> bytes in some register, so it is just a matter of copying it to the return
> register.

We had a long discussion on this at the time. Very few targets have implemented
mempcpy.S, so it would be a lot of effort to do so. Sharing a single
implementation with memcpy is typically not possible without slowing down memcpy
(which is more important), and thus you end up with two separate
implementations, which creates unnecessary cache pollution.

So overall I believe the best option for most targets is to defer to memcpy with
an extra add instruction to compute the end. We could do that in GCC rather than
in the GLIBC headers, but then it would be harder to actually call mempcpy in
the case that a target supports an optimized implementation.
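
To make "defer to memcpy with an extra add" concrete, here is a minimal sketch in C. The name mempcpy_via_memcpy is hypothetical and chosen for illustration only; this is not the actual GLIBC header code or what GCC emits, just the shape of the transformation under discussion.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not glibc's or GCC's actual code): implements the
   mempcpy contract by calling memcpy and adding n to the result. */
static inline void *mempcpy_via_memcpy(void *dest, const void *src, size_t n)
{
    /* memcpy returns dest, so dest + n is the byte just past the copy. */
    return (char *)memcpy(dest, src, n) + n;
}
```

The only cost over a plain memcpy call is the one add on the returned pointer, and the target needs no separate mempcpy.S. A target that does have an optimized mempcpy would presumably want to bypass a wrapper like this and call the real function.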