On Wed, 2015-02-11 at 08:53 +0100, leroy christophe wrote:
> In powerpc32 architecture there is a function called cacheable_memcpy()
> which does same thing as memcpy() but using dcbz/dcbt instructions for
> an optimised copy (just like __copy_tofrom_user())
> What seems strange is that it is almost nowhere used (only used in
> drivers/net/ethernet/ibm/emac/core.c)
>
> For a try I replaced all memcpy() in include/linux/skbuff.h and
> net/core/skbuff.c by cacheable_memcpy() and I got around 8% improvement
> on FTP throughput on MPC885.
>
> What could be done to generalise the use of cacheable_memcpy() instead
> of memcpy() whenever possible ?
> Indeed, in order to use cacheable_memcpy(), we need
> * The destination to be cacheable
> * The source and destination to not overlap on the same cachelines
>
> Could we check, when calling memcpy(), whether the destination is
> cacheable or not, and if yes redirect the call to cacheable_memcpy() ?
> How can we check that ?
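
For illustration, the two conditions listed above could be sketched in plain C roughly as follows. This is not the kernel's code: the 16-byte line size, the helper names and the dst_is_cacheable flag are all assumptions made up for the example. How to actually find out whether the destination is cacheable is the open question, so here it is simply left to the caller.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 16-byte data cache lines. Adjust for the target core. */
#define CACHE_LINE_BYTES 16u

/*
 * Hypothetical helper: true if [dst, dst+n) and [src, src+n) touch at
 * least one common cache line. Only addresses are compared; nothing is
 * dereferenced.
 */
static bool ranges_share_cacheline(const void *dst, const void *src, size_t n)
{
	uintptr_t mask = ~(uintptr_t)(CACHE_LINE_BYTES - 1);
	uintptr_t d_first, d_last, s_first, s_last;

	if (n == 0)
		return false;

	/* Base address of the first and last line touched by each range. */
	d_first = (uintptr_t)dst & mask;
	d_last = ((uintptr_t)dst + n - 1) & mask;
	s_first = (uintptr_t)src & mask;
	s_last = ((uintptr_t)src + n - 1) & mask;

	/* Two closed intervals of lines intersect iff each starts before the other ends. */
	return d_first <= s_last && s_first <= d_last;
}

/*
 * Hypothetical predicate combining both conditions. dst_is_cacheable
 * must be supplied by the caller; this sketch does not determine it.
 */
static bool can_use_cacheable_copy(const void *dst, const void *src,
				   size_t n, bool dst_is_cacheable)
{
	return dst_is_cacheable && !ranges_share_cacheline(dst, src, n);
}
```

A memcpy() wrapper could fall back to the ordinary copy whenever can_use_cacheable_copy() returns false, but the cost of the check on every call would have to be weighed against the gain.
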

Additionally we could have a P8 implementation that uses unaligned vectors.

Adding Anton to the CC list.

Cheers,
Ben.

_______________________________________________
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev