On Thursday 19 June 2008, Mark Nelson wrote:
> * __copy_tofrom_user routine optimized for CELL-BE-PPC

A few things I noticed:

* You don't have a page-wise user copy, which the regular code has.
  This is probably not so noticeable in iperf, but should have a
  significant impact on lmbench and on a number of file system tests
  that copy large amounts of data. Have you checked that the loop
  around cache lines is just as fast?

* You don't align the source to word size, only the target. Does this
  get handled correctly when the source is a noncacheable mapping,
  e.g. an unaligned copy_from_user where the source points to a
  physical local store mapping of an SPU? I don't think we need to
  optimize this case for performance, but I'm not sure if it would
  crash. AFAIR, unaligned loads from noncacheable storage give you
  an alignment exception that you need to handle, right?

* The naming of the labels (with just numbers) is rather confusing.
  It would be good to have something better, but I must admit that
  I don't have a good idea either.

* The trick of using the condition code in cr7 for the last bytes is
  really cute, but are the four branches actually better than a single
  computed branch into the middle of 15 byte-wise copies?

	Arnd <><
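
To make the last point concrete, here is a rough C sketch of the two ways
to copy the trailing 0..15 bytes. It is only an illustration, not the
PowerPC assembly from the patch: the function names tail_copy_bits and
tail_copy_jump are made up, the remaining length is assumed to be at most
15, and which variant is actually faster on Cell would have to be measured.

```c
#include <stddef.h>
#include <string.h>

/* Variant A: test each bit of the remaining length (8, 4, 2, 1), which
 * needs four conditional branches. Hypothetical C rendering of the idea
 * behind the cr7 trick. Assumes n <= 15. */
static void tail_copy_bits(unsigned char *dst, const unsigned char *src,
                           size_t n)
{
    if (n & 8) { memcpy(dst, src, 8); dst += 8; src += 8; }
    if (n & 4) { memcpy(dst, src, 4); dst += 4; src += 4; }
    if (n & 2) { memcpy(dst, src, 2); dst += 2; src += 2; }
    if (n & 1) { *dst = *src; }
}

/* Variant B: a single computed jump into a run of 15 byte-wise copies.
 * Every case falls through on purpose. Assumes n <= 15; larger values
 * copy nothing. */
static void tail_copy_jump(unsigned char *dst, const unsigned char *src,
                           size_t n)
{
    switch (n) {
    case 15: *dst++ = *src++; /* fall through */
    case 14: *dst++ = *src++; /* fall through */
    case 13: *dst++ = *src++; /* fall through */
    case 12: *dst++ = *src++; /* fall through */
    case 11: *dst++ = *src++; /* fall through */
    case 10: *dst++ = *src++; /* fall through */
    case 9:  *dst++ = *src++; /* fall through */
    case 8:  *dst++ = *src++; /* fall through */
    case 7:  *dst++ = *src++; /* fall through */
    case 6:  *dst++ = *src++; /* fall through */
    case 5:  *dst++ = *src++; /* fall through */
    case 4:  *dst++ = *src++; /* fall through */
    case 3:  *dst++ = *src++; /* fall through */
    case 2:  *dst++ = *src++; /* fall through */
    case 1:  *dst++ = *src++; /* fall through */
    case 0:  break;
    }
}
```

Variant A does at most four wider copies behind four branches; variant B
takes one indirect branch but then does up to 15 single-byte copies, so
the trade-off depends on branch cost versus load/store throughput.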