Agner Fog wrote:
> Basile STARYNKEVITCH wrote:
>> At last, at the recent (July 2008) GCC summit, someone (sorry I forgot
>> who, probably someone from SuSE) proposed in a BOFS to have architecture
>> and machine specific hand-tuned (or even hand-written assembly) low
>> level libraries for such basic things as memset etc.
>
> That's exactly what I meant. The most important memory, string and math
> functions should use hand-tuned assembly with CPU dispatching for the
> latest instruction sets. My experiments show that the speed can be
> improved by a factor of 3 - 10 for unaligned memcpy on Intel processors
> (http://www.agner.org/optimize/optimizing_cpp.pdf page 12).

Is this still true if you have to go through the PLT to make a
position-independent call? That's the most common case for userspace on
GNU/Linux.

Andrew.
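
To make the question concrete, here is a rough sketch in C of the kind of CPU dispatching being discussed. It is only an illustration under stated assumptions, not anyone's actual implementation: the names my_memcpy, copy_generic, copy_sse2 and pick_copy are hypothetical, copy_sse2 is a stand-in that simply forwards to memcpy where a hand-tuned routine would go, and the feature test assumes a compiler that provides __builtin_cpu_supports (newer GCC and Clang on x86). Every call goes through a function pointer, so the dispatcher itself costs one indirect branch per call, which is roughly the same kind of overhead as a call through the PLT.

```c
#include <stddef.h>
#include <string.h>

typedef void *(*copy_fn)(void *, const void *, size_t);

/* Portable fallback: plain byte-by-byte copy. */
static void *copy_generic(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;
    while (n--)
        *d++ = *s++;
    return dst;
}

/* Hypothetical stand-in for a hand-tuned SSE2 routine.
   A real one would be written in assembly or intrinsics;
   this placeholder just forwards to the library memcpy. */
static void *copy_sse2(void *dst, const void *src, size_t n)
{
    return memcpy(dst, src, n);
}

/* Choose an implementation based on what the running CPU supports. */
static copy_fn pick_copy(void)
{
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init();
    if (__builtin_cpu_supports("sse2"))
        return copy_sse2;
#endif
    return copy_generic;
}

static void *copy_first_call(void *dst, const void *src, size_t n);

/* Starts out pointing at the resolver; rewritten on the first call.
   Note: this lazy update is not synchronized between threads. */
static copy_fn copy_impl = copy_first_call;

/* First call only: resolve the implementation, then do the copy. */
static void *copy_first_call(void *dst, const void *src, size_t n)
{
    copy_impl = pick_copy();
    return copy_impl(dst, src, n);
}

/* Public entry point: one indirect call per invocation. */
void *my_memcpy(void *dst, const void *src, size_t n)
{
    return copy_impl(dst, src, n);
}
```

Whether the tuned routine still comes out ahead once that indirect branch is paid on every call presumably depends on the copy size; for very small copies the call overhead would be a larger share of the total.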