On Wed, Feb 10, 2010 at 10:10:25PM +1100, Anton Blanchard wrote:
>
> Nick Piggin discovered that lwsync barriers around locks were faster than isync
> on 970. That was a long time ago and I completely dropped the ball in testing
> his patches across other ppc64 processors.
>
> Turns out the idea helps on other chips. Using a microbenchmark that
> uses a lot of threads to contend on a global pthread mutex (and therefore a
> global futex), POWER6 improves 8% and POWER7 improves 2%. I checked POWER5
> and while I couldn't measure an improvement, there was no regression.
>
> This patch uses the lwsync patching code to replace the isyncs with lwsyncs
> on CPUs that support the instruction. We were marking POWER3 and RS64 as lwsync
> capable but in reality they treat it as a full sync (ie slow). Remove the
> CPU_FTR_LWSYNC bit from these CPUs so they continue to use the faster isync
> method.
>
> Signed-off-by: Anton Blanchard <an...@samba.org>

Turns out this one hurts PA6T performance quite a bit; lwsync seems to be
significantly more expensive there. I see a 25% drop in the microbenchmark
doing pthread_mutex_lock/unlock loops on two cpus.

Taking out CPU_FTR_LWSYNC for PA6T will solve it, which is a bit unfortunate
since the sync->lwsync changes definitely still can, and should, be done.

-Olof
_______________________________________________
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev
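
For anyone wanting to reproduce this kind of measurement, here is a rough sketch of a lock/unlock contention loop like the one described above. It is not the benchmark used for the numbers in this thread. The names run_contention, worker and worker_arg are made up for illustration, and the thread and iteration counts are left to the caller.

```c
/* Hypothetical sketch: several threads contend on one global pthread mutex.
 * Not the benchmark from this thread; all names here are illustrative. */
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <time.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static long counter;

struct worker_arg {
    long iters;
};

/* Each worker takes and drops the global mutex iters times. */
static void *worker(void *p)
{
    long iters = ((struct worker_arg *)p)->iters;

    for (long i = 0; i < iters; i++) {
        pthread_mutex_lock(&lock);
        counter++;
        pthread_mutex_unlock(&lock);
    }
    return NULL;
}

/* Runs nthreads workers and returns the final counter value, or -1 if
 * allocation fails. If secs is non-NULL, the elapsed wall-clock time in
 * seconds is stored there. */
long run_contention(int nthreads, long iters, double *secs)
{
    struct worker_arg arg = { iters };
    struct timespec t0, t1;
    pthread_t *tids;
    int started = 0;

    assert(nthreads > 0);
    tids = malloc((size_t)nthreads * sizeof(*tids));
    if (!tids)
        return -1;

    counter = 0;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (int i = 0; i < nthreads; i++) {
        if (pthread_create(&tids[i], NULL, worker, &arg) != 0)
            break;
        started++;
    }
    /* Join only the threads that were actually created. */
    for (int i = 0; i < started; i++)
        pthread_join(tids[i], NULL);
    clock_gettime(CLOCK_MONOTONIC, &t1);

    if (secs)
        *secs = (double)(t1.tv_sec - t0.tv_sec) +
                (double)(t1.tv_nsec - t0.tv_nsec) / 1e9;
    free(tids);
    return counter;
}
```

Calling run_contention(2, n, &secs) from a small main() and building with -pthread should approximate the two-cpu case; comparing secs between a kernel with and without the patch is what would show a difference, since the barriers in question sit on the kernel side of the futex path.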