On Mon, May 13, 2013 at 5:50 AM, Heikki Linnakangas <hlinnakan...@vmware.com> wrote:
> pgbench -S is such a workload. With 9.3beta1, I'm seeing this profile, when
> I run "pgbench -S -c64 -j64 -T60 -M prepared" on a 32-core Linux machine:
>
> -  64.09%  postgres  postgres           [.] tas
>    - tas
>       - 99.83% s_lock
>          - 53.22% LWLockAcquire
>             + 99.87% GetSnapshotData
>          - 46.78% LWLockRelease
>               GetSnapshotData
>             + GetTransactionSnapshot
> +   2.97%  postgres  postgres           [.] tas
> +   1.53%  postgres  libc-2.13.so       [.] 0x119873
> +   1.44%  postgres  postgres           [.] GetSnapshotData
> +   1.29%  postgres  [kernel.kallsyms]  [k] arch_local_irq_enable
> +   1.18%  postgres  postgres           [.] AllocSetAlloc
> ...
>
> So, on this test, a lot of time is wasted spinning on the mutex of
> ProcArrayLock. If you plot a graph of TPS vs. # of clients, there is a
> surprisingly steep drop in performance once you go beyond 29 clients
> (attached, pgbench-lwlock-cas-local-clients-sets.png, red line). My theory
> is that after that point all the cores are busy, and processes start to be
> sometimes context switched while holding the spinlock, which kills
> performance.
I have. I also used Linux perf to come to this conclusion, and my determination was similar: the system was under increasingly heavy load, in this case with processes >> number of processors. It was also a phase-change type of event: one moment everything would be going great, and then, once a critical threshold was hit, s_lock would consume an enormous amount of CPU time. Given how extreme the behavior was, I figured at the time that preemption while holding the spinlock was to blame.
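
For reference, that kind of profile is easy to capture while the benchmark runs; roughly something like the following (the flags and durations here are illustrative, not the exact invocation I used):

    # drive the read-only workload against an already-initialized database
    pgbench -S -c64 -j64 -T60 -M prepared &

    # sample call graphs system-wide for 30 seconds while pgbench runs
    perf record -a -g -- sleep 30

    # expand the call tree; under contention, tas/s_lock dominate the report
    perf report -g

Once all cores are saturated, a backend that gets descheduled while holding the ProcArrayLock spinlock leaves the other backends spinning (and eventually sleeping) in s_lock until the holder runs again, which fits the cliff-like drop in TPS rather than a gradual one.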