We have encountered some serious SMP performance/scalability problems
that we've tracked back to lstat/namei calls. I've written a quick
benchmark with a pair of tests to simplify/measure the problem. Both
tests use a tree of directories: the top level directory contains five
subdirectories a, b, c, d, and e. Each subdirectory contains five
subdirectories a, b, c, d, and e, and so on. That gives 1 directory at
level one, 5 at level two, 25 at level three, 125 at level four, 625 at
level five, and 3125 at level six.
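
For illustration, here is a minimal Python sketch of how such a tree could be built and its per-level counts checked. It is not the actual benchmark code; the names NAMES, level_counts, and build_tree are hypothetical.

```python
import itertools
import os

NAMES = "abcde"  # the five subdirectory names used at every level


def level_counts(levels, fanout=len(NAMES)):
    """Number of directories at each level, counting the root as level one."""
    return [fanout ** i for i in range(levels)]


def build_tree(root, depth, names=NAMES):
    """Create every directory down to `depth` levels below `root`.

    Creating each leaf with makedirs also creates all of its ancestors.
    Returns the number of leaf directories.
    """
    leaves = 0
    for parts in itertools.product(names, repeat=depth):
        os.makedirs(os.path.join(root, *parts), exist_ok=True)
        leaves += 1
    return leaves
```

Calling build_tree("/tmp/lstat", 5) would create the six-level layout described above, with paths such as /tmp/lstat/a/b/c/d/e at the bottom.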
In the "realpath" test, a random path is constructed at the bottom of
the tree (e.g. /tmp/lstat/a/b/c/d/e) and realpath() is called on that,
provoking lstat() calls on the whole tree. This is to simulate a mix
of high-contention and low-contention lstat() calls.
In the "lstat" test, lstat() is called directly on a path at the bottom
of the tree. Since there are 3125 such paths, this simulates relatively
low-contention lstat() calls.
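
The two tests can be sketched roughly as follows. This is a Python approximation under stated assumptions, not the original benchmark: os.path.realpath stands in for realpath(), a fresh random leaf is assumed for every call in both tests, and the names random_leaf, realpath_op, lstat_op, and run_for are hypothetical.

```python
import os
import random
import time

NAMES = "abcde"


def random_leaf(root, depth=5, names=NAMES, rng=random):
    """Pick a random directory `depth` levels below `root`, e.g. root/a/b/c/d/e."""
    return os.path.join(root, *(rng.choice(names) for _ in range(depth)))


def realpath_op(path):
    # realpath() resolves the path one component at a time, so every
    # ancestor directory is examined, including the shared upper levels.
    os.path.realpath(path)


def lstat_op(path):
    # A single lstat() system call on the leaf itself.
    os.lstat(path)


def run_for(op, root, seconds=60.0, depth=5):
    """Call `op` on random leaves until `seconds` have elapsed; return the count."""
    deadline = time.monotonic() + seconds
    count = 0
    while time.monotonic() < deadline:
        op(random_leaf(root, depth))
        count += 1
    return count
```

With a tree already in place under /tmp/lstat, run_for(realpath_op, "/tmp/lstat") would correspond to one worker of the "realpath" test and run_for(lstat_op, "/tmp/lstat") to one worker of the "lstat" test.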
In both cases, the test repeats as many times as possible for 60
seconds. Each test is run simultaneously by multiple processes, with
progressively doubling concurrency from 1 to 512.
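
A rough sketch of the driver side, again in Python and only as an assumption about how such a harness might look: concurrency_levels produces the doubling schedule, and run_concurrent forks one process per worker and sums the counts they report. Both names are hypothetical, and the sketch has no start barrier, so workers begin at slightly different times.

```python
import os


def concurrency_levels(start=1, stop=512):
    """Doubling schedule: 1, 2, 4, ... up to and including `stop`."""
    levels = []
    n = start
    while n <= stop:
        levels.append(n)
        n *= 2
    return levels


def run_concurrent(worker, concurrency):
    """Fork `concurrency` processes that each run worker(); return the summed counts.

    Each child reports its integer result to the parent over a pipe.
    POSIX only, since it relies on os.fork().
    """
    children = []
    for _ in range(concurrency):
        read_fd, write_fd = os.pipe()
        pid = os.fork()
        if pid == 0:  # child process
            os.close(read_fd)
            status = 1
            try:
                os.write(write_fd, str(worker()).encode())
                status = 0
            finally:
                os._exit(status)  # never fall back into the parent's code
        os.close(write_fd)
        children.append((pid, read_fd))

    total = 0
    for pid, read_fd in children:
        with os.fdopen(read_fd) as pipe:
            total += int(pipe.read())
        os.waitpid(pid, 0)
    return total
```

A full run would then loop over concurrency_levels() and call run_concurrent once per level with a worker that executes the timed test loop.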
What I found was that everything is fine at concurrency 2, which probably
indicates that the benchmark is pegged on some other resource limit. At
concurrency 4, realpath drops to 31.8% of its concurrency 1 throughput. At
concurrency 8, it is down to 18.3%. Meanwhile, CPU load
goes to 80-90% system CPU. I've confirmed via ktrace and rusage
that the CPU usage is all system time, and that lstat() is the *only*
system call in the test (realpath() is called with an absolute path).
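
The percentages quoted here, and the Total/Sec and Total/Sec/Worker fields in the results at the bottom, appear to be derived from the raw call counts as in the sketch below. The truncating integer arithmetic is inferred from the numbers rather than taken from the benchmark's source, and the function name summarize is hypothetical.

```python
def summarize(total, concurrency, baseline_total, seconds=60):
    """Derive the per-run figures from a raw call count.

    Integer division reproduces the figures quoted above (31.8% and
    18.3%); that truncation is inferred from the numbers, not documented.
    """
    per_sec = total // seconds
    per_worker = per_sec // concurrency
    # Percentage of the concurrency-1 total, truncated to one decimal place.
    percent = (total * 1000 // baseline_total) / 10
    return per_sec, per_worker, percent
```

For example, the realpath run at concurrency 4 would be summarize(448693, 4, 1409069), with 1409069 being the concurrency 1 total.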
I then reran the 32-process test on 1-7 cores, and found that
performance peaks at 2 cores and drops sharply from there. Eight
cores run *fifteen* times slower than two cores.
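
The "fifteen times" figure can be checked against the Total/Sec numbers in the results. The sketch below assumes that the eight-core value is the Concurrency 32 realpath result from the full-machine run; the names REALPATH_PER_SEC and slowdown_vs_peak are hypothetical.

```python
# Total/Sec for the 32-process realpath test, keyed by number of cores.
# The 8-core figure is assumed to be the Concurrency 32 result from the
# full-machine run; the 1-7 core figures come from the per-core runs.
REALPATH_PER_SEC = {1: 21492, 2: 29227, 3: 19964, 4: 10521,
                    5: 3796, 6: 2559, 7: 2266, 8: 1949}


def slowdown_vs_peak(per_sec):
    """Return (peak core count, {cores: peak throughput / throughput})."""
    peak = max(per_sec, key=per_sec.get)
    return peak, {cores: per_sec[peak] / rate for cores, rate in per_sec.items()}
```

Under that assumption, the peak is at two cores and the ratio for eight cores is 29227 / 1949, which rounds to 15.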
The full test results are at the bottom of this message.
This is on 6.3-RELEASE-p4 with vfs.lookup_shared=1.
I believe this is the same issue that was previously discussed as "2 x
quad-core system is slower that 2 x dual core on FreeBSD" archived here:
http://lists.freebsd.org/pipermail/freebsd-stable/2007-November/038441.html
In that post, Kris Kennaway wrote:
> It is hard to say for certain without a direct profile comparison of the
> workload, but it is probably due to lockmgr contention. lockmgr is used
> for various locking operations to do with VFS data structures. It is
> known to have poor performance and scale very badly.
At this point, what I've got is one of those synthetic benchmarks, but
it matches our production problems exactly, with one most unfortunate
difference: the production processes need a whole lot more RAM, so when
this manifests they eventually back up and the server death-spirals
through swap.
I've chased my way up the kernel source to kern_lstat(), where a
shared lock is obtained, and then onto namei, where vfs.lookup_shared
comes into play. But unfortunately, I don't understand lockmgr, I
don't know how the macros and flags I see here relate to it, I can't
figure out what happened to the changes that Attilio Rao was working
on, and there didn't seem to be much other hope at the time.
This is becoming a huge problem for us. Is there anything at all that
can be done, or any news? In the case linked above, the improvement came
from changing a PHP setting that isn't applicable in our case.
Thanks,
Jeff

Results by concurrency (percentages are relative to the concurrency 1 total):

Concurrency  Test        Total  Percent  Total/Sec  Total/Sec/Worker
          1  realpath  1409069     100%      23484             23484
          1  lstat     6828763     100%     113812            113812
          2  realpath  1450489     100%      24174             12087
          2  lstat     6891417   100.9%     114856             57428
          4  realpath   448693    31.8%       7478              1869
          4  lstat     3047933    44.6%      50798             12699
          8  realpath   258281    18.3%       4304               538
          8  lstat     1688728    24.7%      28145              3518
         16  realpath   179150    12.7%       2985               186
         16  lstat      966558    14.1%      16109              1006
         32  realpath   116982     8.3%       1949                60
         32  lstat      644703     9.4%      10745               335
         64  realpath   112050     7.9%       1867                29
         64  lstat      572798     8.3%       9546               149
        128  realpath   111544     7.9%       1859                14
        128  lstat      570800     8.3%       9513                74
        256  realpath    96461     6.8%       1607                 6
        256  lstat      580679     8.5%       9677                37
        512  realpath    91224     6.4%       1520                 2
        512  lstat      498342     7.2%       8305                16

realpath at concurrency 32, by number of cores:

Cores    Total  Total/Sec  Total/Sec/Worker
    1  1289527      21492               671
    2  1753625      29227               913
    3  1197896      19964               623
    4   631293      10521               328
    5   227814       3796               118
    6   153550       2559                79
    7   136013       2266                70