On Thu, Jul 2, 2009 at 11:39 PM, James Lever <j...@jamver.id.au> wrote:
> Hi All,
>
> We have recently acquired hardware for a new fileserver, and my task, if I
> want to use OpenSolaris (osol or sxce) on it, is for it to perform at least
> as well as Linux (and our 5-year-old fileserver) in our environment.
>
> Our current file server is a whitebox Debian server with 8x 10,000 RPM SCSI
> drives behind an LSI MegaRAID controller with a BBU. The filesystem in use
> is XFS.
>
> The raw performance tests that I have to use to compare them are as follows:
>
> * Create 100,000 0-byte files over NFS
> * Delete 100,000 0-byte files over NFS
> * Repeat the previous 2 tasks with 1k files
> * Untar a copy of our product with object files (quite a nasty test)
> * Rebuild the product ("make -j")
> * Delete the build directory
>
> The reason for the 100k-files tests is that they have proven to be a
> significant indicator of desktop performance on our developers' systems.
>
> Within the budget we had, we purchased the following system to meet our
> goals - if the OpenSolaris tests do not meet our requirements, it is certain
> that the equivalent tests under Linux will. I'm the only person here who
> wants OpenSolaris specifically, so it is in my interest to get it working at
> least on par with, if not better than, our current system. So here I am,
> begging for further help.
>
> Dell R710
> 2x 2.40 GHz Xeon 5330 CPU
> 16GB RAM (4x 4GB)
>
> mpt0 SAS 6/i (LSI 1068E)
> 2x 1TB SATA-II drives (rpool)
> 2x 50GB Enterprise SSD (slog) - Samsung MCCOE50G5MPQ-0VAD3
>
> mpt1 SAS 5/E (LSI 1068E)
> Dell MD1000 15-bay external storage chassis with 2 heads
> 10x 450GB Seagate Cheetah 15,000 RPM SAS
>
> We also have a PERC 6/E w/512MB BBWC to test with, or to fall back to if we
> go with a Linux solution.
>
> I have installed OpenSolaris 2009.06, updated to b117, and used mdb to
> modify the kernel to work around a current bug in b117 with the newer Dell
> systems.
> http://bugs.opensolaris.org/bugdatabase/view_bug.do%3Bjsessionid=76a34f41df5bbbfc2578934eeff8?bug_id=6850943
>
> Keep in mind that for these tests the external MD1000 chassis is connected
> with a single 4-lane SAS cable, which should give 12Gbps, or 1.2GB/s, of
> throughput.
>
> Individually, each disk exhibits about 170MB/s raw write performance, e.g.:
>
> jam...@scalzi:~$ pfexec dd if=/dev/zero of=/dev/rdsk/c8t5d0 bs=65536 count=32768
> 2147483648 bytes (2.1 GB) copied, 12.4934 s, 172 MB/s
>
> A single-spindle zpool seems to perform OK.
>
> jam...@scalzi:~$ pfexec zpool create single c8t20d0
> jam...@scalzi:~$ pfexec dd if=/dev/zero of=/single/foo bs=65536 count=327680
> 21474836480 bytes (21 GB) copied, 127.201 s, 169 MB/s
>
> The RAID10 tests seem quite slow - about half the speed I would have
> expected (170*5 = 850, so I would have expected to see around 800MB/s):
>
> jam...@scalzi:~$ pfexec zpool create fastdata mirror c8t9d0 c8t10d0 mirror c8t11d0 c8t15d0 mirror c8t16d0 c8t17d0 mirror c8t18d0 c8t19d0 mirror c8t20d0 c8t21d0
>
> jam...@scalzi:~$ pfexec dd if=/dev/zero of=/fastdata/foo bs=131072 count=163840
> 21474836480 bytes (21 GB) copied, 50.3066 s, 427 MB/s
>
> A 5-disk stripe performed as expected:
>
> jam...@scalzi:~$ pfexec zpool create fastdata c8t10d0 c8t15d0 c8t17d0 c8t19d0 c8t21d0
>
> jam...@scalzi:~$ pfexec dd if=/dev/zero of=/fastdata/foo bs=131072 count=163840
> 21474836480 bytes (21 GB) copied, 27.7972 s, 773 MB/s
>
> but a 10-disk stripe did not improve significantly on that:
>
> jam...@scalzi:~$ pfexec zpool create fastdata c8t10d0 c8t15d0 c8t17d0 c8t19d0 c8t21d0 c8t20d0 c8t18d0 c8t16d0 c8t11d0 c8t9d0
>
> jam...@scalzi:~$ pfexec dd if=/dev/zero of=/fastdata/foo bs=131072 count=163840
> 21474836480 bytes (21 GB) copied, 26.1189 s, 822 MB/s
>
> The best sequential write result I could get with redundancy was a pool of
> two 5-disk RAIDZs striped:
>
> jam...@scalzi:~$ pfexec zpool create fastdata raidz c8t10d0 c8t15d0 c8t16d0 c8t11d0 c8t9d0 raidz c8t17d0 c8t19d0 c8t21d0 c8t20d0 c8t18d0
>
> jam...@scalzi:~$ pfexec dd if=/dev/zero of=/fastdata/foo bs=131072 count=163840
> 21474836480 bytes (21 GB) copied, 31.3934 s, 684 MB/s
>
> Moving on to NFS and the create-100,000-0-byte-files test (aka the metadata
> and NFS sync test): without a slog, the test looked likely to take about
> half an hour, as I worked out when I killed it. Painfully slow. So I added
> one of the SSDs to the system as a slog, which improved matters. The client
> is a Red Hat Enterprise Linux server on modern hardware and has been used
> for all tests against our old fileserver.
>
> The time to beat, RHEL5 client to Debian4+XFS server:
>
> bash-3.2# time tar xf zeroes.tar
>
> real    2m41.979s
> user    0m0.420s
> sys     0m5.255s
>
> And on the currently configured system:
>
> jam...@scalzi:~$ pfexec zpool create fastdata mirror c8t9d0 c8t10d0 mirror c8t11d0 c8t15d0 mirror c8t16d0 c8t17d0 mirror c8t18d0 c8t19d0 mirror c8t20d0 c8t21d0 log c7t2d0
>
> jam...@scalzi:~$ pfexec zfs set sharenfs='rw,ro...@10.1.0/23' fastdata
>
> bash-3.2# time tar xf zeroes.tar
>
> real    8m7.176s
> user    0m0.438s
> sys     0m5.754s
>
> While this was running, I was watching the output of "zpool iostat fastdata
> 10" to see how it was going and was surprised by the seemingly low IOPS.
>
> jam...@scalzi:~$ zpool iostat fastdata 10
>                capacity     operations    bandwidth
> pool         used  avail   read  write   read  write
> ----------  -----  -----  -----  -----  -----  -----
> fastdata    10.0G  2.02T      0    312    268  3.89M
> fastdata    10.0G  2.02T      0    818      0  3.20M
> fastdata    10.0G  2.02T      0    811      0  3.17M
> fastdata    10.0G  2.02T      0    860      0  3.27M
>
> Strangely, when I added a second SSD as a second slog, it made no difference
> to the write operations.
>
> I'm not sure where to go from here; these results are appalling (about 3x
> the time of the old system with 8x 10kRPM spindles) even with two Enterprise
> SSDs as separate log devices.
>
> cheers,
> James
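The zpool iostat output above only shows pool-wide totals; adding -v breaks
the numbers out per vdev, including the log device, so you can see whether
those ~800 write ops/s are actually landing on the SSD or on the data
mirrors. A quick check worth running while the tar test is going (just a
sketch, using the device names from your mail, with c7t2d0 as the slog):

# confirm the log vdev is attached and healthy
zpool status fastdata

# watch per-device activity during the tar test; if the ZIL is using the
# slog, the sync writes should show up against c7t2d0 (the log device)
# rather than against the data mirrors
zpool iostat -v fastdata 10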
Are you sure the slog is working right? Try disabling the ZIL to see if that
helps with your NFS performance. If your performance increases a hundredfold,
I suspect the slog isn't performing well, or isn't doing its job at all.
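On 2009.06/b117 I believe the knob is still the global zil_disable tunable
(there is no per-dataset way to do it on that build), so something along
these lines should work for a quick test. A sketch only - and not something
to leave enabled in production, since it throws away the synchronous
semantics your NFS clients depend on:

# flip the ZIL off on the running kernel (affects every dataset on the box)
echo zil_disable/W0t1 | pfexec mdb -kw

# I believe the flag is only consulted when a filesystem's ZIL is opened,
# so remount the share (or export/import the pool) before re-running the
# tar test from the client

# turn the ZIL back on afterwards
echo zil_disable/W0t0 | pfexec mdb -kw

If the tar drops from ~8 minutes to something close to your local numbers,
the bottleneck is the slog (or how it handles cache flushes); if it barely
moves, the problem is somewhere else entirely.

--
Brent Jones
br...@servuhome.net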