On Thu, Jul 2, 2009 at 11:39 PM, James Lever <j...@jamver.id.au> wrote:
> Hi All,
>
> We have recently acquired hardware for a new fileserver, and my task, if I
> want to use OpenSolaris (osol or sxce) on it, is for it to perform at least
> as well as Linux (and our 5-year-old fileserver) in our environment.
>
> Our current file server is a whitebox Debian server with 8x 10,000 RPM SCSI
> drives behind an LSI MegaRAID controller with a BBU. The filesystem in use
> is XFS.
>
> The raw performance tests that I have to use to compare them are as follows:
>
> * Create 100,000 0-byte files over NFS
> * Delete 100,000 0-byte files over NFS
> * Repeat the previous 2 tasks with 1k files
> * Untar a copy of our product with object files (quite a nasty test)
> * Rebuild the product ("make -j")
> * Delete the build directory
>
> The reason for the 100k-files tests is that they have proven to be a
> significant indicator of desktop performance on our developers' systems.
>
> Within the budget we had, we purchased the following system to meet our
> goals - if the OpenSolaris tests do not meet our requirements, it is certain
> that the equivalent tests under Linux will. I'm the only person here who
> wants OpenSolaris specifically, so it is in my interest to get it working at
> least on par with, if not better than, our current system. So here I am,
> begging for further help.
>
> Dell R710
> 2x 2.40 GHz Xeon 5330 CPU
> 16GB RAM (4x 4GB)
>
> mpt0 SAS 6/i (LSI 1068E)
> 2x 1TB SATA-II drives (rpool)
> 2x 50GB Enterprise SSD (slog) - Samsung MCCOE50G5MPQ-0VAD3
>
> mpt1 SAS 5/E (LSI 1068E)
> Dell MD1000 15-bay external storage chassis with 2 heads
> 10x 450GB Seagate Cheetah 15,000 RPM SAS
>
> We also have a PERC 6/E w/512MB BBWC to test with, or to fall back to if we
> go with a Linux solution.
>
> I have installed OpenSolaris 2009.06, updated to b117, and used mdb to
> modify the kernel to work around a current bug in b117 with the newer Dell
> systems.
> http://bugs.opensolaris.org/bugdatabase/view_bug.do%3Bjsessionid=76a34f41df5bbbfc2578934eeff8?bug_id=6850943
>
> Keep in mind that for these tests the external MD1000 chassis is connected
> with a single 4-lane SAS cable, which should give 12Gbps, or 1.2GB/s, of
> throughput.
>
> Individually, each disk exhibits about 170MB/s raw write performance, e.g.:
>
> jam...@scalzi:~$ pfexec dd if=/dev/zero of=/dev/rdsk/c8t5d0 bs=65536 count=32768
> 2147483648 bytes (2.1 GB) copied, 12.4934 s, 172 MB/s
>
> A single-spindle zpool seems to perform OK.
>
> jam...@scalzi:~$ pfexec zpool create single c8t20d0
> jam...@scalzi:~$ pfexec dd if=/dev/zero of=/single/foo bs=65536 count=327680
> 21474836480 bytes (21 GB) copied, 127.201 s, 169 MB/s
>
> The RAID10 tests seem quite slow - about half the speed I would have
> expected (170*5 = 850, so I would have expected to see around 800MB/s):
>
> jam...@scalzi:~$ pfexec zpool create fastdata mirror c8t9d0 c8t10d0 mirror c8t11d0 c8t15d0 mirror c8t16d0 c8t17d0 mirror c8t18d0 c8t19d0 mirror c8t20d0 c8t21d0
>
> jam...@scalzi:~$ pfexec dd if=/dev/zero of=/fastdata/foo bs=131072 count=163840
> 21474836480 bytes (21 GB) copied, 50.3066 s, 427 MB/s
>
> A 5-disk stripe performed as expected:
>
> jam...@scalzi:~$ pfexec zpool create fastdata c8t10d0 c8t15d0 c8t17d0 c8t19d0 c8t21d0
>
> jam...@scalzi:~$ pfexec dd if=/dev/zero of=/fastdata/foo bs=131072 count=163840
> 21474836480 bytes (21 GB) copied, 27.7972 s, 773 MB/s
>
> but a 10-disk stripe did not improve significantly on that:
>
> jam...@scalzi:~$ pfexec zpool create fastdata c8t10d0 c8t15d0 c8t17d0 c8t19d0 c8t21d0 c8t20d0 c8t18d0 c8t16d0 c8t11d0 c8t9d0
>
> jam...@scalzi:~$ pfexec dd if=/dev/zero of=/fastdata/foo bs=131072 count=163840
> 21474836480 bytes (21 GB) copied, 26.1189 s, 822 MB/s
>
> The best sequential write result I could get with redundancy was a pool of
> two 5-disk RAIDZs striped:
>
> jam...@scalzi:~$ pfexec zpool create fastdata raidz c8t10d0 c8t15d0 c8t16d0 c8t11d0 c8t9d0 raidz c8t17d0 c8t19d0 c8t21d0 c8t20d0 c8t18d0
>
> jam...@scalzi:~$ pfexec dd if=/dev/zero of=/fastdata/foo bs=131072 count=163840
> 21474836480 bytes (21 GB) copied, 31.3934 s, 684 MB/s
>
> Moving on to NFS and the create-100,000-0-byte-files test (aka the metadata
> and NFS sync test): without a slog, the test looked likely to take about
> half an hour, as I worked out when I killed it. Painfully slow. So I added
> one of the SSDs to the system as a slog, which improved matters. The client
> is a Red Hat Enterprise Linux server on modern hardware and has been used
> for all tests against our old fileserver.
>
> The time to beat, RHEL5 client to Debian4+XFS server:
>
> bash-3.2# time tar xf zeroes.tar
>
> real    2m41.979s
> user    0m0.420s
> sys     0m5.255s
>
> And on the currently configured system:
>
> jam...@scalzi:~$ pfexec zpool create fastdata mirror c8t9d0 c8t10d0 mirror c8t11d0 c8t15d0 mirror c8t16d0 c8t17d0 mirror c8t18d0 c8t19d0 mirror c8t20d0 c8t21d0 log c7t2d0
>
> jam...@scalzi:~$ pfexec zfs set sharenfs='rw,ro...@10.1.0/23' fastdata
>
> bash-3.2# time tar xf zeroes.tar
>
> real    8m7.176s
> user    0m0.438s
> sys     0m5.754s
>
> While this was running, I was watching the output of "zpool iostat fastdata
> 10" to see how it was going and was surprised by the seemingly low IOPS.
>
> jam...@scalzi:~$ zpool iostat fastdata 10
>                capacity     operations    bandwidth
> pool         used  avail   read  write   read  write
> ----------  -----  -----  -----  -----  -----  -----
> fastdata    10.0G  2.02T      0    312    268  3.89M
> fastdata    10.0G  2.02T      0    818      0  3.20M
> fastdata    10.0G  2.02T      0    811      0  3.17M
> fastdata    10.0G  2.02T      0    860      0  3.27M
>
> Strangely, when I added a second SSD as a second slog, it made no difference
> to the write operations.
>
> I'm not sure where to go from here; these results are appalling (about 3x
> the time of the old system with 8x 10kRPM spindles) even with two Enterprise
> SSDs as separate log devices.
>
> cheers,
> James
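The zpool iostat output above only shows pool-wide totals; adding -v breaks
the numbers out per vdev, including the log device, so you can see whether
those ~800 write ops/s are actually landing on the SSD or on the data
mirrors. A quick check worth running while the tar test is going (just a
sketch, using the device names from your mail, with c7t2d0 as the slog):

# confirm the log vdev is attached and healthy
zpool status fastdata

# watch per-device activity during the tar test; if the ZIL is using the
# slog, the sync writes should show up against c7t2d0 (the log device)
# rather than against the data mirrors
zpool iostat -v fastdata 10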
Are you sure the slog is working right? Try disabling the ZIL to see if that
helps with your NFS performance. If your performance increases a hundredfold,
I suspect the slog isn't performing well, or isn't doing its job at all.
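On 2009.06/b117 I believe the knob is still the global zil_disable tunable
(there is no per-dataset way to do it on that build), so something along
these lines should work for a quick test. A sketch only - and not something
to leave enabled in production, since it throws away the synchronous
semantics your NFS clients depend on:

# flip the ZIL off on the running kernel (affects every dataset on the box)
echo zil_disable/W0t1 | pfexec mdb -kw

# I believe the flag is only consulted when a filesystem's ZIL is opened,
# so remount the share (or export/import the pool) before re-running the
# tar test from the client

# turn the ZIL back on afterwards
echo zil_disable/W0t0 | pfexec mdb -kw

If the tar drops from ~8 minutes to something close to your local numbers,
the bottleneck is the slog (or how it handles cache flushes); if it barely
moves, the problem is somewhere else entirely.

--
Brent Jones
br...@servuhome.net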