Hi,
James Lever wrote:
Hi All,
We have recently acquired hardware for a new fileserver, and my task, if
I want to use OpenSolaris (osol or SXCE) on it, is to make it perform at
least as well as Linux (and our 5-year-old fileserver) in our
environment.
Our current file server is a whitebox Debian server with 8x 10,000 RPM
SCSI drives behind an LSI MegaRAID controller with a BBU. The
filesystem in use is XFS.
The raw performance tests that I have to use to compare them are as
follows:
* Create 100,000 zero-byte files over NFS
* Delete 100,000 zero-byte files over NFS
* Repeat the previous two tests with 1 KB files
* Untar a copy of our product with object files (quite a nasty test)
* Rebuild the product with "make -j"
* Delete the build directory
The reason for the 100k-file tests is that they have proven to be a
significant indicator of perceived performance on our developers'
desktop systems.
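For reference, the test data and the timed runs can be reproduced with
something along these lines (paths and directory layout here are
illustrative, not our actual tree):

# Build a tarball of 100,000 empty files (layout is illustrative):
mkdir zeroes && cd zeroes
for i in $(seq 1 100000); do touch f$i; done
cd .. && tar cf zeroes.tar zeroes

# On the NFS client, the create and delete phases are simply timed:
time tar xf zeroes.tar
time rm -rf zeroes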
Within the budget we had, we purchased the following system to meet our
goals. If the OpenSolaris tests do not meet our requirements, it is
certain that the equivalent tests under Linux will. I'm the only person
here who specifically wants OpenSolaris, so it is in my interest to get
it working at least on par with, if not better than, our current system.
So here I am, begging for further help.
Dell R710
2x 2.40 GHz Xeon 5330 CPUs
16GB RAM (4x 4GB)
mpt0 SAS 6/i (LSI 1068E)
2x 1TB SATA-II drives (rpool)
2x 50GB Enterprise SSD (slog) - Samsung MCCOE50G5MPQ-0VAD3
mpt1 SAS 5/E (LSI 1068E)
Dell MD1000 15-bay External storage chassis with 2 heads
10x 450GB Seagate Cheetah 15,000 RPM SAS
We also have a PERC 6/E w/512MB BBWC to test with or fall back to if
we go with a Linux solution.
I have installed OpenSolaris 2009.06, updated to b117, and used mdb to
modify the kernel to work around a current b117 bug affecting the newer
Dell systems:
http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6850943
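(The workaround itself is just a kernel variable poked with mdb; in
general shape it looks like the following, though the variable name here
is a placeholder - the real one is in the bug report above.)

# General shape of the workaround (placeholder variable name; writes decimal 1):
echo "some_tunable/W 0t1" | pfexec mdb -kw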
Keep in mind that for these tests the external MD1000 chassis is
connected with a single 4-lane SAS cable, which should give 4 x 3 Gbps =
12 Gbps raw, or roughly 1.2 GB/s of usable throughput.
Individually, each disk exhibits about 170MB/s raw write performance.
e.g.
jam...@scalzi:~$ pfexec dd if=/dev/zero of=/dev/rdsk/c8t5d0 bs=65536
count=32768
2147483648 bytes (2.1 GB) copied, 12.4934 s, 172 MB/s
A single spindle zpool seems to perform OK.
jam...@scalzi:~$ pfexec zpool create single c8t20d0
jam...@scalzi:~$ pfexec dd if=/dev/zero of=/single/foo bs=65536
count=327680
21474836480 bytes (21 GB) copied, 127.201 s, 169 MB/s
The RAID10 test was quite slow, about half the speed I would have
expected: 5 mirror vdevs at ~170 MB/s each should give around 170*5 =
850 MB/s, so I was hoping to see roughly 800 MB/s.
jam...@scalzi:~$ pfexec zpool create fastdata mirror c8t9d0 c8t10d0
mirror c8t11d0 c8t15d0 mirror c8t16d0 c8t17d0 mirror c8t18d0 c8t19d0
mirror c8t20d0 c8t21d0
jam...@scalzi:~$ pfexec dd if=/dev/zero of=/fastdata/foo bs=131072
count=163840
21474836480 bytes (21 GB) copied, 50.3066 s, 427 MB/s
A 5-disk stripe performed as expected:
jam...@scalzi:~$ pfexec zpool create fastdata c8t10d0 c8t15d0 c8t17d0
c8t19d0 c8t21d0
jam...@scalzi:~$ pfexec dd if=/dev/zero of=/fastdata/foo bs=131072
count=163840
21474836480 bytes (21 GB) copied, 27.7972 s, 773 MB/s
but a 10-disk stripe did not improve throughput significantly:
jam...@scalzi:~$ pfexec zpool create fastdata c8t10d0 c8t15d0 c8t17d0
c8t19d0 c8t21d0 c8t20d0 c8t18d0 c8t16d0 c8t11d0 c8t9d0
jam...@scalzi:~$ pfexec dd if=/dev/zero of=/fastdata/foo bs=131072
count=163840
21474836480 bytes (21 GB) copied, 26.1189 s, 822 MB/s
The best sequential write figure I could get with redundancy came from a
pool of two 5-disk RAIDZ vdevs striped together:
jam...@scalzi:~$ pfexec zpool create fastdata raidz c8t10d0 c8t15d0
c8t16d0 c8t11d0 c8t9d0 raidz c8t17d0 c8t19d0 c8t21d0 c8t20d0 c8t18d0
jam...@scalzi:~$ pfexec dd if=/dev/zero of=/fastdata/foo bs=131072
count=163840
21474836480 bytes (21 GB) copied, 31.3934 s, 684 MB/s
Moving on to NFS testing and the 100,000 zero-byte file create (i.e. the
metadata and NFS sync test): without a slog the run looked like it would
take about half an hour, as I worked out before killing it. Painfully
slow. So I added one of the SSDs to the pool as a slog, which improved
matters. The client is a Red Hat Enterprise Linux server on modern
hardware and has been used for all tests against our old fileserver.
The time to beat: RHEL5 client to Debian4+XFS server:
bash-3.2# time tar xf zeroes.tar
real 2m41.979s
user 0m0.420s
sys 0m5.255s
And on the currently configured system:
jam...@scalzi:~$ pfexec zpool create fastdata mirror c8t9d0 c8t10d0
mirror c8t11d0 c8t15d0 mirror c8t16d0 c8t17d0 mirror c8t18d0 c8t19d0
mirror c8t20d0 c8t21d0 log c7t2d0
jam...@scalzi:~$ pfexec zfs set sharenfs='rw,ro...@10.1.0/23' fastdata
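On the RHEL5 client the share is mounted with something like the
following (the exact options are from memory and may not match the
client's fstab):

# Approximate client-side mount; options are an assumption:
mount -t nfs -o rw,hard,intr,vers=3 scalzi:/fastdata /mnt/fastdata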
bash-3.2# time tar xf zeroes.tar
real 8m7.176s
user 0m0.438s
sys 0m5.754s
While this was running I was watching the output of "zpool iostat
fastdata 10" to see how it was going, and was surprised by the seemingly
low IOPS.
Have you tried running this locally on your OpenSolaris box, just to get
an idea of what it can deliver in terms of speed? Which NFS version are
you using?
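Something along these lines would show both, I think (the tarball path
is obviously yours to fill in):

# On the OpenSolaris box itself - same tarball, no NFS in the path:
cd /fastdata && time tar xf /path/to/zeroes.tar

# On the RHEL client - shows the negotiated NFS version and mount options:
nfsstat -m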
jam...@scalzi:~$ zpool iostat fastdata 10
              capacity     operations    bandwidth
pool        used  avail   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
fastdata   10.0G  2.02T      0    312    268  3.89M
fastdata   10.0G  2.02T      0    818      0  3.20M
fastdata   10.0G  2.02T      0    811      0  3.17M
fastdata   10.0G  2.02T      0    860      0  3.27M
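(zpool iostat's -v flag gives a per-vdev breakdown, including the log
device, which should show whether the slog is actually absorbing these
writes:)

zpool iostat -v fastdata 10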
Strangely, when I added a second SSD as a second slog, it made no
difference to the write operations.
I'm not sure where to go from here. These results are appalling (about
3x the time of the old system with its 8x 10k RPM spindles), even with
two enterprise SSDs as separate log devices.
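(One more diagnostic I can think of is temporarily disabling the ZIL,
purely to confirm whether synchronous write latency is the whole story.
Definitely not something for production, and I have not verified this
tunable on b117 myself:)

# /etc/system entry (requires a reboot; diagnostic only):
set zfs:zil_disable = 1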
cheers,
James
--
Med venlig hilsen / Best Regards
Henrik Johansen
hen...@scannet.dk
Tlf. 75 53 35 00
ScanNet Group
A/S ScanNet
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss