On 08/09/2009, at 2:01 AM, Ross Walker wrote:
On Sep 7, 2009, at 1:32 AM, James Lever <j...@jamver.id.au> wrote:
Well, an MD1000 holds 15 drives, so a good compromise might be two 7-drive
RAIDZ2s with a hot spare... Since each RAIDZ vdev does random writes at
roughly the speed of a single disk, that should provide 320 IOPS instead of
160, a big difference.
The issue is interactive responsiveness and whether there is a way to
tune the system to give that while still having good performance
for builds when they are run.
Look at the write IOPS of the pool with 'zpool iostat -v' and see
how many are happening on the RAIDZ2 vdev.
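For example, something like the following run while the load is on (the
pool name is just a placeholder):

    # per-vdev read/write operations and bandwidth, 5 second intervals
    zpool iostat -v tank 5

The 'operations' columns show how many IOPS land on the raidz2 vdev
versus the log device.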
I was suggesting that slog writes were possibly starving reads from
the L2ARC as they were on the same device. This appears not to
have been the issue, as the problem has persisted even with the
L2ARC devices removed from the pool.
The SSD will handle a lot more IOPS than the pool, and the L2ARC is a
lazy reader; it mostly just holds on to read cache data.
It may just be that the pool configuration can't handle the
write IOPS needed and reads are starving.
Possible, but hard to tell. Have a look at the iostat results I’ve
posted.
The busy times of the disks while the issue is occurring should let
you know.
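For reference, the per-disk busy percentages can be watched with something
like:

    # extended device statistics, 5 second intervals; %b is the busy column
    iostat -xn 5

If the raidz2 member disks sit near 100 %b while the slog is mostly idle
(or vice versa), that shows where the bottleneck is.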
So it turns out that the problem is that all writes coming via NFS are
going through the slog. When that happens, the transfer speed to the
device drops to ~70MB/s (the write speed of this SLC SSD), and until the
load drops all new write requests are blocked, causing a noticeable
delay (which has been observed to be up to 20s, but is generally only
2-4s).
I can reproduce this behaviour by copying a large file (hundreds of MB
in size) using 'cp src dst' on an NFS (still currently v3) client and
observing that all of the data is pushed through the slog device (a 10GB
partition of a Samsung 50GB SSD behind a PERC 6/i w/256MB BBC) rather
than going direct to the primary storage disks.
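Roughly, the reproduction looks like this (the paths and pool name are
just placeholders):

    # on an NFS client
    cp /tmp/largefile /net/server/export/builds/

    # on the server, in another shell, watch where the writes land
    zpool iostat -v tank 1

While the copy runs, essentially all of the write operations show up
against the log device rather than the raidz2 vdev.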
On a related note, I had 2 of these devices (both using just 10GB
partitions) connected as log devices (so the pool had 2 separate log
devices) and the second one was consistently running significantly
slower than the first. Removing the second device improved performance,
but did not remove the occasional observed pauses.
I was of the (mis)understanding that only metadata and writes smaller
than 64k went via the slog device in the event of an O_SYNC write
request?
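For reference, I believe the threshold I had in mind is the
zfs_immediate_write_sz tunable, though I'm not sure it is still honoured
once a dedicated log device is attached. On a 64-bit kernel it can at
least be inspected with mdb:

    # current threshold in bytes (8-byte unsigned decimal)
    echo zfs_immediate_write_sz/E | mdb -k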
The clients are (mostly) RHEL5.
Is there a way to tune this on the NFS server or clients such that
when I perform a large synchronous write, the data does not go via the
slog device?
I have investigated using the logbias setting, but that would also kill
small-file performance on any filesystem using it and defeat the
purpose of having a slog device at all.
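For reference, logbias is set per filesystem (the dataset name below is
just a placeholder):

    # send large synchronous writes for this dataset straight to the pool disks
    zfs set logbias=throughput tank/builds

Because it applies to the whole dataset, the small synchronous writes on
that filesystem would lose the benefit of the slog as well.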
cheers,
James
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss