comment below...
On Sep 23, 2009, at 10:00 PM, James Lever wrote:
On 08/09/2009, at 2:01 AM, Ross Walker wrote:
On Sep 7, 2009, at 1:32 AM, James Lever <j...@jamver.id.au> wrote:
Well, an MD1000 holds 15 drives, so a good compromise might be two
7-drive RAIDZ2s with a hot spare... That should provide 320 IOPS
instead of 160, a big difference.
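(For reference, a layout along those lines might be created with
something like the following; pool and device names are hypothetical:

  zpool create tank \
      raidz2 c1t0d0 c1t1d0 c1t2d0 c1t3d0 c1t4d0 c1t5d0 c1t6d0 \
      raidz2 c1t7d0 c1t8d0 c1t9d0 c1t10d0 c1t11d0 c1t12d0 c1t13d0 \
      spare c1t14d0

i.e. two 7-disk RAIDZ2 top-level vdevs plus one hot spare.)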
The issue is interactive responsiveness, and whether there is a way
to tune the system to provide that while still giving good
performance for builds when they are run.
Look at the write IOPS of the pool with zpool iostat -v and
see how many are happening on the RAIDZ2 vdev.
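(For example, something along these lines, assuming the pool is
named 'tank':

  zpool iostat -v tank 5

which reports per-vdev read/write operations every 5 seconds; the
write ops column for the raidz2 vdev is the figure of interest.)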
I was suggesting that slog writes were possibly starving reads from
the l2arc as they were on the same device. This appears not to
have been the issue as the problem has persisted even with the
l2arc devices removed from the pool.
The SSD will handle a lot more IOPS than the pool and L2ARC is a
lazy reader, it mostly just holds on to read cache data.
It may just be that the pool configuration can't handle the
write IOPS needed and reads are starving.
Possible, but hard to tell. Have a look at the iostat results
I’ve posted.
The busy times of the disks while the issue is occurring should let
you know.
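(On the server, something like:

  iostat -xn 5

run while the pauses are occurring should show the %b (busy) column
for both the pool disks and the slog device.)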
So it turns out that the problem is that all writes coming via NFS
are going through the slog. When that happens, the transfer speed
to the device drops to ~70MB/s (the write speed of this SLC SSD) and,
until the load drops, all new write requests are blocked, causing a
noticeable delay (observed to be up to 20s, but generally only 2-4s).
Thank you sir, can I have another?
If you add (not attach) more slogs, the workload will be spread across
them. But...
I can reproduce this behaviour by copying a large file (hundreds of
MB in size) using 'cp src dst' on an NFS (still currently v3) client
and observing that all data is pushed through the slog device (a 10GB
partition of a Samsung 50GB SSD behind a PERC 6/i with 256MB of
battery-backed cache) rather than going directly to the primary
storage disks.
On a related note, I had 2 of these devices (both using just 10GB
partitions) connected as log devices (so the pool had 2 separate log
devices) and the second one was consistently running significantly
slower than the first. Removing the second device improved
performance, but did not eliminate the occasional observed pauses.
...this is not surprising when you add a slow slog device. This is
the weakest-link rule.
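(A sketch of the add-vs-attach distinction and of log removal; pool
and device names are hypothetical:

  # add a second, independent slog - writes are spread across both
  zpool add tank log c3t1d0

  # attach would instead mirror the existing slog, not spread the load
  zpool attach tank c3t0d0 c3t1d0

  # remove a slog that turns out to be the weak link
  zpool remove tank c3t1d0

Note that log device removal requires a recent enough pool version.)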
I was of the (mis)understanding that only metadata and writes
smaller than 64k went via the slog device in the event of an O_SYNC
write request?
The threshold is 32 kBytes, which is unfortunately the same as the
default NFS write size. See CR 6686887:
http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6686887
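(If the threshold in question is the zfs_immediate_write_sz tunable -
an assumption on my part; the CR above discusses whether changing it
helps at all when a slog is present - it can be inspected and adjusted
on the server along these lines:

  # inspect the current value in bytes (default 32768)
  echo zfs_immediate_write_sz/D | mdb -k

  # raise it to 128 kBytes via an /etc/system entry (reboot required)
  set zfs:zfs_immediate_write_sz = 0x20000

Treat this as a sketch rather than a recommended fix.)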
If you have a slog and logbias=latency (default) then the writes go to
the slog.
So there is some interaction here that can affect NFS workloads in
particular.
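(For a dataset where the latency of small synchronous writes is not
critical, logbias can be set per filesystem so its writes favour the
pool rather than the slog; dataset name hypothetical:

  zfs set logbias=throughput tank/builds
  zfs get logbias tank/builds

The default, logbias=latency, is what sends these writes to the slog.)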
The clients are (mostly) RHEL5.
Is there a way to tune this on the NFS server or clients such that
when I perform a large synchronous write, the data does not go via
the slog device?
You can change the IOP size on the client.
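(On a RHEL5 client that means the rsize/wsize mount options, e.g.,
with server and export names hypothetical:

  mount -t nfs -o vers=3,rsize=65536,wsize=65536 server:/export /mnt

or the equivalent entry in /etc/fstab. Whether a larger or smaller
wsize relative to the 32 kByte threshold is the right direction
depends on the interaction described above, so treat this as
something to experiment with.)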
-- richard
I have investigated using the logbias setting, but that will also
kill small-file performance on any filesystem using it and defeat
the purpose of having a slog device at all.
cheers,
James
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss