Re: [zfs-discuss] [nfs-discuss] NFS, ZFS & ESX

Dai Ngo Tue, 07 Jul 2009 17:55:38 -0700

Without any tuning, the default TCP window size and send buffer size for NFS
connections are around 48KB, which is not optimal for bulk transfer. However,
the 1.4MB/s write seems to indicate that something else is seriously wrong.

iSCSI performance was good, so the network connection seems to be OK (assuming
it's 1GbE).
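
As a rough sanity check on the window-size point, a single TCP connection can move at most about one window per round trip. The sketch below works that arithmetic through. The 0.5 ms round-trip time is an assumption for a quiet switched LAN, not something measured on this setup, and the function names are only illustrative.

```python
def window_limited_throughput(window_bytes, rtt_seconds):
    """Upper bound in bytes/second for one TCP connection: one window per round trip."""
    return window_bytes / rtt_seconds


def rtt_implied_by(window_bytes, throughput_bytes_per_s):
    """Round-trip time that would make the window the limiting factor at this rate."""
    return window_bytes / throughput_bytes_per_s


WINDOW_BYTES = 48 * 1024      # ~48KB default window mentioned above
ASSUMED_LAN_RTT = 0.0005      # ASSUMPTION: 0.5 ms round trip on a local GbE network
OBSERVED_WRITE = 1.4e6        # 1.4MB/s NFS write reported in the original post

ceiling = window_limited_throughput(WINDOW_BYTES, ASSUMED_LAN_RTT)
implied_rtt = rtt_implied_by(WINDOW_BYTES, OBSERVED_WRITE)

print("window-limited ceiling: %.1f MB/s" % (ceiling / 1e6))
print("RTT needed for the window to explain 1.4MB/s: %.1f ms" % (implied_rtt * 1000))
```

Under that assumed RTT the ceiling is roughly 98 MB/s, and the window alone would only explain 1.4MB/s if the round trip were about 35 ms, which is not plausible on a local network. That is why the small window looks like a secondary issue here.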

What do your mount options look like?

I don't know what the datastore browser does to copy a file, but have you tried
the vanilla 'cp' command?

You can also test NFS performance using tmpfs instead of ZFS, to make sure the
NIC, protocol stack, and NFS are not the culprit.
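
One way to make that comparison repeatable is to time the same sequential write against each mount point. This is a minimal sketch, not a proper benchmark. The function name, the scratch file name, and the default sizes are all made up for illustration, and it assumes a client with Python available that has both exports mounted.

```python
import os
import time


def measure_write_throughput(directory, total_bytes=64 * 1024 * 1024,
                             block_size=1024 * 1024):
    """Write total_bytes sequentially into directory, fsync, and return MB/s."""
    path = os.path.join(directory, "nfs_write_test.tmp")  # hypothetical scratch file
    block = os.urandom(block_size)  # random data so compression does not skew the result
    written = 0
    start = time.perf_counter()
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        while written < total_bytes:
            chunk = block[:min(block_size, total_bytes - written)]
            written += os.write(fd, chunk)
        os.fsync(fd)  # force the data to the server before stopping the clock
        elapsed = time.perf_counter() - start
    finally:
        os.close(fd)
        os.remove(path)
    return written / elapsed / 1e6
```

Calling it once on a mount backed by tmpfs and once on a mount backed by the ZFS pool (for example measure_write_throughput("/mnt/nfs_tmpfs") and measure_write_throughput("/mnt/nfs_zfs"), both hypothetical paths) gives two numbers to compare. If the tmpfs-backed figure is good and the ZFS-backed one is not, the network and NFS layers are probably not the problem.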

-Dai

erik.ableson wrote:

OK - I'm at my wit's end here as I've looked everywhere to find some means of tuning NFS performance with ESX into returning something acceptable using osol 2008.11. I've eliminated everything but the NFS portion of the equation and am looking for some pointers in the right direction.
Configuration: PE2950 dual-processor Xeon, 32GB RAM with an MD1000 using a zpool of 7 mirror vdevs. ESX 3.5 and 4.0. Pretty much a vanilla install across the board, no additional software other than the Adaptec StorMan to manage the disks.
local performance via dd - 463MB/s write, 1GB/s read (8GB file)
iSCSI performance - 90MB/s write, 120MB/s read (800MB file from a VM)
NFS performance - 1.4MB/s write, 20MB/s read (800MB file from the Service Console, transfer of an 8GB file via the datastore browser)
I just found the tool latencytop which points the finger at the ZIL (tip of the hat to Lejun Zhu). Ref: <http://www.infrageeks.com/zfs/nfsd.png> & <http://www.infrageeks.com/zfs/fsflush.png>. Log file: <http://www.infrageeks.com/zfs/latencytop.log>
Now I can understand that there is a performance hit associated with this feature of ZFS for ensuring data integrity, but a difference this drastic makes no sense whatsoever. The pool is capable of handling natively (at worst) 120*7 IOPS and I'm not even seeing enough to saturate a USB thumb drive. This still doesn't answer why the read performance is so bad either. According to latencytop, the culprit would be genunix`cv_timedwait_sig rpcmod`svc
From my searching it appears that there's no async setting for the osol nfsd, and ESX does not offer any mount controls to force an async connection. Other than putting in an SSD as a ZIL (which still strikes me as overkill for basic NFS services) I'm looking for any information that can bring me up to at least reasonable throughput.
Would a dedicated 15K SAS drive help the situation by moving the ZIL traffic off to a dedicated device? Significantly? This is the sort of thing that I don't want to do without some reasonable assurance that it will help, since you can't remove a ZIL device from a pool at the moment.
Hints and tips appreciated,

Erik
_______________________________________________
nfs-discuss mailing list
nfs-disc...@opensolaris.org


_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
