Hello Chris,

Friday, May 9, 2008, 9:19:53 PM, you wrote:

CS>  I have a ZFS-based NFS server (Solaris 10 U4 on x86) where I am seeing
CS> a weird performance degradation as the number of simultaneous sequential
CS> reads increases.

CS>  Setup:
CS>         NFS client -> Solaris NFS server -> iSCSI target machine

CS>  There are 12 physical disks on the iSCSI target machine. Each of them
CS> is sliced up into 11 parts and the parts exported as individual LUNs to
CS> the Solaris server. The Solaris server uses each LUN as a separate ZFS
CS> pool (giving 132 pools in total) and exports them all to the NFS client.

CS> (The NFS client and the iSCSI target machine are both running Linux.
CS> The Solaris NFS server has 4 GB of RAM.)

CS>  When the NFS client starts a sequential read against one filesystem
CS> from each physical disk, the iSCSI target machine and the NFS client
CS> both use the full network bandwidth and each individual read gets
CS> 1/12th of it (about 9.something MBytes/sec). Starting a second set of
CS> sequential reads against each disk (to a different pool) behaves the
CS> same, as does starting a third set.

CS>  However, when I add a fourth set of reads things change; while the
CS> NFS server continues to read from the iSCSI target at full speed, the
CS> data rate to the NFS client drops significantly. By the time I hit
CS> 9 reads per physical disk, the NFS client is getting a *total* of 8
CS> MBytes/sec.  In other words, it seems that ZFS on the NFS server is
CS> somehow discarding most of what it reads from the iSCSI disks, although
CS> I can't see any sign of this in 'vmstat' output on Solaris.


Keep in mind that you will end up with a lot of seeks on the physical
drives once you do multiple sequential reads from different disk
regions.

Nevertheless, I wouldn't expect much difference in throughput between
the NFS client and the iSCSI server. I suspect you are hitting an issue
with the vdev cache: you have probably ended up with 8 KB reads over
NFS (rsize) and 64 KB reads from iSCSI. You have 4 GB of RAM and I'm
assuming most of it is free (i.e. used by the ARC)... or maybe that is
not actually the case, so the vdev cache reads 64 KB, the NFS client
reads 8 KB, and by the time it asks for the next 8 KB the rest has
already been evicted - worst case that is up to 64/8 = 8x read
amplification on the iSCSI side. Since your box "locks up" - maybe the
iSCSI software or some other application has a memory leak? Is your
system using the swap device just before it "locks up"?

Try mounting the filesystems on the NFS client with rsize=32KB and make
sure your scripts/programs are also requesting at least 32 KB at a
time. Check if it helps. If it doesn't, then disable the vdev cache on
the Solaris box (by setting zfs_vdev_cache_max to 1) and check again.
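
For example, something like this (a sketch only - the server name,
mount point and file name are made up, adjust to your setup):

  # on the Linux NFS client: remount with a 32 KB read size
  mount -t nfs -o rsize=32768,wsize=32768 nfsserver:/pool001 /mnt/pool001

  # and make sure the test reads in at least 32 KB chunks
  dd if=/mnt/pool001/testfile of=/dev/null bs=32k

If that doesn't help, disable the vdev cache on the Solaris box, either
on the live system:

  echo "zfs_vdev_cache_max/W 1" | mdb -kw

or persistently, by adding the line below to /etc/system and rebooting:

  set zfs:zfs_vdev_cache_max = 1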




CS> (It is limited testing because it is harder to accurately measure what
CS> aggregate data rate I'm getting and harder to run that many simultaneous
CS> reads, as if I run too many of them the Solaris machine locks up due to
CS> overload.)

That's strange - what exactly happens when it "locks up"? Does it
panic?


CS> smaller loads)? Would partitioning the physical disks on Solaris instead
CS> of splitting them up on the iSCSI target make a significant difference?

Why do you want to partition them in the first place? Why not present
each disk as a single iSCSI LUN, then create a pool out of those LUNs
and, if necessary, multiple file systems inside it?

Then what about data protection - don't you want to use any RAID?
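
Something along these lines, for example (just a sketch - the device
and dataset names are hypothetical, and you may prefer raidz2 or two
narrower raidz groups):

  # one pool across the 12 whole-disk LUNs, with raidz for protection
  zpool create tank raidz c2t0d0 c2t1d0 c2t2d0 c2t3d0 c2t4d0 c2t5d0 \
      c2t6d0 c2t7d0 c2t8d0 c2t9d0 c2t10d0 c2t11d0

  # as many file systems as you need, all shared over NFS
  zfs create tank/fs01
  zfs create tank/fs02
  zfs set sharenfs=on tank

One pool also lets ZFS spread and schedule I/O across all 12 spindles
instead of treating each slice as a separate device.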



-- 
Best regards,
 Robert Milkowski                            mailto:[EMAIL PROTECTED]
                                       http://milek.blogspot.com
