Hello Chris,

Friday, May 9, 2008, 9:19:53 PM, you wrote:

CS> I have a ZFS-based NFS server (Solaris 10 U4 on x86) where I am seeing
CS> a weird performance degradation as the number of simultaneous sequential
CS> reads increases.

CS> Setup:
CS>   NFS client -> Solaris NFS server -> iSCSI target machine

CS> There are 12 physical disks on the iSCSI target machine. Each of them
CS> is sliced up into 11 parts and the parts exported as individual LUNs to
CS> the Solaris server. The Solaris server uses each LUN as a separate ZFS
CS> pool (giving 132 pools in total) and exports them all to the NFS client.

CS> (The NFS client and the iSCSI target machine are both running Linux.
CS> The Solaris NFS server has 4 GB of RAM.)

CS> When the NFS client starts a sequential read against one filesystem
CS> from each physical disk, the iSCSI target machine and the NFS client
CS> both use the full network bandwidth and each individual read gets
CS> 1/12th of it (about 9.something MBytes/sec). Starting a second set of
CS> sequential reads against each disk (to a different pool) behaves the
CS> same, as does starting a third set.

CS> However, when I add a fourth set of reads things change; while the
CS> NFS server continues to read from the iSCSI target at full speed, the
CS> data rate to the NFS client drops significantly. By the time I hit
CS> 9 reads per physical disk, the NFS client is getting a *total* of 8
CS> MBytes/sec. In other words, it seems that ZFS on the NFS server is
CS> somehow discarding most of what it reads from the iSCSI disks, although
CS> I can't see any sign of this in 'vmstat' output on Solaris.

Keep in mind that once you run multiple sequential reads from different
regions of the same physical drives, you will end up with a lot of seeks.
Nevertheless, I wouldn't expect much of a difference in throughput between
the NFS client and the iSCSI server.

I'm thinking that maybe you are hitting an issue with the vdev cache, as
you have probably ended up with 8 KB reads over NFS (rsize) and 64 KB reads
from iSCSI. You have 4 GB of RAM and I'm assuming most of it is free (i.e.
used by the ARC)... or maybe that is actually not the case, so the vdev
cache reads 64 KB, the NFS client takes 8 KB of it, and by the time it asks
for the next 8 KB the cached block is already gone - which would explain
why the iSCSI side stays at full speed while the NFS client only sees a
fraction of that data. Since your box "locks up" - maybe the iSCSI target
or some other application has a memory leak? Is your system using the swap
device just before it "locks up"? (There are a couple of command sketches
for checking the memory/swap situation and the vdev cache activity in the
P.S. below.)

Try mounting the filesystems on the NFS client with rsize=32KB, and make
sure your scripts/programs are also requesting at least 32 KB at a time.
Check if it helps. If it doesn't, then disable the vdev cache on the
Solaris box (by setting zfs_vdev_cache_max to 1) and check again. (Rough
sketches for both steps are in the P.S. as well.)

CS> (It is limited testing because it is harder to accurately measure what
CS> aggregate data rate I'm getting and harder to run that many simultaneous
CS> reads, as if I run too many of them the Solaris machine locks up due to
CS> overload.)

That's strange - what exactly happens when it "locks up"? Does it panic?

CS> smaller loads)? Would partitioning the physical disks on Solaris instead
CS> of splitting them up on the iSCSI target make a significant difference?

Why do you want to partition them in the first place? Why not present each
whole disk as a single iSCSI LUN, create a pool out of those LUNs and, if
necessary, create multiple file systems inside it? And then, what about
data protection - don't you want to use any RAID? (A sketch of that kind of
layout is in the P.S. too.)

--
Best regards,
 Robert Milkowski                       mailto:[EMAIL PROTECTED]
                                        http://milek.blogspot.com
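
P.S. A few untested command sketches to make the suggestions above more
concrete; all host names, device names, and paths below are made-up
examples, so adjust them to your setup.

To see whether the Solaris box is eating into swap or running out of free
memory just before it "locks up", watch it from another terminal while the
reads are running:

  # free memory, scan rate and paging activity, once per second
  vmstat 1

  # swap reservation summary and per-device swap usage
  swap -s
  swap -l

  # rough breakdown of where physical memory has gone
  echo ::memstat | mdb -k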
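To get an idea of whether the vdev cache is reading data that never gets
reused, there is a vdev cache kstat in newer ZFS bits - I'm not sure it is
already there on S10 U4, so treat this one as optional:

  # vdev (device-level) cache hits/misses, if the kstat exists on your build
  kstat -m zfs -n vdev_cache_stats

  # and for comparing the iSCSI-side and NFS-side rates
  iostat -xnz 5
  nfsstat -s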
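For the rsize test on the Linux client, remount with a 32 KB read size and
make sure the reader really asks for at least 32 KB per read(); a dd like
this is the simplest check (server name and paths are placeholders):

  # remount one of the filesystems with 32 KB NFS reads/writes
  umount /mnt/pool001
  mount -t nfs -o rsize=32768,wsize=32768 solaris-nfs:/pool001 /mnt/pool001

  # confirm which rsize is actually in effect
  grep pool001 /proc/mounts

  # sequential read that requests 32 KB per read() call
  dd if=/mnt/pool001/bigfile of=/dev/null bs=32k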
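If that doesn't help, you can effectively disable the vdev cache on the
Solaris box by shrinking zfs_vdev_cache_max to 1 byte, so no small read
gets inflated to a 64 KB device read. Treat it as a diagnostic tunable,
not a permanent setting:

  # live and non-persistent, takes effect immediately
  echo "zfs_vdev_cache_max/W 0t1" | mdb -kw

  # persistent alternative: add this line to /etc/system and reboot
  set zfs:zfs_vdev_cache_max = 1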
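And a sketch of what I mean by presenting whole disks instead of 132
slices - the device names are invented, and whether you want raidz,
mirrors, or no redundancy at all is your call:

  # one raidz pool across the 12 whole-disk iSCSI LUNs
  zpool create tank raidz c2t1d0 c2t2d0 c2t3d0 c2t4d0 c2t5d0 c2t6d0 \
                          c2t7d0 c2t8d0 c2t9d0 c2t10d0 c2t11d0 c2t12d0

  # as many file systems inside the single pool as you need
  zfs create tank/data01
  zfs create tank/data02

  # share them over NFS straight from ZFS
  zfs set sharenfs=on tank/data01
  zfs set sharenfs=on tank/data02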