Hello. We're currently using a Sun Blade 1000 (2x 750MHz, 1GB RAM, 2x 160MB/s mpt SCSI buses, skge GigE network) as an NFS backend with ZFS for distributing free software such as Debian (cdimage.debian.org, ftp.se.debian.org), and we have run into some performance issues.
We are running SX snv_48 and have been running a single 7x300GB raidz2 for a while now; we added another 7x300GB raidz2 today, but I'll stick to the old numbers for now. We tried Solaris 10 U2 before, but NFS writes killed every bit of performance; snv_48 behaves much better in that regard. The working data set is about 1.2TB over ~550k inodes right now. The backend serves data to 2-4 Linux frontends running Apache (with local raid0 mod_disk_cache), rsync (which walks entire Debian trees every now and then) and vsftpd (not used much).

There are (at least) two kinds of performance issues we have run into.

1. DNLC-through-ZFS doesn't seem to listen to ncsize.

The filesystem currently holds ~550k inodes and large portions of it are frequently walked by rsync (over NFS). mdb said ncsize was about 68k and vmstat -s showed a hit rate of ~30%, so I set ncsize to 600k and rebooted (the exact commands are in the PS below). It didn't seem to change much: the hit rate is still about the same, and a manual find(1) doesn't seem to get cached either (according to vmstat and dnlcsnoop.d). The following messages came up at boot; I'm not sure whether they matter:

  NOTICE: setting nrnode to max value of 351642
  NOTICE: setting nrnode to max value of 235577

Is there a separate ZFS/DNLC knob to adjust for this? My wild guess is that ZFS has its own implementation, integrated with the rest of the ZFS cache, which throws out metadata cache in favour of data cache... or something.

2. Readahead (or something like it) is killing all signs of performance.

Since there can be quite a lot of requests in flight at the same time, we're having issues with readahead. Typical numbers are 7x13MB/s (~90MB/s) being read from disk according to 'iostat -xnzm 5' and 'zpool iostat -v 5', while maybe 5MB/s is being sent back over the network, i.e. roughly 20x more is read from disk than is actually used. When testing single streams the readahead helps and data isn't thrown away, but when a bazillion NFS requests come in at once, ZFS reads far more than was actually requested/delivered.

I saw that zfs_prefetch_disable has shown up in current ("unreleased") code; will this help us, perhaps? (How we would set it is sketched in the PS below.) I've read about two layers of prefetch, one per vdev and one per disk. Given the ~1.2TB working set, 1GB of memory in the server and the "lots of one-shot file requests" access pattern, we'd like to disable as much readahead and data caching as possible (the chance of a data cache hit is very low). Keeping the DNLC stuff in memory would help, though.

Some URLs:

zfs_prefetch_disable being integrated:
http://dlc.sun.com/osol/on/downloads/current/on-changelog-20061103.html

zfs_prefetch_disable itself:
http://src.opensolaris.org/source/search?q=zfs_prefetch_disable&defs=&refs=&path=&hist=

Soft Track Buffer / Prefetch:
http://blogs.sun.com/roch/entry/the_dynamics_of_zfs

As far as I've been able to tell using mdb, this is already lowered in b48?
http://blogs.sun.com/roch/entry/tuning_the_knobs

Suggestions, ideas etc.?

/Tomas
-- 
Tomas Ögren, [EMAIL PROTECTED], http://www.acc.umu.se/~stric/
|- Student at Computing Science, University of Umeå
`- Sysadmin at {cs,acc}.umu.se
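
PS: For completeness, this is roughly what we did for the ncsize part. A minimal sketch, assuming the standard unix:dnlcstats kstat and the usual /etc/system syntax; exact names may differ between builds:

  # current ncsize in the running kernel
  echo "ncsize/D" | mdb -k

  # DNLC hit rate since boot (two ways to look at it)
  vmstat -s | grep 'name lookups'
  kstat -n dnlcstats

  # the line we added to /etc/system before rebooting
  set ncsize=600000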
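
And this is how we were planning to try zfs_prefetch_disable once we're on a build that actually has it. Again just a sketch, assuming it ends up as an ordinary 32-bit integer tunable in the zfs module (we haven't run this ourselves yet):

  # in /etc/system (takes effect after a reboot):
  set zfs:zfs_prefetch_disable=1

  # or poke the live kernel with mdb (0t1 = decimal 1):
  echo "zfs_prefetch_disable/W0t1" | mdb -kw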