Hello.

We're currently using a Sun Blade 1000 (2x 750 MHz CPUs, 1 GB RAM, 2x
160 MB/s mpt SCSI buses, skge GigE network) as an NFS backend with ZFS
for distribution of free software such as Debian (cdimage.debian.org,
ftp.se.debian.org), and we have run into some performance issues.

We are running Solaris Express snv_48 and have been running a raidz2 of
7x300 GB disks for a while now; we just added a second 7x300 GB raidz2
today, but I'll stick to the older setup's numbers below. We tried
Solaris 10 U2 before, but NFS writes killed every bit of performance;
snv_48 works much better in that regard.

The working data set is about 1.2 TB across ~550k inodes right now. The
backend serves data to 2-4 Linux frontends running Apache (with
mod_disk_cache on local RAID 0), rsync (which walks the entire Debian
trees every now and then) and vsftpd (not used much).

There are (at least) two kinds of performance issues we've run into:

1. DNLC through ZFS doesn't seem to respect ncsize.

The filesystem currently has ~550k inodes, and large portions of it are
frequently walked with rsync (over NFS). mdb said ncsize was about 68k
and 'vmstat -s' reported a hit rate of ~30%, so I set ncsize to 600k
and rebooted. That didn't seem to change much: the hit rate stays at
about the same level, and a manual find(1) doesn't appear to get cached
much either (according to vmstat and dnlcsnoop.d).
When booting, the following messages came up; I'm not sure whether they matter:
NOTICE: setting nrnode to max value of 351642
NOTICE: setting nrnode to max value of 235577
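
For reference, this is roughly how ncsize was set and checked afterwards
(the /etc/system line is typed from memory, so the exact syntax may be
slightly off):

  # in /etc/system, then reboot:
  set ncsize = 600000

  # verify the value and watch the DNLC counters afterwards:
  echo 'ncsize/D' | mdb -k
  vmstat -s | grep 'name lookups'
  kstat -p -n dnlcstats | egrep 'hits|misses'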

Is there a separate ZFS DNLC knob to adjust for this? My wild guess is
that ZFS has its own implementation, integrated with the rest of the
ZFS cache, which throws out cached metadata in favour of cached data...
or something.


2. Readahead or something is killing all signs of performance

Since there can be quite a few requests in flight at the same time,
we're having issues with readahead. Typical numbers are 7x13 MB/s being
read from disk according to 'iostat -xnzm 5' and 'zpool iostat -v 5',
while maybe 5 MB/s is being sent back over the network. That means
roughly 20x more data is read from disk than is actually used. When
testing single streams, the readahead helps and data isn't thrown away,
but when a bazillion NFS requests arrive at once, ZFS reads far too
much compared to what was actually requested and delivered.
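
For reference, something like this DTrace one-liner over the io
provider should show the raw bytes read per device every five seconds,
if anyone wants to cross-check the iostat numbers (just a sketch, not
tested on this exact box):

  dtrace -n 'io:::start /args[0]->b_flags & B_READ/
      { @[args[1]->dev_statname] = sum(args[0]->b_bcount); }
    tick-5sec { printa(@); trunc(@); }'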

I saw some mention of zfs_prefetch_disable in current ("unreleased")
code; might that help us? I've read that there are two layers of
prefetch, one at the file level and one at the vdev/device level. Given
a working set of about 1.2 TB, 1 GB of memory in the server and the
"lots of one-shot file requests" nature of the workload, we'd like to
disable as much readahead and data caching as possible (since the
chance of a positive data-cache hit is very low). Keeping the DNLC
contents in memory would help, though.
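
If it does help, I assume it would be toggled roughly like this on a
build that has the variable (going by how other ZFS tunables are
usually flipped, so correct me if this one is different):

  # live, via mdb:
  echo 'zfs_prefetch_disable/W0t1' | mdb -kw

  # or persistently, in /etc/system:
  set zfs:zfs_prefetch_disable = 1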


Some URLs:

zfs_prefetch_disable being integrated:
http://dlc.sun.com/osol/on/downloads/current/on-changelog-20061103.html

zfs_prefetch_disable itself:
http://src.opensolaris.org/source/search?q=zfs_prefetch_disable&defs=&refs=&path=&hist=

Soft Track Buffer / Prefetch:
http://blogs.sun.com/roch/entry/the_dynamics_of_zfs

As far as I've been able to tell using mdb, this has already been lowered in b48? (A sketch of the mdb check is below the URL.)
http://blogs.sun.com/roch/entry/tuning_the_knobs
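
For reference, the vdev cache ("soft track buffer") knobs seem to live
in vdev_cache.c; something like this should show their current values,
assuming the variable names are the same in b48:

  echo 'zfs_vdev_cache_bshift/D' | mdb -k
  echo 'zfs_vdev_cache_max/D' | mdb -k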


Suggestions, ideas, etc.?


/Tomas
-- 
Tomas Ögren, [EMAIL PROTECTED], http://www.acc.umu.se/~stric/
|- Student at Computing Science, University of Umeå
`- Sysadmin at {cs,acc}.umu.se