On Sep 7, 2009, at 1:32 AM, James Lever <j...@jamver.id.au> wrote:
On 07/09/2009, at 10:46 AM, Ross Walker wrote:
zpool is a RAIDZ2 comprised of 10 * 15kRPM SAS drives behind an LSI
1078 w/ 512MB BBWC exposed as RAID0 LUNs (Dell MD1000 behind PERC
6/E), with 2x SSDs, each partitioned as a 10GB slog with the
remaining 36GB as l2arc, behind another LSI 1078 w/ 256MB BBWC (Dell
R710 server with PERC 6/i).
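In zpool terms, that is roughly the following (a sketch only; the
pool name and device names are illustrative, and I'm glossing over
the exact slice layout on the SSDs):

    zpool create tank \
        raidz2 c1t0d0 c1t1d0 c1t2d0 c1t3d0 c1t4d0 \
               c1t5d0 c1t6d0 c1t7d0 c1t8d0 c1t9d0 \
        log c2t0d0s0 c2t1d0s0 \
        cache c2t0d0s1 c2t1d0s1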
This config might lead to heavy sync writes (NFS) starving reads,
because the whole RAIDZ2 behaves as a single disk on writes. How
about two 5-disk RAIDZ2s or three 4-disk RAIDZs? Just one or two
more vdevs to spread the load can make a world of difference.
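To put rough numbers on it (assuming ~160 random write IOPS per 15k
SAS spindle, a typical figure rather than one measured on your
hardware): each RAIDZ/RAIDZ2 vdev delivers roughly one disk's worth
of random IOPS, so the pool scales with the number of top-level
vdevs, not the number of disks.

    1 x 10-disk RAIDZ2: 1 vdev  x ~160 = ~160 write IOPS
    2 x  5-disk RAIDZ2: 2 vdevs x ~160 = ~320 write IOPS
    3 x  4-disk RAIDZ:  3 vdevs x ~160 = ~480 write IOPS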
This was a management decision. I wanted to go down the striped
mirrored pairs route, but the amount of space lost was considered
too great. RAIDZ2 was considered the best value option for our
environment.
Well, an MD1000 holds 15 drives, so a good compromise might be two
7-drive RAIDZ2s with a hot spare... That should provide around 320
write IOPS instead of 160, a big difference.
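A minimal sketch of that layout (again, pool and device names are
placeholders):

    # two 7-drive RAIDZ2 vdevs plus one hot spare = 15 drives
    zpool create tank \
        raidz2 c1t0d0 c1t1d0 c1t2d0 c1t3d0 c1t4d0 c1t5d0 c1t6d0 \
        raidz2 c1t7d0 c1t8d0 c1t9d0 c1t10d0 c1t11d0 c1t12d0 c1t13d0 \
        spare c1t14d0

You give up four drives to parity instead of two, but you double the
number of vdevs the writes are spread across.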
The system is configured as an NFS (currently serving NFSv3), iSCSI
(COMSTAR) and CIFS (using the Sun SFW package running Samba 3.0.34)
server, with authentication taking place against a remote OpenLDAP
server.
There are a lot of services here, all off one pool? You might be
trying to bite off more than the config can chew.
That’s not a lot of services, really. We have 6 users doing builds
on multiple platforms and using the storage for their home
directories (Windows and Unix).
Ok, six users, but what happens during a build?
The issue is interactive responsiveness, and whether there is a way
to tune the system to preserve it while still having good performance
when builds are run.
Look at the write IOPS of the pool with zpool iostat -v and see how
many are happening on the RAIDZ2 vdev.
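For example (tank is a placeholder for your pool name):

    # per-vdev I/O, sampled every 5 seconds while a build runs
    zpool iostat -v tank 5

Watch the write operations column for the raidz2 vdev and for the
log devices. Since the NFS sync writes go through the ZIL, a quick
DTrace sketch (assuming the fbt provider, and that the probe name
matches your build) can also show how much commit pressure the
builds generate:

    # count zil_commit() calls per process over 10 seconds
    dtrace -n 'fbt::zil_commit:entry { @[execname] = count(); }
        tick-10s { exit(0); }'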
Try taking a particularly bad problem workstation and configuring it
with static mounts for a bit to see if the automounter is at fault.
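On a Solaris client that could be as simple as (server name and
paths are placeholders):

    # one-off test mount, bypassing the automounter
    mount -F nfs -o vers=3 server:/export/home/user /home/user

    # or a persistent line in /etc/vfstab:
    server:/export/home/user - /home/user nfs - yes vers=3

If the stalls disappear with the static mount, the automounter is
implicated; if they persist, it isn't.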
That has been considered too, but the issue has also been observed
locally on the fileserver.
Then I suppose you have eliminated the automounter as a culprit at
this point.
That doesn't make a lot of sense to me; the L2ARC is a secondary
read cache, so if writes are starving reads then the L2ARC would
only help here.
I was suggesting that slog writes were possibly starving reads from
the l2arc, as they were on the same device. This appears not to have
been the issue, as the problem has persisted even with the l2arc
devices removed from the pool.
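(Cache devices can be removed from a live pool, so this was a cheap
test; the device names below are placeholders for the l2arc slices:

    # drop the l2arc slices, leaving the slog slices in place
    zpool remove tank c2t0d0s1 c2t1d0s1

The slogs can't be removed the same way on this release, so the
reverse test isn't as easy.)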
The SSD will handle a lot more IOPS than the pool, and the L2ARC is
a lazy reader; it mostly just holds on to read cache data.
It may just be that the pool configuration can't handle the write
IOPS needed and reads are starving.
Possible, but hard to tell. Have a look at the iostat results I’ve
posted.
The busy times of the disks while the issue is occurring should let
you know.
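Something like:

    # extended device stats, 5-second samples, while the stall occurs
    iostat -xn 5

If the data disks show %b near 100 with high wsvc_t/asvc_t while
read ops sit low, that points at writes saturating the vdev.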
-Ross
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss