Hi Charles,

There are quite a few bugs in b134 that can lead to this. Alas, due to the new regime, there was a period of time where the distributions were not being delivered. If I were in your shoes, I would upgrade to OpenIndiana b147, which has 26 weeks of maturity and bug fixes over b134.
http://www.openindiana.org
 -- richard

On Sep 23, 2010, at 2:48 PM, Charles J. Knipe wrote:

> So, I'm still having problems with intermittent hangs on write with my ZFS
> pool. Details from my original post are below. Since posting that, I've
> gone back and forth with a number of you, and gotten a lot of useful advice,
> but I'm still trying to get to the root of the problem so I can correct it.
> Since the original post I have:
>
> -Gathered a great deal of information in the form of kernel thread dumps,
> zio_state dumps, and live crash dumps while the problem is happening.
> -Been advised that my ruling out of dedupe was probably premature, as I still
> likely have a good deal of deduplicated data on-disk.
> -Checked just about every log and counter that might indicate a hardware
> error, without finding one.
>
> I was wondering at this point if someone could give me some pointers on the
> following:
> 1. Given the dumps and diagnostic data I've gathered so far, is there a way I
> can determine for certain where in the ZFS driver I'm spending so much time
> hanging? At the very least I'd like to try to determine whether it is,
> in fact, a deduplication issue.
> 2. If it is, in fact, a deduplication issue, would my only recourse be a new
> pool and a send/receive operation? The data we're storing is VMFS volumes
> for ESX. We're tossing around the idea of creating new volumes in the same
> pool (now that dedupe is off) and migrating VMs over in small batches. The
> theory is that we would be writing non-deduped data this way, and when we
> were done we could remove the deduplicated volumes. Is this sound?
>
> Thanks again for all the help!
>
> -Charles
>
>> Howdy,
>>
>> We're having a ZFS performance issue over here that I
>> was hoping you guys could help me troubleshoot. We
>> have a ZFS pool made up of 24 disks, arranged into 7
>> raid-z devices of 4 disks each. We're using it as an
>> iSCSI back-end for VMWare and some Oracle RAC
>> clusters.
>>
>> Under normal circumstances performance is very good
>> both in benchmarks and under real-world use. Every
>> couple days, however, I/O seems to hang for anywhere
>> between several seconds and several minutes. The
>> hang seems to be a complete stop of all write I/O.
>> The following zpool iostat illustrates:
>>
>> pool0       2.47T  5.13T    120      0   293K      0
>> pool0       2.47T  5.13T    127      0   308K      0
>> pool0       2.47T  5.13T    131      0   322K      0
>> pool0       2.47T  5.13T    144      0   347K      0
>> pool0       2.47T  5.13T    135      0   331K      0
>> pool0       2.47T  5.13T    122      0   295K      0
>> pool0       2.47T  5.13T    135      0   330K      0
>>
>> While this is going on our VMs all hang, as do any
>> "zfs create" commands or attempts to touch/create
>> files in the zfs pool from the local system. After
>> several minutes the system "un-hangs" and we see very
>> high write rates before things return to normal
>> across the board.
>>
>> Some more information about our configuration: We're
>> running OpenSolaris snv_134. ZFS is at version 22.
>> Our disks are 15k RPM 300 GB Seagate Cheetahs, mounted
>> in Promise J610S Dual enclosures, hanging off a Dell
>> SAS 5/e controller. We'd tried out most of this
>> configuration previously on OpenSolaris 2009.06
>> without running into this problem. The only thing
>> that's new, aside from the newer OpenSolaris/ZFS, is
>> a set of four SSDs configured as log disks.
>>
>> At first we blamed de-dupe, but we've disabled that.
>> Next we suspected the SSD log disks, but we've seen
>> the problem with those removed, as well.
>>
>> Has anyone seen anything like this before? Are there
>> any tools we can use to gather information during the
>> hang which might be useful in determining what's
>> going wrong?
>>
>> Thanks for any insights you may have.
>>
>> -Charles
> --
> This message posted from opensolaris.org
> _______________________________________________
> zfs-discuss mailing list
> zfs-discuss@opensolaris.org
> http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

--
OpenStorage Summit, October 25-27, Palo Alto, CA
http://nexenta-summit2010.eventbrite.com

ZFS and performance consulting
http://www.RichardElling.com
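
The hang in the quoted zpool iostat output shows up as a run of samples with zero write operations while reads continue. The following is a minimal Python sketch, not part of the original thread, that flags such runs in captured "zpool iostat <pool> <interval>" output so the timing of a stall can be lined up with dumps taken during it. It assumes the seven-column layout seen in the post (pool, used, free, read ops, write ops, read bandwidth, write bandwidth). The names parse_line and find_write_stalls and the min_samples threshold are hypothetical choices for illustration.

```python
from typing import Iterable, List, NamedTuple, Optional, Tuple


class Sample(NamedTuple):
    pool: str
    read_ops: float
    write_ops: float


def _to_count(field: str) -> float:
    """Convert a count such as '120' or '1.2K' to a number.

    The exact scale factor does not matter here, because the stall
    check only distinguishes zero from non-zero.
    """
    units = {"K": 1024.0, "M": 1024.0 ** 2, "G": 1024.0 ** 3, "T": 1024.0 ** 4}
    if field and field[-1].upper() in units:
        return float(field[:-1]) * units[field[-1].upper()]
    return float(field)


def parse_line(line: str) -> Optional[Sample]:
    """Parse one data row; return None for headers, separators, blanks."""
    fields = line.split()
    if len(fields) != 7:  # assumed layout: pool used free rops wops rbw wbw
        return None
    try:
        return Sample(fields[0], _to_count(fields[3]), _to_count(fields[4]))
    except ValueError:
        return None  # header or separator row with seven columns


def find_write_stalls(lines: Iterable[str],
                      min_samples: int = 3) -> List[Tuple[int, int]]:
    """Return (first_sample_index, length) for each run of at least
    min_samples consecutive samples with reads but no writes."""
    stalls: List[Tuple[int, int]] = []
    run_start: Optional[int] = None
    index = 0
    for line in lines:
        sample = parse_line(line)
        if sample is None:
            continue
        if sample.write_ops == 0 and sample.read_ops > 0:
            if run_start is None:
                run_start = index
        else:
            if run_start is not None and index - run_start >= min_samples:
                stalls.append((run_start, index - run_start))
            run_start = None
        index += 1
    # Close a run that is still open at the end of the capture.
    if run_start is not None and index - run_start >= min_samples:
        stalls.append((run_start, index - run_start))
    return stalls
```

Passing the seven rows from the original post to find_write_stalls reports a single run covering all seven samples. With a one-second sampling interval, the run length is roughly the stall duration in seconds. A run of zero writes alone does not say whether dedup is the cause; it only marks when to look at the thread and zio dumps.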