On Thu, Jun 02, 2011 at 11:39:13AM -0700, Donald Stahl wrote:
> > Yup; reset storms affected us as well (we were using the X-25 series
> > for ZIL/L2ARC). Only the ZIL drives were impacted, but it was a large
> > impact :)
> What did you see with your reset storm? Were there log errors in
> /var/adm/messages or did you need to check the controller logs with
> something like lsi util?

Yep, /var/adm/messages had Unit Attention errors. Ref:
http://markmail.org/message/5rmfzvqwlmosh2oh

> Did the reset workaround in the blog post help?

We re-architected before reading the blog post, so I'm unsure if it
would have helped or not. In any case, moving the SSDs internally lets
us use additional hot-swappable data disks, so it was beneficial in
other areas as well.

> The expanders you were using were SAS/SATA expanders? Or SAS expanders
> with adapters on the drive to allow the use of SATA disks?

The expander was a SuperMicro SAS-846EL1, which is a SAS expander but
has SFF-8482 connectors to provide compatibility with SATA drives.

> I've been using 4 X-25E's with Promise J610sD SAS shelves and the
> AAMUX adapters and have yet to have a problem.

It definitely seemed intermittent, and various suggestions we received
indicated we might need to downgrade our backplane/expander's firmware.
Never did try that, but it wouldn't surprise me if behavior was
better/worse on different backplanes...

> > Our solution was to move the SSD's off of the expander and remount
> > internally attached via one of the LSI SAS ports directly (we also had
> > problems with running the drives directly off the on-board SATA ports
> > on our SuperMicro motherboards -- occasionally the entire zpool would
> > freeze up).
>
> I'm surprised you had problems with the internal SATA ports as well-
> any idea what was causing the problems there?

Nope. I posted this:
http://mail.opensolaris.org/pipermail/zfs-discuss/2010-October/045625.html

But got no responses. We resolved the NFS errors (which I believe were
coincidental), but the watchdog port issues kept recurring without
rhyme or reason. The box itself wouldn't lock up, but the zpool would
become non-responsive and we'd have to hard reset. This was all
production stuff, so as soon as we were able to, we ditched the SATA
ports entirely instead of pursuing a fix with Sun.

Ray