> Yup, was an absolute nightmare to diagnose on top of everything else.
> Definitely doesn't happen in windows too.  I really want somebody to
> try snv_94 on a Thumper to see if you get the same behaviour there,
> or whether it's unique to Supermicro's Marvell card.

On a Thumper under S10U5 we recently had a hardware failure
of one disk. This caused all I/O to the entire 46-disk pool to hang,
and zpool status commands hung as well. Reset commands from the
service processor timed out, so the system had to be power cycled
manually. After that, booting took about 30 minutes. At that point
the bad disk could be unconfigured with cfgadm and then hot swapped
with a warranty replacement.
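
As a rough sketch of the cfgadm step described above: the attachment
point sata1/3, pool name tank and device c1t3d0 are made-up
placeholders, and the zpool replace step is an assumption that is not
taken from this report. The script only prints the commands rather
than running them.

```shell
#!/bin/sh
# Dry-run sketch: prints a hot-swap command sequence instead of executing it.
# The attachment point, pool and device names passed in are hypothetical.
swap_plan() {
    ap_id=$1
    pool=$2
    disk=$3
    echo "cfgadm -al"                     # list attachment points and their state
    echo "cfgadm -c unconfigure $ap_id"   # detach the failed disk from the OS
    echo "# physically swap the drive now"
    echo "cfgadm -c configure $ap_id"     # bring the replacement disk online
    echo "zpool replace $pool $disk"      # assumption: resilver onto the new disk in the same slot
    echo "zpool status $pool"             # check resilver progress
}

swap_plan sata1/3 tank c1t3d0
```

Check the real attachment point with cfgadm -al before unconfiguring
anything; the names above will not match your system.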

So it appears that bug 6735931 also affects the X4500 on disk
hardware failure, in a way that seriously impairs the entire system's
fault tolerance.

I would be willing to test any T-patch coming out soon....

I found this thread after seeing a total failure of a hot unplug
of a 1.5TB disk from a (different) newly assembled system with 3 AOC-SAT2-MV8
cards and 24 disks + one hot spare. After removing one disk
the entire system also froze, instead of initiating a resilver
onto the hot spare. Clearly the marvell88sx driver cannot handle
disk outages in any environment.
-- 
This message posted from opensolaris.org
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
