On Tue, 18 Jul 2006, Al Hopper wrote:
On Tue, 18 Jul 2006, Daniel Rock wrote:
Richard Elling wrote:
Jeff Bonwick wrote:
For 6 disks, 3x2-way RAID-1+0 offers better resiliency than RAID-Z
or RAID-Z2.
Maybe I'm missing something, but it ought to be the other way around.
With 6 disks, RAID-Z2 can tolerate any two disk failures, whereas
for 3x2-way mirroring, of the (6 choose 2) = 6*5/2 = 15 possible
two-disk failure scenarios, three of them are fatal.
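To make the two-disk count concrete, here is a minimal sketch that enumerates every pair of failed disks. The disk numbering and the pairing of disks into mirrors are assumptions made for illustration; nothing in the thread specifies a layout.

```python
from itertools import combinations

# Assumed layout: disks 0-5, mirrored as the pairs (0,1), (2,3), (4,5).
MIRRORS = [(0, 1), (2, 3), (4, 5)]

def mirror_survives(failed):
    """3x2-way mirroring survives unless both disks of some mirror have failed."""
    return all(not (a in failed and b in failed) for a, b in MIRRORS)

# Every way of losing exactly two of the six disks.
two_disk_failures = [set(c) for c in combinations(range(6), 2)]

# The fatal cases are the ones that take out a whole mirror.
fatal = [f for f in two_disk_failures if not mirror_survives(f)]

print(len(two_disk_failures), "two-disk scenarios,", len(fatal), "fatal for mirroring")
```

RAID-Z2 survives all of these scenarios by definition, since it tolerates any two failures.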
For the 6-disk case, with RAID-1+0 you get 27/64 surviving states
versus 22/64 for RAID-Z2. This accounts for the cases where you could
lose 3 disks and survive with RAID-1+0.
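The 27/64 and 22/64 figures can be checked by brute force over all 2^6 up/down states. This is only a sketch: the mirror pairing is an assumed layout, and every state is counted equally, with no weighting by how likely it is to occur.

```python
from itertools import product

# Assumed layout: disks 0-5, mirrored as the pairs (0,1), (2,3), (4,5).
MIRRORS = [(0, 1), (2, 3), (4, 5)]

def raid10_survives(alive):
    """RAID-1+0 survives if every mirror has at least one working disk."""
    return all(alive[a] or alive[b] for a, b in MIRRORS)

def raidz2_survives(alive):
    """RAID-Z2 survives any state with at most two failed disks."""
    return alive.count(False) <= 2

# Each state is a tuple of six booleans: True = disk working, False = failed.
states = list(product([True, False], repeat=6))

raid10_ok = sum(raid10_survives(s) for s in states)
raidz2_ok = sum(raidz2_survives(s) for s in states)

print(f"RAID-1+0: {raid10_ok}/{len(states)}  RAID-Z2: {raidz2_ok}/{len(states)}")
```

The extra surviving states for RAID-1+0 come from three-disk failures that happen to hit one disk in each mirror, which RAID-Z2 can never survive.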
I think this type of calculation is flawed. Disk failures are rare and
multiple disk failures at the same time are even more rare.
Stop right here! :) If you have a large number of identical disks which
operate in the same environment[1], and possibly the same enclosure, it's
quite likely that you'll see 2 or more disks die within the same,
relatively short, timeframe.
Also, with today's higher-density disk enclosures, a fan failure that
goes unnoticed for a period of time is likely to affect more than one
drive - again leading to multiple disks failing in the same general
timeframe.
This is also why I advocate having cold spares available - so that the
probability of the spare failing within the same timeframe is greatly
diminished.
A good SMART implementation combined with a decent sensor framework can
also be useful for dealing with these conditions. Smartmontools is
currently able to send e-mail when the ambient temperature of a disk
drive goes beyond the recommended thresholds. I am hopeful the Solaris
SMART implementation will take temperature into account, since modern
disk drives run hot, and fan failures aren't all that uncommon.
- Ryan
--
UNIX Administrator
http://prefetch.net