Al Hopper wrote:
[1] Using MTTDL = MTBF^2 / (N * (N-1) * MTTR)

But ... I'm not sure I buy into your numbers, given the probability that
more than one disk will fail inside the service window - since the disks
are identical.  Or ... a disk failure occurs at 5:01 PM (quitting time)
on a Friday and won't be replaced until 8:00 AM on Monday morning.
Does the failure data you have access to support my hypothesis that
failures of identical mechanical systems tend to occur in small clusters
within a relatively small window of time?

Separating the right-hand side:
        MTTDL = (MTBF/N) * (MTBF/((N-1) * MTTR))

the right-most factor is the inverse of the probability that one of the
remaining N-1 disks fails during the recovery window for the first disk's
failure.  As the MTTR increases, the probability of a second disk failure
also increases.
RAIDoptimizer calculates the MTTR as:
        MTTR = service response time + resync time
where
        resync time = size * space used (%) / resync rate
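
For concreteness, here is a minimal sketch of that arithmetic in Python.
The MTBF, disk count, disk size, service response time, and resync rate
below are made-up placeholders for illustration, not RAIDoptimizer's
actual defaults.

    # All figures below are hypothetical, for illustration only.
    MTBF = 1_000_000     # per-disk MTBF in hours (hypothetical vendor figure)
    N = 8                # disks in the single-parity group

    def compute_mttr(service_hours, size_gb, space_used, resync_gb_per_hr):
        """MTTR = service response time + resync time,
        where resync time = size * space used (%) / resync rate."""
        return service_hours + size_gb * space_used / resync_gb_per_hr

    def compute_mttdl(mtbf_hours, n_disks, mttr_hours):
        """MTTDL = MTBF^2 / (N * (N-1) * MTTR)"""
        return mtbf_hours ** 2 / (n_disks * (n_disks - 1) * mttr_hours)

    # e.g. next-business-day service, 500 GB disk half full, 200 GB/hr resync
    mttr = compute_mttr(service_hours=24, size_gb=500,
                        space_used=0.5, resync_gb_per_hr=200)
    mttdl = compute_mttdl(MTBF, N, mttr)
    print(f"MTTR  = {mttr:.2f} hours")
    print(f"MTTDL = {mttdl / (24 * 365):,.0f} years")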

Incidentally, since ZFS schedules the resync iops itself, it can really
move along on a mostly idle system.  You should be able to resync at near
the media speed on an idle system.  By contrast, hardware RAID arrays have
no knowledge of the context of the data or of the I/O scheduling, so they
perform resyncs using a throttle.  Not only do they end up resyncing unused
space, but they also take a long time (4-18 GBytes/hr for some arrays) and
thus expose you to a higher probability of a second disk failure.
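
To put rough numbers on that exposure, here is an illustrative sketch.  It
assumes independent, exponentially distributed failures; the disk size,
space used, MTBF, and near-media-speed rate are assumptions made for the
example, and only the 4-18 GBytes/hr throttled range comes from the text
above.

    import math

    MTBF_HOURS = 1_000_000   # hypothetical per-disk MTBF
    N = 8                    # disks in the group
    SIZE_GB = 500            # hypothetical disk size
    SPACE_USED = 0.5         # ZFS resyncs only the space in use

    def p_second_failure(resync_hours, n=N, mtbf=MTBF_HOURS):
        """P(at least one of the remaining N-1 disks fails during the
        resync window), assuming independent exponential failure times."""
        return 1.0 - math.exp(-(n - 1) * resync_hours / mtbf)

    # ZFS, idle system: used space only, near media speed (assume 200 GB/hr)
    zfs_hours = SIZE_GB * SPACE_USED / 200
    # Throttled array: whole disk at 10 GBytes/hr (middle of the 4-18 range)
    hw_hours = SIZE_GB / 10

    for label, hours in (("ZFS on idle system", zfs_hours),
                         ("throttled HW array", hw_hours)):
        print(f"{label}: resync {hours:.1f} h, "
              f"P(2nd failure) = {p_second_failure(hours):.1e}")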

Call me paranoid, but I'd prefer to see a product like Thumper configured
with 50% of the disks manufactured by vendor A and the other 50% by someone
else.

Diversity is usually a good thing.  Unfortunately, this is often impractical
for a manufacturer.

This paranoia is based on a personal experience, many years ago (before we
had smart fans, etc.), where we had a rack full of expensive custom
equipment cooled by what we thought was a highly redundant group of 5
fans.  One fan suffered infant mortality and its failure went unnoticed,
leaving 4 fans running.  Two of the remaining fans died on the same extended
weekend (a public holiday).  It was an expensive and embarrassing disaster.

Modelling such as this assumes independence of failures.  Common-cause
failures or bad lots are not that hard to model, but you may never find any
failure rate data for them.  You can look at the MTBF sensitivities, though
that opens up another set of results.  I prefer to ignore the absolute values
and judge competing designs by their relative results.  To wit, I fully
expect to be beyond dust in 150,767 years, and the expected lifetime of
most disks is 5 years.  But given two competing designs using the same
model, a design predicting an MTTDL of 150,767 years will very likely
demonstrate better MTTDL than one predicting 68,530 years.
 -- richard