On Wed, Dec 06, 2006 at 01:11:06PM -0600, Mike McCarty wrote:
> Andrew Sackville-West wrote:
> >On Wed, Dec 06, 2006 at 02:52:29PM +0100, Johannes Wiedersich wrote:
> >
> >>Question: how likely is it that both disks develop bad blocks, while
> >>none of them is damaged? I'm no expert on this, but I guess a better
> >>strategy might be to rotate backups on two disks, and use (and check:
> >>fsck and smartctl) them regularly.
> >
> >if the chance of a disk failure is (say) 1% in the time allotted, then
> >the chance of having a failure with two disks is 2%. The chance of any one
>
> I don't follow this reasoning. Are you presuming independence of the
> failures and identical probabilities? If so, then this is the way to
> compute it:
>
> Let p be the probability of failure of each disc, independently of the
> other. There are four mutually exclusive events which comprise the
> space. Both discs may fail [Pr = p^2]. The first disc may fail, while
> the second does not [Pr = p(1-p)]. The second disc may fail, while the
> first does not [Pr = (1-p)p]. Both discs may survive [Pr = (1-p)(1-p)].
>
> So, the probability that at least one disc fails is 1-(1-p)(1-p).
> For p = 0.01, that is 0.0199.
>
> I'll grant you this is not markedly different from 2%, but it is also
> not simply 2p.
>
> >particular disk failing is still 1%; it's the odds of A failure in the
> >system as a whole that goes up. So with more disks you're more likely
> >to have failures of some kind, but the per-disk failure stays the same
> >and the odds of losing ALL of them goes the other way. The odds of
> >losing BOTH disks is .1%. The question becomes, which one has
> >failed...
>
> I don't follow this reasoning. The probability of both discs failing
> (if they do so independently) is not 0.1%, but rather 0.01%. A partially
> failed disc is usually easy to detect, since they have FEC on them. A
> completely failed disc is even easier to detect :-)
>
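For concreteness, the arithmetic above can be checked with a few lines of Python. This is only a sketch assuming independent, identical per-disc failure probabilities, with p = 0.01 as an illustrative figure rather than a measured drive statistic:

    # Sketch only: assumes each disc fails independently with the same
    # probability p over the period in question; p = 0.01 is illustrative.
    p = 0.01

    at_least_one = 1 - (1 - p) ** 2   # 0.0199 -- close to, but not exactly, 2p
    both_fail    = p ** 2             # 0.0001 -- i.e. 0.01%, not 0.1%

    print("P(at least one of two discs fails) =", at_least_one)
    print("P(both discs fail)                 =", both_fail)
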
Mike,

Without expending any mathematical energy, could you recompute your two
probabilities based on a set of three disks instead of two? I'm guessing that
the probability of one disk failing goes up, but the probability of all three
failing drops substantially (the famous triple-redundancy theory).

I'm assuming that a partially failed disk will return good data (because of
the FEC) and that an error notice ends up in syslog (do you know the
severity)?

How does a raid1 array handle a partially failing disk? Does it just take the
good data and carry on until the drive completely fails, or does mdadm also
get involved in issuing a warning of a failing drive?

Thanks,
Doug.
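The same assumptions extend directly to the three-disk question; a rough sketch (again with an illustrative p = 0.01 and independence assumed, not measured drive data) gives:

    # Same assumptions, extended to three independent discs.
    p = 0.01

    at_least_one_of_three = 1 - (1 - p) ** 3   # ~0.0297 -- rises with more discs
    all_three_fail        = p ** 3             # 1e-06   -- falls off sharply

    print("P(at least one of three discs fails) =", at_least_one_of_three)
    print("P(all three discs fail)              =", all_three_fail)

Under these assumptions, the chance of seeing some failure grows roughly linearly with the number of disks, while the chance of losing every copy at once falls off geometrically, which is the intuition behind triple redundancy.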