On Oct 8, 2010, at 4:33 AM, Edward Ned Harvey wrote:

>> From: Peter Jeremy [mailto:peter.jer...@alcatel-lucent.com]
>> Sent: Thursday, October 07, 2010 10:02 PM
>> 
>> On 2010-Oct-08 09:07:34 +0800, Edward Ned Harvey <sh...@nedharvey.com>
>> wrote:
>>> If you're going raidz3, with 7 disks, then you might as well just make
>>> mirrors instead, and eliminate the slow resilver.
>> 
>> There is a difference in reliability:  raidzN means _any_ N disks can
>> fail, whereas mirror means one disk in each mirror pair can fail.
>> With a mirror, Murphy's Law says that the second disk to fail will be
>> the pair of the first disk :-).
> 
> Maybe.  But in reality, you're just guessing the probability of a single
> failure, the probability of multiple failures, and the probability of
> multiple failures within the critical time window and critical redundancy
> set.
> 
> The probability of a 2nd failure within the critical time window is smaller
> whenever the critical time window is decreased, and the probability of that
> failure being within the critical redundancy set is smaller whenever your
> critical redundancy set is smaller.  So if raidz2 takes twice as long to
> resilver than a mirror, and has a larger critical redundancy set, then you
> haven't gained any probable resiliency over a mirror.
> 
> Although it's true with mirrors, it's possible for 2 disks to fail and
> result in loss of pool, I think the probability of that happening is smaller
> than the probability of a 3-disk failure in the raidz2.
> 
> How much longer does a 7-disk raidz2 take to resilver as compared to a
> mirror?  According to my calculations, it's in the vicinity of 10x longer.  
> 

This article has been posted elsewhere, is about 10 months old, but is a good 
read:

http://queue.acm.org/detail.cfm?id=1670144



Really, there should be a ballpark / back of the napkin formula to be able to 
calculate this?  I've been curious about this too, so here goes a 1st cut...



DR = disk reliability, in terms of chance of the disk dying in any given time 
period, say any given hour?

DFW = disk full write - time to write every sector on the disk.  This will vary 
depending on system load, but is still an input item that can be determined by 
some testing.


RSM = resilver time for a mirror of two of the given disks
RSZ1 = resilver time for a raidz1 vdev of the given disks
RSZ2 = resilver time for a raidz2 vdev of the given disks


chances of losing all data in a mirror: DLM = RSM * DR.
chances of losing all data in a raidz1: DLRZ1 = RSZ1 * DR.
chances of losing all data in a raidz2: DLRZ2 = RSZ2 * DR * DR



Now, for the above, I'll make some other assumptions...


Let's just guess at a 1-year MTBF for our disks and, for purposes here, just 
flat-line that as a constant chance of failure per hour throughout the year.

Let's presume rebuilding a mirror takes one hour.
Let's presume that a 7-disk raidz1 takes 24 times longer to rebuild one disk 
than a mirror. I think this would be a 'safe' ratio, to the benefit of the 
mirror.
Let's presume that a 7-disk raidz2 takes 72 times longer to rebuild one disk 
than a mirror. This should be 'safe' and again to the benefit of the mirror.




DR for a one-hour period = 1 / (24 hours * 365 days) = .000114 - chance a disk 
might die in any given hour.


DLM = one hour * DR = .000114

DLRZ1 = 24 hours * DR = 24 * (.000114 * 6) = .0164 ( x6 because there are six 
more drives in the pool, and any one of them could fail)

DLRZ2 = 72 hours * DR * DR = 72 * (.000114 * 6 disks) * (.000114 * 5 disks) 
= .000028 = a much tinier chance of losing all that data.
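
As a rough check of the arithmetic above, here is a minimal Python sketch. It assumes the same flat-line model used in this post (1-year MTBF, 7-disk raidz vdevs, and the guessed 1 / 24 / 72 hour resilver times). The function and variable names are made up for illustration, and the formula is only the napkin approximation above, not proper statistics.

```python
HOURS_PER_YEAR = 24 * 365

# DR: assumed flat-line chance that one disk dies in any given hour (1-year MTBF).
DR = 1 / HOURS_PER_YEAR


def loss_chance(resilver_hours, survivors, extra_failures):
    """Napkin chance of losing the vdev while a resilver is running.

    survivors: disks still running after the first failure.
    extra_failures: how many more of them must fail to lose the vdev.
    Approximation from the post: window * product of (DR * disks remaining).
    """
    chance = resilver_hours
    for i in range(extra_failures):
        chance *= DR * (survivors - i)
    return chance


DLM = loss_chance(1, survivors=1, extra_failures=1)     # 2-way mirror
DLRZ1 = loss_chance(24, survivors=6, extra_failures=1)  # 7-disk raidz1
DLRZ2 = loss_chance(72, survivors=6, extra_failures=2)  # 7-disk raidz2

print(f"DLM   = {DLM:.6f}")
print(f"DLRZ1 = {DLRZ1:.6f}")
print(f"DLRZ2 = {DLRZ2:.6f}")
```

Changing the resilver hours or the disk counts in the three calls is the easiest way to try other guesses.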





A better way to think about it maybe....

Based on our 1-year flat-line MTBF for disks, to figure out how much faster the 
mirror must rebuild for reliability to be the same as a raidz2...

DLM = DLRZ2

.000114 * 1 hour = X hours * (.000114 * 6 disks) * (.000114 * 5 disks)

X = 1 / ((.000114 * 6 disks) * 5 disks)

X = 1 / .00342 = about 292

So, the mirror would have to resilver roughly three hundred times faster than 
the raidz2 in order for it to offer the same level of reliability with 
regard to the chances of losing the entire vdev due to additional disk 
failures during a resilver?
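
The break-even point can be checked with a few lines of Python. This is only a sketch under the same assumptions as above (flat-line DR from a 1-year MTBF, a one-hour mirror resilver, 6 and then 5 surviving disks in the 7-disk raidz2); the variable names are illustrative.

```python
# Assumed flat-line chance of one disk dying in any given hour (1-year MTBF).
DR = 1 / (24 * 365)

mirror_hours = 1  # assumed mirror resilver time

# Solve  mirror_hours * DR = X * (DR * 6) * (DR * 5)  for X.
X = (mirror_hours * DR) / ((DR * 6) * (DR * 5))

print(f"raidz2 could take about {X:.0f} hours to resilver and still match the mirror")
```

With the unrounded DR this comes out at 292 hours, in line with the "roughly three hundred times" figure.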





The governing thing here is that raidz2 gets an O(2) level of reliability (two 
more disks have to fail during the resilver, so the chance goes as DR * DR), 
vs. O(1) for mirrors and raidz1 (one more failure, so the chance goes as DR)?

Note that the above is O(2) for raidz2 and O(1) for mirror/raidz1 because we 
are working on the assumption that we have already lost one disk.

With raidz3, we would gain a further 1 / (.000114 * 4 disks remaining in pool), 
or about 2,000 times more reliability?
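
A quick Python sketch of that last factor, again assuming the flat-line DR from a 1-year MTBF and 4 disks still running once three of the seven have failed. The helper name is made up for illustration.

```python
# Assumed flat-line chance of one disk dying in any given hour (1-year MTBF).
DR = 1 / (24 * 365)


def extra_parity_factor(disks_remaining):
    """How much one more required failure divides the napkin loss chance."""
    return 1 / (DR * disks_remaining)


raidz3_vs_raidz2 = extra_parity_factor(4)
print(f"raidz3 is about {raidz3_vs_raidz2:.0f} times less likely to be lost than raidz2")
```

With the unrounded DR the factor is 2,190, assuming the same resilver time for both layouts.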




Now, the above does not include proper statistics: the chances of that 2nd 
and 3rd disk failing may be higher than our 'flat-line' %/hr. based on a 
1-year MTBF, since failures can be correlated. For example, if all the disks 
were purchased in the same lot at the same time, their chances of failing 
around the same time are higher, etc.

_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
