User Name wrote:
> I am building a 14 disk raid 6 array with 1 TB seagate AS (non-enterprise) 
> drives.
>
> So there will be 14 disks total, 2 of them will be parity, 12 TB space 
> available.
>
> My drives have a BER of 10^14
>
> I am quite scared by my calculations - it appears that if one drive fails, 
> and I do a rebuild, I will perform:
>
> 13*8*10^12 = 104000000000000
>
> reads.  But my BER is smaller:
>
> 10^14 = 100000000000000
>
> So I am (theoretically) guaranteed to lose another drive on raid rebuild.  
> Then the calculation for _that_ rebuild is:
>
> 12*8*10^12 = 96000000000000
>
> So no longer guaranteed, but 96% isn't good.
>
> I have looked all over, and these seem to be the accepted calculations - 
> which means if I ever have to rebuild, I'm toast.
>   

If you were using RAID-5, you might be concerned.  With RAID-6,
or at least raidz2, the remaining parity lets you recover from an
unrecoverable read during the rebuild of one disk.
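
Roughly, in Python, here is what that read count means as a probability
rather than a guarantee (a sketch that assumes the spec'd BER and treats
bit errors as independent; the drive size and count are from your post):

    import math

    BER_BITS = 1e14          # spec: 1 unrecoverable read per 1e14 bits
    DRIVE_BITS = 8e12        # 1 TB drive ~ 8e12 bits
    SURVIVING_DRIVES = 13    # drives read during the first rebuild

    bits_read = SURVIVING_DRIVES * DRIVE_BITS
    expected_ures = bits_read / BER_BITS           # ~1.04, not a certainty

    # Poisson approximation: many independent, tiny-probability events
    p_at_least_one = 1 - math.exp(-expected_ures)  # ~65%

    print(f"expected UREs during rebuild: {expected_ures:.2f}")
    print(f"P(>=1 URE during rebuild)   : {p_at_least_one:.1%}")

An expected value above 1 is not a guarantee; it works out to roughly a
65% chance of hitting at least one unrecoverable read somewhere during
the rebuild.  With single parity that read costs you data; with double
parity there is still one level of redundancy left to reconstruct it.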

> But here is the question - the part I am having trouble understanding:
>
> The 13*8*10^12 operations required for the first rebuild .... isn't that the 
> number for _the entire array_ ?  Any given 1 TB disk only has 10^12 bits on 
> it _total_.  So why would I ever do more than 10^12 operations on the disk ?
>   

Actually, ZFS only resilvers allocated data, not whole disks.  So you
need to multiply those read counts by the space utilization of the
pool, which will usually be less than 100%.
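
To make that concrete, a small sketch (the utilization figures below
are only examples, not numbers from your post):

    BER_BITS = 1e14
    DRIVE_BITS = 8e12
    SURVIVING_DRIVES = 13

    # A resilver only reads allocated blocks, so scale by pool utilization.
    for utilization in (1.0, 0.75, 0.50, 0.25):
        bits_read = SURVIVING_DRIVES * DRIVE_BITS * utilization
        print(f"{utilization:4.0%} full -> "
              f"{bits_read / BER_BITS:.2f} expected UREs")

A pool that is half full reads half as many bits during the resilver,
and the exposure drops accordingly.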

> It seems very odd to me that a raid controller would have to access any given 
> bit more than once to do a rebuild ... and the total number of bits on a 
> drive is 10^12, which is far below the 10^14 BER number.
>
> So I guess my question is - why are we all doing this calculation, wherein we 
> apply the total operations across an entire array rebuild to a single drive's 
> BER number?
>   

You might also be interested in this blog
http://blogs.zdnet.com/storage/?p=162

A couple of things seem to be at work here.  I study failure rates
from field data.  We tend to see unrecoverable read failure rates
at least an order of magnitude better than the specifications.
That is a good thing, but it simply means the specifications are
often sand-bagged; they are a conservative bound, not a guarantee.
However, your intuition is right: if you have a lot of bits of
data, then you need to pay attention to the bit-error rate (BER)
of unrecoverable reads on disks.  This sort of model can be used
to determine a mean time to data loss (MTTDL), as I explain here:
    http://blogs.sun.com/relling/entry/a_story_of_two_mttdl
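
For a rough feel, here is the textbook double-parity approximation in
Python (this is a generic model, not necessarily the exact one in that
post, and the MTBF and resilver-time inputs are made-up illustrations):

    N = 14                   # disks in the group
    MTBF_HOURS = 1.0e6       # assumed per-drive mean time between failures
    MTTR_HOURS = 24.0        # assumed resilver time
    HOURS_PER_YEAR = 24 * 365

    # Double parity loses data only when a third drive fails while two
    # overlapping rebuilds are still in progress.
    mttdl_hours = MTBF_HOURS**3 / (N * (N - 1) * (N - 2) * MTTR_HOURS**2)
    print(f"MTTDL ~= {mttdl_hours / HOURS_PER_YEAR:,.0f} years")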

Perhaps it would help if we changed the math to show the risk
as a function of the amount of data given the protection scheme? 
hmmm.... something like probability of data loss per year for
N TBytes with configuration XYZ.  Would that be more
useful for evaluating configurations?
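
For example, turning an MTTDL estimate (from whichever model) into a
yearly figure is straightforward, assuming failures arrive as a Poisson
process (the MTTDL values below are only illustrative):

    import math

    HOURS_PER_YEAR = 24 * 365

    def annual_loss_probability(mttdl_hours):
        """P(at least one data-loss event in a year)."""
        return 1 - math.exp(-HOURS_PER_YEAR / mttdl_hours)

    for mttdl_years in (10, 100, 1000, 100000):
        p = annual_loss_probability(mttdl_years * HOURS_PER_YEAR)
        print(f"MTTDL {mttdl_years:>7,} years -> P(loss/year) ~= {p:.4%}")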
 -- richard

