User Name wrote:
> I am building a 14 disk raid 6 array with 1 TB seagate AS (non-enterprise)
> drives.
>
> So there will be 14 disks total, 2 of them will be parity, 12 TB space
> available.
>
> My drives have a BER of 10^14
>
> I am quite scared by my calculations - it appears that if one drive fails,
> and I do a rebuild, I will perform:
>
> 13*8*10^12 = 104000000000000
>
> reads. But my BER is smaller:
>
> 10^14 = 100000000000000
>
> So I am (theoretically) guaranteed to lose another drive on raid rebuild.
> Then the calculation for _that_ rebuild is:
>
> 12*8*10^12 = 96000000000000
>
> So no longer guaranteed, but 96% isn't good.
>
> I have looked all over, and these seem to be the accepted calculations -
> which means if I ever have to rebuild, I'm toast.
>
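One note on the arithmetic above: reading more bits than the BER figure does
not make an error certain. Under the naive spec-sheet model, where each bit
read fails independently with probability 1/BER, the chance of at least one
unrecoverable read over b bits is 1 - (1 - 1/BER)^b, roughly 1 - exp(-b/BER).
A minimal Python sketch using only the numbers quoted above (the independence
model itself is the assumption here):

import math

def p_at_least_one_ure(bits_read, ber):
    # Chance of at least one error when each bit read fails independently
    # with probability 1/ber; uses the exponential approximation of
    # 1 - (1 - 1/ber)**bits_read.
    return 1.0 - math.exp(-bits_read / ber)

BER = 1e14                       # 1 error per 10^14 bits read (spec sheet)
first_rebuild = 13 * 8 * 1e12    # bits read rebuilding the first disk
second_rebuild = 12 * 8 * 1e12   # bits read rebuilding the next one

print("first rebuild : %.0f%%" % (100 * p_at_least_one_ure(first_rebuild, BER)))
print("second rebuild: %.0f%%" % (100 * p_at_least_one_ure(second_rebuild, BER)))

This prints roughly 65% and 62% -- unpleasant odds for a single-parity
rebuild, but not the certainty that "guaranteed" suggests, and the spec BER
tends to be conservative, as noted below.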
If you were using RAID-5, you might be concerned. For RAID-6, or at least
raidz2, you could recover an unrecoverable read during the rebuild of one
disk.

> But here is the question - the part I am having trouble understanding:
>
> The 13*8*10^12 operations required for the first rebuild .... isn't that
> the number for _the entire array_? Any given 1 TB disk only has 10^12 bits
> on it _total_. So why would I ever do more than 10^12 operations on the
> disk?
>
Actually, ZFS only rebuilds the data. So you need to multiply by the space
utilization of the pool, which will usually be less than 100%.

> It seems very odd to me that a raid controller would have to access any
> given bit more than once to do a rebuild ... and the total number of bits
> on a drive is 10^12, which is far below the 10^14 BER number.
>
> So I guess my question is - why are we all doing this calculation, wherein
> we apply the total operations across an entire array rebuild to a single
> drive's BER number?
>
You might also be interested in this blog:
http://blogs.zdnet.com/storage/?p=162

A couple of things seem to be at work here. I study field data failure
rates. We tend to see unrecoverable read failure rates at least an order of
magnitude better than the specifications. This is a good thing, but it
simply points out that the specifications are often sand-bagged -- they are
not a guarantee.

However, you are quite right in your intuition that if you have a lot of
bits of data, then you need to pay attention to the bit-error rate (BER) of
unrecoverable reads on disks. This sort of model can be used to determine a
mean time to data loss (MTTDL), as I explain here:
http://blogs.sun.com/relling/entry/a_story_of_two_mttdl

Perhaps it would help if we changed the math to show the risk as a function
of the amount of data, given the protection scheme? hmmm.... something like
the probability of data loss per year for N TBytes with configuration XYZ.
Would that be more useful for evaluating configurations?
 -- richard
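For a concrete feel of what such a per-year figure could look like, here is
a minimal sketch under simple assumptions: exponential (Poisson) failure
arrivals, the classic triple-failure MTTDL approximation for a double-parity
group, and placeholder values for MTBF and resilver time that do not come
from this thread. It ignores the unrecoverable-read path entirely, so it is
illustrative only, not the model from the blog entries above.

import math

HOURS_PER_YEAR = 24 * 365

def mttdl_double_parity(n_disks, mtbf_hours, mttr_hours):
    # Classic approximation for a double-parity group: data is lost only
    # when a third disk fails while two earlier failures are still being
    # repaired.  Ignores unrecoverable reads during the resilver.
    return mtbf_hours**3 / (n_disks * (n_disks - 1) * (n_disks - 2) * mttr_hours**2)

def p_loss_per_year(mttdl_hours):
    # Probability of at least one data-loss event in a year, assuming loss
    # events arrive as a Poisson process with rate 1/MTTDL.
    return 1.0 - math.exp(-HOURS_PER_YEAR / mttdl_hours)

# Placeholder inputs: 14 disks, 500,000 hour MTBF, 24 hour resilver window.
mttdl = mttdl_double_parity(14, 500000.0, 24.0)
print("MTTDL            ~ %.1e years" % (mttdl / HOURS_PER_YEAR))
print("P(loss per year) ~ %.1e" % p_loss_per_year(mttdl))

With these placeholder numbers the triple-failure path alone is a tiny
annual risk; a fuller comparison would also fold in the unrecoverable-read
path discussed above, which is the part the original poster is worried
about, and would scale with the amount of data actually stored in the pool.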