more below...

David Dyer-Bennet wrote:
On 9/18/06, Richard Elling - PAE <[EMAIL PROTECTED]> wrote:
[apologies for being away from my data last week]

David Dyer-Bennet wrote:
> The more I look at it the more I think that a second copy on the same
> disk doesn't protect against very much real-world risk.  Am I wrong
> here?  Are partial(small) disk corruptions more common than I think?
> I don't have a good statistical view of disk failures.

This question was asked many times in this thread.  IMHO, it is the
single biggest reason we should implement ditto blocks for data.

We did a study of disk failures in an enterprise RAID array a few
years ago.  One failure mode stands heads and shoulders above the
others: non-recoverable reads.  A short summary:

   2,919          total errors reported
   1,926 (66.0%)  operations succeeded (e.g. write failed, auto reallocated)
     961 (32.9%)  unrecovered errors (of all types)
      32  (1.1%)  other (e.g. device not ready)
     707 (24.2%)  non-recoverable reads (a subset of the unrecovered errors)

In other words, non-recoverable reads represent 73.6% (707 of 961) of all
non-recoverable failures, a category that includes complete drive failures.
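
For anyone checking the arithmetic, here is a quick Python sketch (mine, not
part of the study) that reproduces the percentages from the counts above;
the 73.6% figure is 707 out of the 961 unrecovered errors:

# Reproduce the percentages quoted above from the raw error counts.
counts = {
    "operations succeeded": 1926,
    "unrecovered errors":    961,
    "other":                  32,
}
total = sum(counts.values())        # 2,919 total errors reported
nonrecoverable_reads = 707          # a subset of the unrecovered errors

for name, n in counts.items():
    print("%-22s %5d  %5.1f%% of total" % (name, n, 100.0 * n / total))

print("%-22s %5d  %5.1f%% of total, %5.1f%% of unrecovered errors"
      % ("non-recoverable reads", nonrecoverable_reads,
         100.0 * nonrecoverable_reads / total,
         100.0 * nonrecoverable_reads / counts["unrecovered errors"]))

# The remaining 961 - 707 = 254 unrecovered errors (26.4%) are the
# "other non-recoverable errors" discussed below.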

I don't see anything here that distinguishes complete drive failures from
block-level failures.  Is there some way to read that out of this data?

Complete failures are a non-zero category, but there is more than one
error code which would result in the recommendation to replace the drive.
Their counts are included in the 961-707=254 (26.4%) of other non-
recoverable errors.  In some cases a non-recoverable error can be
corrected by a retry, and those also fall into the 26.4% bucket.

Interestingly, an operation may succeed and yet still return an error that
recommends replacing the drive, for example when the failure prediction
threshold has been exceeded.  You might also want to replace the drive when
there are no spare defect sectors available.  Life would be easier if drives
really did simply die.

I'm thinking the "operations succeeded" category also covers read errors
recovered by retries and such, as well as the write failure cited as an
example?

Yes.

I guess I can conclude from the 66% of errors successfully recovered that a
lot of errors are not, in fact, entire-drive failures.  That's good news for
ditto data: at most 34% can be whole-drive failures, and in reality I'm sure
the real figure is far lower.

I agree.

Anyway, facts about actual real-world failures are *definitely* the right
basis for conducting this discussion!

[snip]

While it is true that I could slice my disk up into multiple vdevs and
mirror them, I'd much rather set a policy at a finer granularity: my
files are more important than most of the other, mostly read-only and
easily reconstructed, files on my system.

I definitely like the idea of setting policy at a finer granularity; I
really want it to be at the file level.  Even per-directory doesn't fit
reality very well, in my view.

When ditto blocks for metadata were introduced, I took a look at the
code and was pleasantly surprised.  The code does an admirable job of
ensuring spatial diversity in the face of multiple policies, even in
the single-disk case.  IMHO, this is the right way to implement this,
and it allows you to mix policies with ease.
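
To make the spatial diversity idea concrete, here is a toy sketch in Python.
It is not the actual ZFS allocator (the disk size and the one-eighth-of-the-
disk separation target are made-up numbers); it only illustrates placing the
second copy of a block far enough from the first that a localized media
defect is unlikely to hit both:

# Toy model of spatial diversity for ditto copies on a single disk.
# This is NOT the ZFS allocator; the disk size and separation target
# below are hypothetical values chosen only to illustrate the idea.

import random

DISK_SIZE = 500 * 10**9          # hypothetical 500 GB disk, in bytes
MIN_SEPARATION = DISK_SIZE // 8  # hypothetical spread target

def place_ditto_copies(block_size):
    """Pick byte offsets for two copies of one block, kept far apart."""
    first = random.randrange(0, DISK_SIZE - block_size)
    while True:
        second = random.randrange(0, DISK_SIZE - block_size)
        if abs(second - first) >= MIN_SEPARATION:
            return first, second

a, b = place_ditto_copies(128 * 1024)
print("copy 1 at byte %d, copy 2 at byte %d, %d bytes apart" % (a, b, abs(a - b)))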

That's very good to hear.

 -- richard
