Haudy Kazemi wrote:
Daniel Carosone wrote:
Sorry, don't have a thread reference
to hand just now.

http://www.opensolaris.org/jive/thread.jspa?threadID=100296

Note that there's little empirical evidence that this is directly applicable to 
the kinds of errors (single bit, or otherwise) that a single failing disk 
medium would produce.  Modern disks already include and rely on a lot of ECC as 
part of ordinary operation, below the level usually seen by the host.  These 
mechanisms seem unlikely to return a read with just one (or a few) bit errors.

If implemented, this strikes me as potentially more applicable to errors 
introduced from other sources (controller/bus transfer errors, non-ECC memory, 
a weak power supply, etc.).  Still handy.

Adding additional data protection options is commendable. On the other hand, I feel there are important gaps in the existing feature set that are worthy of higher priority, not least the automatic recovery from uberblock / transaction group problems (see Victor Latushkin's recovery technique, which I linked to in a recent post),

This does not seem to be a widespread problem.  We do see the
occasional complaint on this forum, but considering the substantial
number of ZFS implementations in existence today, the rate seems
to be quite low.  In other words, the impact does not seem to be high.
Perhaps someone at Sun could comment on the call rate for such
conditions?

followed closely by a zpool shrink or zpool remove command that lets you resize pools and disconnect devices without replacing them. I saw postings or blog entries from about six months ago saying that this code was 'near', as part of solving a resilvering bug, but have not seen anything since. I think many users would like to see improved resilience in the existing features, and the addition of frequently and long-requested features, before other new features are added. (Exceptions can readily be made for new features that are trivially easy to implement and/or do not compete for developer time with higher-priority features.)

In the meantime, there is the copies property, which you can use on single disks. With today's immense drives, even losing half the capacity to copies isn't as traumatic for many people as it was in days gone by (e.g., consider a 500 GB hard drive with copies=2 versus a 128 GB SSD). Of course, if you need all that space, then it is a no-go.

Space, performance, dependability: you can pick any two.


Related threads that also had ideas on using spare CPU cycles for brute-force recovery of single-bit errors using the checksum:

There is no evidence that the type of unrecoverable read errors we
see are single-bit errors.  And while it is possible for an error-correcting
code to correct single-bit flips, multiple-bit flips would remain a
large problem space.  There are codes which can correct multiple
flips, but they quickly become expensive.  This is one reason why nobody
does RAID-2.
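The brute-force idea under discussion, and why it scales badly, can be sketched in a few lines: flip each candidate set of bits and re-verify against the block's stored checksum, at a cost of C(n, k) checksum computations for k flips over n bits. This is a minimal illustration only, using SHA-256 as a stand-in for ZFS's fletcher/sha256 block checksums; it is not ZFS code, and the function names are invented for the example.

```python
import hashlib
from itertools import combinations

def checksum(data: bytes) -> bytes:
    # Stand-in for the block checksum ZFS already stores (e.g. fletcher4, sha256).
    return hashlib.sha256(data).digest()

def brute_force_repair(block: bytes, expected: bytes, max_flips: int = 1):
    """Try to recover a block whose checksum no longer matches by
    flipping up to max_flips bits and re-verifying.  The search costs
    C(n, k) checksum computations for k flips over n bits, which is
    why multi-bit correction quickly becomes impractical."""
    if checksum(block) == expected:
        return block                      # nothing to repair
    nbits = len(block) * 8
    buf = bytearray(block)
    for k in range(1, max_flips + 1):
        for bits in combinations(range(nbits), k):
            for b in bits:                # apply this candidate set of flips
                buf[b // 8] ^= 1 << (b % 8)
            if checksum(bytes(buf)) == expected:
                return bytes(buf)         # checksum matches: recovered
            for b in bits:                # undo the flips before the next try
                buf[b // 8] ^= 1 << (b % 8)
    return None                           # not recoverable within max_flips
```

For a 128 KB record (1,048,576 bits), a single-bit search needs about a million checksum computations, but a two-bit search needs roughly 5 x 10^11, which illustrates the expense argument above.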

BTW, if you do have the case where unprotected data is not
readable, then I have a little DTrace script that I'd like you to run
which would help determine the extent of the corruption.  This is
one of those studies which doesn't like induced errors ;-)
http://www.richardelling.com/Home/scripts-and-programs-1/zcksummon

The data we do have suggests that magnetic hard disk failures tend
to be spatially clustered. So there is still the problem of spatial diversity,
which is rather nicely handled by copies today.
-- richard

_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
