On Aug 27, 2008, at 11:17 AM, Richard Elling wrote:
> In my pile of broken parts, I have devices
> which fail to indicate an unrecoverable read, yet do indeed suffer
> from forgetful media.

A long time ago, in a hw company long since dead and buried, I spent  
some months trying to find an intermittent error in the last bits of  
a complicated floating point application. It only occurred when disk  
striping was turned on (but the OS and device codes checked cleanly).  
In the end, it turned out that one of the device vendors had modified  
the specification slightly (by about 1 nanosecond) and the result was  
that the least significant bits were often wrong when we drove the disk  
cage to its max.

Errors were occurring randomly (e.g. during swapping, paging, etc.) but no  
other application noticed. As the error was "within the margin of  
error" a less stubborn analyst might not have made a series of  
federal cases about the non-determinism ;>

My point is that undetected errors happen all the time; that people  
don't notice doesn't mean that they don't happen ...


-- 
Keith H. Bierman   [EMAIL PROTECTED]      | AIM kbiermank
5430 Nassau Circle East                  |
Cherry Hills Village, CO 80113           | 303-997-2749
<speaking for myself*> Copyright 2008



