On Aug 27, 2008, at 11:17 AM, Richard Elling wrote:

> In my pile of broken parts, I have devices
> which fail to indicate an unrecoverable read, yet do indeed suffer
> from forgetful media.

A long time ago, in a hw company long since dead and buried, I spent some months trying to find an intermittent error in the last bits of a complicated floating point application. It only occurred when disk striping was turned on (but the OS and device codes checked cleanly).

In the end, it turned out that one of the device vendors had modified the specification slightly (by about 1 nanosecond), and the result was that the least significant bits were often wrong when we drove the disk cage to its max. Errors were occurring randomly (e.g. during swapping, paging, etc.) but no other application noticed. As the error was "within the margin of error", a less stubborn analyst might not have made a series of federal cases about the non-determinism ;>

My point is that undetected errors happen all the time; that people don't notice doesn't mean that they don't happen ...

-- 
Keith H. Bierman [EMAIL PROTECTED] | AIM kbiermank
5430 Nassau Circle East |
Cherry Hills Village, CO 80113 | 303-997-2749
<speaking for myself*> Copyright 2008