Miles Nordin wrote:
>>>>>> "re" == Richard Elling <[EMAIL PROTECTED]> writes:
>>>>>>
>
> re> unrecoverable read as the dominant disk failure mode. [...]
> re> none of the traditional software logical volume managers nor
> re> the popular open source file systems (other than ZFS :-)
> re> address this problem.
>
> Other LVM's should address unrecoverable read errors as well or better
> than ZFS, because that's when the drive returns an error instead of
> data.
ZFS handles that case as well.

> Doing a good job with this error is mostly about not freezing
> the whole filesystem for the 30sec it takes the drive to report the
> error.

That is not a ZFS problem.  Please file bugs in the appropriate category.

> Either the drives should be loaded with special firmware that
> returns errors earlier, or the software LVM should read redundant data
> and collect the statistic if the drive is well outside its usual
> response latency.

ZFS will handle this case as well.

> I would expect all the software volume managers
> including ZFS fail to do this.  It's really hard to test without
> somehow getting a drive that returns read errors frequently, but isn't
> about to die within the month---maybe ZFS should have an error
> injector at driver-level instead of block-level, and a model for
> time-based errors.

See ztest.  Project COMSTAR creates an opportunity for better testing in
an open-source way.  However, it only works for the SCSI protocol and
therefore does not provide coverage for IDE devices -- which is not a
long-term issue.

> One thing other LVM's seem like they may do better
> than ZFS, based on not-quite-the-same-scenario tests, is not freeze
> filesystems unrelated to the failing drive during the 30 seconds it's
> waiting for the I/O request to return an error.

That wait is not happening in ZFS code.

> In terms of FUD about ``silent corruption'', there is none of it when
> the drive clearly reports a sector is unreadable.  Yes, traditional
> non-big-storage-vendor RAID5, and all software LVM's I know of except
> ZFS, depend on the drives to report unreadable sectors.  And,
> generally, drives do. so let's be clear about that and not try to imply
> that the ``dominant failure mode'' causes silent corruption for
> everyone except ZFS and Netapp users---it doesn't.

In my field data, the dominant failure mode for disks is unrecoverable
reads.  If your software does not handle this case, then you should be
worried.  We tend to recommend configuring ZFS to manage data redundancy
for this reason.  (A rough sketch of the idea appears at the end of this
message.)

> The Netapp paper focused on when drives silently return incorrect
> data, which is different than returning an error.  Both Netapp and ZFS
> do checksums to protect from this.  However Netapp never claimed this
> failure mode was more common than reported unrecoverable read errors,
> just that it was more interesting.  I expect it's much *less* common.

I would love for you to produce data to that effect.

> Further, we know Netapp loaded special firmware into the enterprise
> drives in that study because they wanted the larger sector size.  They
> are likely also loading special firmware into the desktop drives to
> make them return errors sooner than 30 seconds. so, it's not
> improbable that the Netapp drives are more prone to deliver silently
> corrupt data instead of UNC/seek errors compared to off-the-shelf
> drives.

I am not sure of the basis of your assertion.  Can you explain in more
detail?

> Finally, for the Google paper, silent corruption ``didn't even make
> the chart.''  so, saying something didn't make your chart and saying
> that it doesn't happen are two different things, and your favoured
> conclusion has a stake in maintaining that view, too.

The Google paper[1] didn't deal with silent errors or corruption at all.
Section 2 describes in nice detail how they decided when a drive had
failed -- it was replaced.  They also cite disk vendors who test "failed"
drives and find that, many times, the drives test clean (what they call
"no problem found").  This is not surprising because it is unlikely that
data corruption is detected in the systems under study.
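
To make the distinction concrete, here is a minimal, hypothetical sketch
(Python, purely for illustration -- it is not ZFS code, and every name in
it, such as read_with_redundancy and UnrecoverableRead, is invented) of
how a checksumming layer with a redundant copy can handle both failure
modes discussed above: a drive that reports an unrecoverable read, and a
drive that silently returns the wrong data.

# Illustrative sketch only -- not ZFS source.  It models a mirrored block
# as two independently readable copies plus a stored checksum, and shows
# how both a reported unrecoverable read and silently corrupted data are
# detected and repaired from the redundant copy.
import hashlib


class UnrecoverableRead(Exception):
    """Simulated drive reported a medium error instead of returning data."""


def checksum(data):
    # Stand-in for a strong block checksum (ZFS uses fletcher/SHA-256 family).
    return hashlib.sha256(data).digest()


def repair_copy(idx, data):
    print("rewriting copy %d with verified data" % idx)


def read_with_redundancy(copies, stored_cksum):
    """Try each copy in turn; accept only data whose checksum matches."""
    good = None
    bad = []
    for idx, read_copy in enumerate(copies):
        try:
            data = read_copy()
        except UnrecoverableRead:
            bad.append(idx)                 # failure mode 1: drive reports it
            continue
        if checksum(data) != stored_cksum:
            bad.append(idx)                 # failure mode 2: silent corruption
            continue
        good = data
        break
    if good is None:
        raise IOError("no verifiable copy of the block remains")
    for idx in bad:
        repair_copy(idx, good)              # rewrite the bad side from good data
    return good


if __name__ == "__main__":
    block = b"important data"
    cksum = checksum(block)

    def healthy():
        return block

    def reports_error():
        raise UnrecoverableRead()           # e.g. an uncorrectable sector (UNC)

    def silently_corrupt():
        return b"importent data"            # wrong bytes, no error reported

    print(read_with_redundancy([reports_error, healthy], cksum))
    print(read_with_redundancy([silently_corrupt, healthy], cksum))

The point of the sketch is only that the stored checksum is what turns
silent corruption into a detected, repairable error; a layer that relies
solely on the drive reporting its own failures covers the first case but
not the second.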

[1] http://www.cs.cmu.edu/~bianca/fast07.pdf

 -- richard

_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss