Bob Friesenhahn wrote:
> On Mon, 3 Mar 2008, Darren J Moffat wrote:
>
>>> I'm not convinced that single bit flips are the common
>>> failure mode for disks.  Most enterprise class disks already
>>> have enough ECC to correct at least 8 bytes per block.
>>>
>> and for consumer rather than enterprise class disks ?
>>
>
> You are assuming that the ECC used for "consumer" disks is
> substantially different than that used for "enterprise" disks.  That
> is likely not the case since ECC is provided by a chip which costs a
> few dollars.  The only reason to use a lesser grade algorithm would be
> to save a small bit of storage space.
>
> Consumer disks use essentially the same media as enterprise disks.
>
> Consumer disks store a higher bit density on similar media.
>
> Consumer disks have less precise/consistent head controllers than
> enterprise disks.
>
> Consumer disks are less well-specified than enterprise disks.
>
> Due to the higher bit density we can expect more wrong bits to be read
> since we are pushing the media harder.  Due to less consistent head
> controllers we can expect more incidences of reading or writing the
> wrong track or writing something which can't be read.  Consumer disks
> are often used in an environment where they may be physically
> disturbed while they are writing or reading the data.  Enterprise
> disks are usually used in very stable environments.
>
> The upshot of this is that we can expect more unrecoverable errors,
> but it seems unlikely that there will be more "single bit" errors
> recoverable at the ZFS level.
>

I agree, and am waiting to get the proceedings from FAST08, which has
some interesting papers in the list.  A while back I blogged about an
Adaptec online seminar which addressed this topic.  Rather than
repeating what they said, I left a pointer and a recommendation.
http://blogs.sun.com/relling/entry/adaptec_webinar_on_disks_and

Also, note that the published reliability data from disk vendors is
constantly changing.  For laptop drives, we're seeing fewer MTBF or UER
specs and more head-landing specs.  It seems that an important failure
mode for laptop disks is wear-out at the landing site.  This is due to
power management powering down or spinning down the disk.  We don't
tend to see this failure mode in servers or RAID arrays.
 -- richard

_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss