David Dyer-Bennet wrote:
The more I look at it, the more I think that a second copy on the same disk doesn't protect against very much real-world risk. Am I wrong here? Are partial (small) disk corruptions more common than I think? I don't have a good statistical view of disk failures.
I don't have hard data at hand, but you see entire drives go bad much more often than a single section. When a drive does start to fail, the first sign is usually a block re-allocation notice from the drive firmware. Often you'll see a bunch of those -- sometimes spread over a month, sometimes within a minute -- and then the entire drive goes. In some cases a RAID array will watch for those messages and automagically swap the drive with a hot spare after X notifications. Al Hopper recently posted some more detailed examples of how this can happen.
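To see that reallocation counter yourself, you can ask the drive's SMART data directly. This is a hedged sketch: the device path is an assumption (adjust for your system), and it assumes smartmontools is installed.

```shell
# Query the drive's SMART attributes; /dev/rdsk/c0t0d0 is a placeholder
# device path -- substitute your own (e.g. /dev/sda on Linux).
smartctl -A /dev/rdsk/c0t0d0 | grep -i reallocated

# A rising Reallocated_Sector_Ct raw value over time is the pattern
# described above: a trickle of remapped blocks, then total failure.
```

Watching that counter trend upward (rather than reacting to any single value) is the useful signal, which is exactly what the hot-spare automation described above keys off.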
However, let's move to a different example and say you've got six drives in a raidz pool. What failure modes (this time I used the 'm' instead of the 'n') does copies=2 let your data survive that aren't already taken care of by the underlying RAID configuration of the pool?
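For concreteness, here's a sketch of the two configurations being compared. The pool name and device names are placeholders; the commands themselves (`zpool create` with `raidz`, and the per-dataset `copies` property) are standard ZFS administration.

```shell
# Six-drive raidz pool: parity already tolerates the loss of any one
# whole drive, and checksums detect corrupt blocks.
zpool create tank raidz c0t0d0 c0t1d0 c0t2d0 c0t3d0 c0t4d0 c0t5d0

# Layering ditto blocks on top: each data block in this dataset is
# stored twice, spread across the pool's vdevs where possible.
zfs create tank/home
zfs set copies=2 tank/home
```

The question above is whether the second command buys anything here: raidz parity already reconstructs a bad block from the remaining drives, so the extra copy mainly matters in cases the parity layer can't cover (e.g. multiple simultaneous bad blocks in the same stripe).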
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss