> Au contraire: I estimate its worth quite accurately from the undetected
> error rates reported in the CERN "Data Integrity" paper published last April
> (first hit if you Google 'cern "data integrity"').
>
> > While I have yet to see any checksum error reported by ZFS on
> > Symmetrix arrays or FC/SAS arrays with some other "cheap" HW I've seen
> > many of them
>
> While one can never properly diagnose anecdotal issues off the cuff in a Web
> forum, given CERN's experience you should probably check your configuration
> very thoroughly for things like marginal connections: unless you're dealing
> with a far larger data set than CERN was, you shouldn't have seen 'many'
> checksum errors.
Well, single-bit error rates may be rare for hard drives in normal operation, but from a systems perspective data can be corrupted anywhere between the disk and the CPU. I know you're not interested in anecdotal evidence, but I had a box that was randomly corrupting blocks during DMA. The errors showed up during a ZFS scrub, and I caught the problem in time. Without a checksummed filesystem I would most likely only have discovered the problem when an important filesystem metadata block was corrupted, by which point there would already have been serious silent damage to user data.

Checksumming is a safety belt: it confirms that the system is working as designed, and it lets user processes know that the data they get back is the same as the data they put down, which can only be a good thing. As others have said, that matters for big business; as a consumer, it means I can quite comfortably buy cheap off-the-shelf controllers and disks, knowing that if any part of the system is flaky enough to corrupt data, the software layer will catch it. That saves money and buys peace of mind.

James
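P.S. To put the "safety belt" idea in code rather than prose: the little Python sketch below is purely illustrative (the block and helper names are made up, and it is nothing like how ZFS implements checksums internally), but it shows the principle that a checksum recorded at write time and verified at read time turns silent corruption, wherever it happens on the path, into a detected error.

    # Toy illustration of end-to-end checksumming: not how ZFS does it
    # internally, just the principle that a checksum computed at write
    # time and verified at read time catches corruption anywhere in
    # between (disk, cable, HBA, DMA, RAM).

    import hashlib
    import os


    class ChecksumError(Exception):
        """Raised when stored data no longer matches its recorded checksum."""


    def write_block(path, data):
        # Store the block alongside a SHA-256 digest of its contents.
        digest = hashlib.sha256(data).hexdigest()
        with open(path, "wb") as f:
            f.write(data)
        with open(path + ".sha256", "w") as f:
            f.write(digest)


    def read_block(path):
        # Re-read the block and refuse to return silently corrupted data.
        with open(path, "rb") as f:
            data = f.read()
        with open(path + ".sha256") as f:
            expected = f.read().strip()
        actual = hashlib.sha256(data).hexdigest()
        if actual != expected:
            raise ChecksumError("checksum mismatch on %s" % path)
        return data


    if __name__ == "__main__":
        write_block("block0", os.urandom(4096))
        data = read_block("block0")        # normal case: data comes back verified
        # Flip one bit behind the filesystem's back to simulate silent corruption.
        corrupted = bytearray(data)
        corrupted[0] ^= 0x01
        with open("block0", "wb") as f:
            f.write(bytes(corrupted))
        try:
            read_block("block0")           # the mismatch is detected, not silent
        except ChecksumError as e:
            print("caught:", e)

ZFS does this verification on every read and again for every block during a scrub, which is why 'zpool status -v' shows non-zero CKSUM counts after a scrub when something on the path is flaky.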