> Au contraire:  I estimate its worth quite accurately from the undetected 
> error rates reported in the CERN "Data Integrity" paper published last April 
> (first hit if you Google 'cern "data integrity"').
>
> > While I have yet to see any checksum error reported by ZFS on
> > Symmetrix arrays or FC/SAS arrays, with some other "cheap" HW I've
> > seen many of them.
>
> While one can never properly diagnose anecdotal issues off the cuff in a Web 
> forum, given CERN's experience you should probably check your configuration 
> very thoroughly for things like marginal connections:  unless you're dealing 
> with a far larger data set than CERN was, you shouldn't have seen 'many' 
> checksum errors.

Well, single-bit error rates may be rare on hard drives in normal
operation, but from a systems perspective data can be corrupted anywhere
between the disk and the CPU.  I know you're not interested in anecdotal
evidence, but I had a box that was randomly corrupting blocks during
DMA.  The errors showed up during a ZFS scrub and I caught the
problem in time.
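
To make the scrub point concrete, here is a minimal sketch of the idea
(my own illustration, not ZFS code; the toy block store, the SHA-256
checksum choice and the simulated DMA bit-flip are all assumptions):
recompute every block's checksum and compare it with the one recorded
at write time, which exposes corruption that crept in after the write.

    import hashlib
    import random

    # Toy block store: each block is saved together with a checksum
    # computed when the block was written, as a checksumming fs does.
    blocks = {}      # block number -> data
    checksums = {}   # block number -> checksum recorded at write time

    def write_block(blkno, data):
        blocks[blkno] = data
        checksums[blkno] = hashlib.sha256(data).digest()

    def scrub():
        """Recompute every block's checksum and report mismatches,
        the way a scrub walks the whole pool."""
        bad = []
        for blkno, data in blocks.items():
            if hashlib.sha256(data).digest() != checksums[blkno]:
                bad.append(blkno)
        return bad

    # Write some blocks, then simulate a flaky DMA path flipping a
    # single bit in one block after its checksum was recorded.
    for i in range(8):
        write_block(i, bytes(random.getrandbits(8) for _ in range(512)))

    corrupted = bytearray(blocks[3])
    corrupted[100] ^= 0x01      # one flipped bit, invisible to the disk
    blocks[3] = bytes(corrupted)

    print("blocks with checksum errors:", scrub())   # -> [3]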

Without a checksummed filesystem, I would most likely only have
discovered the problem when an important fs metadata block was
corrupted, by which point there would already have been serious silent
damage to user data.  Checksumming is a safety belt that confirms the
system is working as designed, and lets user processes know that the
data they put down is the same as the data they get back, which can
only be a good thing.
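
The safety belt works on the read path too.  A rough sketch of that
side of the idea (again my own illustration, not ZFS internals): a read
verifies the stored checksum before handing data back, so the caller
gets an error, or a good copy from a redundant mirror, rather than
silently consuming corrupted bytes.

    import hashlib

    class ChecksumError(IOError):
        """Raised instead of returning data that fails verification."""
        pass

    def read_block(blkno, copies, recorded_checksum):
        """Return the first copy of a block whose checksum matches the
        one recorded at write time; try redundant copies (e.g. the
        other side of a mirror) before giving up."""
        for data in copies:
            if hashlib.sha256(data).digest() == recorded_checksum:
                return data
        raise ChecksumError(f"block {blkno}: all copies failed verification")

    # Usage: one mirror side corrupted in transit, the other still good.
    good = b"important user data" + bytes(493)
    bad = bytearray(good); bad[5] ^= 0x40
    cksum = hashlib.sha256(good).digest()

    print(read_block(7, [bytes(bad), good], cksum) == good)  # True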

Like others have said for big business: as a consumer I can reasonably
comfortably buy cheap off-the-shelf controllers and disks, knowing
that should any part of the system be flaky enough to cause data
corruption, the software layer will catch it, which both saves money
and gives peace of mind.

James
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss