On Fri, Jan 07, 2011 at 06:39:51AM -0800, Michael DeMan wrote:
> On Jan 7, 2011, at 6:13 AM, David Magda wrote:
> > The other thing to note is that by default (with de-dupe disabled), ZFS
> > uses Fletcher checksums to prevent data corruption. Add also the fact all
> > other file systems don't have any checksums, and simply rely on the fact
> > that disks have a bit error rate of (at best) 10^-16.
>
> Agreed - but I think it is still missing the point of what the
> original poster was asking about.
>
> In all honesty I think the debate is a business decision - the highly
> improbable vs. certainty.
The OP seemed to be concerned that SHA-256 is particularly slow, so the
business decision here would involve a performance vs. error rate
trade-off.

Now, unless you have highly deduplicatious data, a workload with a high
ARC cache hit ratio for DDT entries, and a fast ZIL device, I suspect
that the I/O costs of dedup dominate the cost of the hash function.
That makes the above business trade-off not worthwhile: one would be
trading a tiny uptick in error rates for a small uptick in performance.
Before you even get to the point of making such a decision you'll want
to have invested in plenty of RAM, L2ARC, and fast ZIL device capacity
-- and for those who have made that investment I suspect the OP's
trade-off won't seem worthwhile. (A rough benchmark sketch follows
below.)

BTW, note that verification isn't guaranteed to have a zero error rate
either. Imagine: a) a block being written collides (same checksum) with
a different block already in the pool; b) bit rot in that colliding
block such that its on-disk copy now matches the new block byte for
byte -- and, because of the collision, still passes the checksum; and
c) all of this on a mirrored vdev, so the two sides of the mirror
disagree and readers get one version of the block or the other, at
random. Such an error requires monumentally bad luck to happen at all.
(See the second sketch below.)
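For the curious, here's a quick way to get a feel for the raw hashing
cost. This is only a rough Python sketch, using zlib's adler32 as a
stand-in for fletcher4 (both are simple running-sum checksums); the
real ZFS code is C, so treat the numbers as illustrative only:

    # Back-of-the-envelope throughput comparison: a strong hash
    # (SHA-256) vs. a simple sum-based checksum (adler32, standing
    # in here for fletcher4). Illustrative only; not the ZFS code path.
    import hashlib
    import time
    import zlib

    BLOCK = b"\xab" * (128 * 1024)  # one 128 KiB record (a common ZFS recordsize)
    ROUNDS = 2000                   # ~250 MiB checksummed per function

    def mib_per_sec(fn):
        start = time.perf_counter()
        for _ in range(ROUNDS):
            fn(BLOCK)
        elapsed = time.perf_counter() - start
        return (len(BLOCK) * ROUNDS / (1 << 20)) / elapsed

    print("sha256 : %8.1f MiB/s" % mib_per_sec(lambda b: hashlib.sha256(b).digest()))
    print("adler32: %8.1f MiB/s" % mib_per_sec(lambda b: zlib.adler32(b)))

Even if SHA-256 comes out several times slower per byte, hashing one
128 KiB record still costs a small fraction of a millisecond, whereas
a DDT lookup that misses the ARC costs a random read -- which is the
point above about I/O dominating.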
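And to make the verify failure mode concrete, here is the general
shape of a dedup-with-verify write path -- hypothetical names, not the
actual ZFS implementation:

    # Hypothetical sketch of a dedup-with-verify write path. The DDT
    # maps checksum -> address of an existing block; read_block()
    # returns bytes from *one* side of a mirror, chosen by the I/O
    # layer. None of these names come from the real ZFS code.
    import hashlib

    def dedup_write(new_block, ddt, read_block, write_block):
        cksum = hashlib.sha256(new_block).digest()
        addr = ddt.get(cksum)
        if addr is not None:
            # verify: re-read the matching block and byte-compare.
            if read_block(addr) == new_block:
                return addr  # dedup hit; reference the existing block
            # byte mismatch: a genuine collision was caught, so fall
            # through and write a distinct copy.
        addr = write_block(new_block)
        ddt[cksum] = addr
        return addr

In the scenario above, the rotted mirror copy has become byte-identical
to new_block (and, because of the collision, still carries a valid
checksum), so the byte-compare passes and the writer ends up referencing
a block whose two mirror copies disagree.

Nico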