On Fri, Jan 07, 2011 at 06:39:51AM -0800, Michael DeMan wrote:
> On Jan 7, 2011, at 6:13 AM, David Magda wrote:
> > The other thing to note is that by default (with de-dupe disabled), ZFS
> > uses Fletcher checksums to prevent data corruption. Add to that the fact
> > that most other file systems don't have any checksums at all, and simply
> > rely on disks having a bit error rate of (at best) 10^-16.
> 
> Agreed - but I think it is still missing the point of what the
> original poster was asking about.
> 
> In all honesty I think the debate is a business decision - the highly
> improbable vs. certainty.

The OP seemed to be concerned that SHA-256 is particularly slow, so the
business decision here would involve a performance vs. error rate
trade-off.
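
To put rough numbers on "the highly improbable vs. certainty", here's a
back-of-envelope sketch -- the pool size, block size, and error rate are
assumptions for illustration, not measurements from anyone's pool -- that
compares the birthday-bound chance of a SHA-256 collision across a pool's
worth of blocks with the expected number of silent bit errors at the
quoted 10^-16 rate:

  # Back-of-envelope only; pool size and block size are assumptions.
  pool_bytes  = 2**50                  # 1 PiB of unique data
  block_bytes = 128 * 1024             # 128 KiB records
  n_blocks    = pool_bytes // block_bytes        # ~8.6e9 blocks

  # Birthday bound: P(any SHA-256 collision) <= n^2 / 2^257
  p_collision = n_blocks**2 / 2.0**257

  # Quoted media bit error rate: ~1e-16 per bit read.  Expected silent
  # bit errors from reading the whole pool once:
  expected_bit_errors = pool_bytes * 8 * 1e-16

  print("P(SHA-256 collision): %e" % p_collision)          # ~3e-58
  print("expected bit errors:  %e" % expected_bit_errors)  # ~0.9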

Now, unless you have highly deduplicatious data, a workload with a high
ARC hit ratio for DDT entries, and a fast ZIL device, I suspect that the
I/O costs of dedup dominate the cost of the hash function.  That means
the above business trade-off is not worthwhile: you would be trading a
tiny uptick in error rates for a small uptick in performance.  Before
you even get to the point of making such a decision you'll want to have
invested in plenty of RAM, L2ARC and fast ZIL capacity -- and for those
who have made that investment I suspect the OP's trade-off won't seem
worthwhile.
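
For a sense of scale on "I/O costs dominate", here's a crude per-block
cost model.  All of the throughput and latency figures below are
assumptions picked for illustration, not measurements:

  block_bytes      = 128 * 1024
  sha256_Bps       = 200e6   # assumed SHA-256 throughput, one core
  fletcher_Bps     = 2e9     # assumed Fletcher4 throughput, one core
  ddt_miss_seconds = 5e-3    # assumed random read for an uncached DDT entry

  t_sha256   = block_bytes / sha256_Bps      # ~0.66 ms per block
  t_fletcher = block_bytes / fletcher_Bps    # ~0.07 ms per block
  saved      = t_sha256 - t_fletcher         # what a cheaper hash buys you

  # One uncached DDT lookup swamps the difference between the hashes:
  print("hash savings as a fraction of one DDT miss: %.0f%%"
        % (100 * saved / ddt_miss_seconds))  # ~12%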

BTW, note that verification isn't guaranteed to have a zero error rate
either.  Imagine a) a block being written whose hash collides with a
different block already in the pool, b) bit rot on disk in that
colliding block such that the on-disk copy now matches the new block,
and c) a mirrored vdev, so that later reads might return one or the
other version of the block at random.  Such an error requires
monumentally bad luck to happen at all.
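
To make (a)-(c) concrete, here's an illustrative sketch of a
verify-style write path -- made-up code, not anything from the actual
ZFS source -- showing that the byte comparison is against whatever one
device happens to return:

  import hashlib

  def dedup_write(new_block, ddt, pool, read_one_mirror_side):
      # ddt maps checksum -> block id; pool is the list of stored blocks.
      key = hashlib.sha256(new_block).digest()
      bid = ddt.get(key)
      if bid is None:                  # no colliding entry: store new block
          pool.append(new_block)
          ddt[key] = len(pool) - 1
          return ddt[key]
      # verify=on: read the existing (colliding) block and compare bytes.
      on_disk = read_one_mirror_side(bid)   # (b): may be the bit-rotted copy
      if on_disk == new_block:
          # Verify passes, yet on a mirror (c) the other side may still
          # hold the original, different data; later reads can return
          # either copy.
          return bid
      pool.append(new_block)           # mismatch: keep the block unique
      return len(pool) - 1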

Nico