On 01/07/11 02:13 PM, David Magda wrote:
> Given the above: most people are content enough to trust Fletcher not to
> have data corruption, but are worried about SHA-256 giving 'data
> corruption' when it comes to de-dupe? The entire rest of the computing
> world is content to live with 10^-15 (for SAS disks), and yet one
> wouldn't be prepared to have 10^-30 (or better) for dedupe?
I think you do not entirely understand the problem.
Let's say two different blocks A and B have the same SHA-256 checksum. A
is already stored in the pool and B is being written. With dedup enabled
but without verify, B won't be written at all. The next time you ask for
block B you will actually end up with block A. Now, if B is relatively
common in your data set, that one corrupted block has a big impact on
many files (and from the filesystem's point of view it is silent data
corruption). Without dedup, a single silently corrupted block usually
has a relatively limited impact.
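
To make the failure mode concrete, here is a rough toy model in Python.
This is not ZFS code; the class and names are made up purely for
illustration of a dedup table keyed by checksum:

import hashlib

class ToyDedupPool:
    def __init__(self, verify=False):
        self.verify = verify
        self.blocks = {}     # checksum -> stored block contents
        self.refcount = {}   # checksum -> number of logical references

    def write(self, block: bytes) -> str:
        key = hashlib.sha256(block).hexdigest()
        if key in self.blocks:
            if self.verify and self.blocks[key] != block:
                # verify compares the actual bytes and catches the
                # collision; the real thing would then write the block
                # out normally instead of deduplicating it.
                raise RuntimeError("checksum collision caught by verify")
            # Dedup hit: nothing new is written, only a reference is
            # added. If this was a colliding block B, its contents are
            # now lost.
            self.refcount[key] += 1
        else:
            self.blocks[key] = block
            self.refcount[key] = 1
        return key

    def read(self, key: str) -> bytes:
        return self.blocks[key]

pool = ToyDedupPool(verify=False)
ref = pool.write(b"contents of block A")
# If some block B ever produced the same SHA-256 digest as A, the
# write() above would be a dedup "hit": B would never reach disk, and
# every later read of B would silently return A's contents -- for every
# file that references it.
print(pool.read(ref))

With verify enabled the byte-for-byte comparison catches the mismatch
before the reference is taken, at the cost of reading the existing block
back, which is why dedup=verify is the usual suggestion when this risk
is not acceptable.
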
Now what if block B is a meta-data block?
The point is that the potential impact of a hash collision is much
bigger than that of a single silently corrupted block. Not to mention
that, dedup or not, all the other possible causes of data corruption are
still there anyway; adding yet another one might or might not be
acceptable.
--
Robert Milkowski
http://milek.blogspot.com