On 01/07/11 02:13 PM, David Magda wrote:

> Given the above: most people are content enough to trust Fletcher to not
> have data corruption, but are worried about SHA-256 giving 'data
> corruption' when it comes to dedupe? The entire rest of the computing
> world is content to live with 10^-15 (for SAS disks), and yet one
> wouldn't be prepared to have 10^-30 (or better) for dedupe?


I think you do not entirely understand the problem.
Let's say two different blocks, A and B, have the same SHA-256 checksum. A is already stored in the pool and B is being written. With dedup enabled but verify disabled, B won't be written at all; the next time you ask for block B, you will actually get block A. Now, if B is relatively common in your data set, that one colliding block has a relatively big impact on many files (and from the filesystem's point of view this is silent data corruption). Without dedup, a single silently corrupted block usually has a relatively limited impact.
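
To make the failure mode concrete, here's a minimal sketch of that write path (hypothetical Python, not actual ZFS code; the real DDT is far more involved):

    import hashlib

    # Hypothetical dedup table: hash -> stored data. Stands in for
    # the ZFS DDT purely to illustrate the logic.
    dedup_table = {}

    def write_block(data, verify=False):
        h = hashlib.sha256(data).digest()
        existing = dedup_table.get(h)
        if existing is None:
            dedup_table[h] = data   # first copy: store it
        elif verify and existing != data:
            # verify=on: a byte comparison catches the collision; real
            # code would write the new block out separately instead.
            raise RuntimeError("hash collision caught by verify")
        # verify=off (or a genuine duplicate): the hash match alone is
        # trusted, so a colliding block B is silently aliased to A.
        return h

    def read_block(h):
        return dedup_table[h]

With verify off, write_block(B) returns A's handle whenever the hashes collide, and every subsequent read_block() hands back A's contents in B's place.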

Now what if block B is a meta-data block?

The point is that the potential impact of a hash collision is much bigger than that of a single silently corrupted block. Not to mention that, dedup or not, all the other possible causes of data corruption are there anyway; adding yet another one might or might not be acceptable.
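
For scale, the quoted 10^-30 figure is birthday-bound territory; a rough back-of-the-envelope (assuming SHA-256 behaves as a uniform 256-bit hash, and using an illustrative pool size I picked here) looks like this:

    # Birthday bound: for n distinct blocks and a uniform 256-bit
    # hash, P(at least one collision) <= n*(n-1) / 2^257.
    n = 2**38                  # ~2.7e11 blocks, i.e. 32 PiB of 128 KiB blocks
    p = n * (n - 1) / 2.0**257
    print(p)                   # ~3.2e-55

Even at odds like these, what differs with dedup is the blast radius of the one failure, not just its likelihood.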


--
Robert Milkowski
http://milek.blogspot.com


