2012-01-13 5:34, Daniel Carosone wrote:
On Fri, Jan 13, 2012 at 05:16:36AM +0400, Jim Klimov wrote:
Either I misunderstand some of the above, or I fail to
see how verification would eliminate this failure mode
(namely, as per my suggestion, replace the bad block
with a good one and have all references updated and
block-chains -> files fixed with one shot).
It doesn't update past data.
It gets treated as if there were a hash collision, and the new data is
really different despite having the same checksum, and so gets written
out instead of incrementing the existing DDT pointer. So it restores
your ability to recover the primary filesystem by overwriting with the
same data, which dedup was previously defeating.
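
The write path described above can be sketched with a toy model. This is
only an illustration under loud assumptions: ToyPool, its fields, and
rewrite_after_corruption are hypothetical names, the "disk" is a Python
dict, and real ZFS internals are far more involved. It only shows why an
unverified checksum match leaves the damaged block in place, while a
verified mismatch causes a fresh copy to be written.

```python
import hashlib


class ToyPool:
    """Toy model of a dedup table (DDT); not how ZFS is actually implemented."""

    def __init__(self, verify):
        self.verify = verify
        self.disk = {}       # block address -> bytes stored on "disk"
        self.ddt = {}        # checksum -> [block address, reference count]
        self.next_addr = 0

    def _allocate(self, data):
        addr = self.next_addr
        self.next_addr += 1
        self.disk[addr] = data
        return addr

    def write(self, data):
        """Write one block and return the address the file should point at."""
        key = hashlib.sha256(data).hexdigest()
        entry = self.ddt.get(key)
        if entry is None:
            addr = self._allocate(data)
            self.ddt[key] = [addr, 1]
            return addr
        addr = entry[0]
        if not self.verify or self.disk[addr] == data:
            # Checksum match is trusted (or verified): just add a reference.
            entry[1] += 1
            return addr
        # Verify found different bytes: handled like a hash collision,
        # so the new data is written out as its own block.
        return self._allocate(data)


def rewrite_after_corruption(verify):
    """Return True if rewriting the same data yields a good block."""
    pool = ToyPool(verify)
    good = b"important data"
    addr = pool.write(good)
    pool.disk[addr] = b"bit-rotted data"   # simulate on-disk corruption
    new_addr = pool.write(good)            # user overwrites with the same data
    return pool.disk[new_addr] == good
```

In this model, rewrite_after_corruption(False) leaves the file pointing
at the damaged block, while rewrite_after_corruption(True) gives the
rewritten file a good copy. Other files that still reference the old
block are not touched in either case.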
But (yes/no?) I have to do this repair file-by-file,
either with dedup=off or dedup=verify.
Actually, that's what I properly should do if there
is such a serious error, but what if the original data
is not available so I can't fix it file-by-file, or
if there are very many errors (read: DDT references
from a number of files just under the dedupditto value)
and such a match-and-repair procedure is prohibitively
inconvenient, slow, whatever?
Say, previously we trusted the hash algorithm: that identical
checksums mean identical blocks. With such trust the
user might want to replace the faulty block with another
one (matching the checksum) and expect ALL deduped files
that used this block to become automagically recovered.
Chances are, they actually would be correct (by external
verification).
And if we trust unverified dedup in the first place,
there is nothing wrong with such an approach to repair.
It would not make any errors worse than those already
present in the originally saved on-disk data (even if there
were hash collisions of really-different blocks, the user
discarded that difference long ago).
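
To make the suggestion concrete, here is a minimal sketch of the idea,
not an existing ZFS feature or command. Everything in it is assumed for
illustration: repair_in_place, the dict-based "disk" and "ddt", and the
file names are hypothetical. A candidate block is accepted only if its
checksum matches a DDT entry, and since every deduped file points at
the same address, one replacement fixes them all.

```python
import hashlib


def checksum(data):
    return hashlib.sha256(data).hexdigest()


def repair_in_place(disk, ddt, candidate):
    """Overwrite the stored block whose DDT checksum matches `candidate`.

    Returns False when no DDT entry carries that checksum. The stored
    bytes are NOT compared, which is the same trust that unverified
    dedup already relies on.
    """
    key = checksum(candidate)
    if key not in ddt:
        return False
    disk[ddt[key]] = candidate
    return True


# Three deduped files share one block, which has been damaged on disk.
good = b"shared block"
disk = {0: b"damaged block"}                       # address -> stored bytes
ddt = {checksum(good): 0}                          # checksum -> address
files = {"a.txt": [0], "b.txt": [0], "c.txt": [0]}

repaired = repair_in_place(disk, ddt, good)
recovered = all(disk[addr] == good
                for blocks in files.values()
                for addr in blocks)
```

After the single replacement, all three files read back the good data
without being rewritten one by one.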
I think the user should be given an (informed) ability
to shoot himself in the foot or recover data, depending
on his luck. Anyway, people are already doing it, thanks to
Max Bruning's or Viktor Latushkin's posts and direct
help, or by researching the hardcore internals of ZFS.
We might as well play along and increase their chances
of success, even if unsupported and unguaranteed - no?
This situation with "obscured" recovery methods reminds
me of prohibited changes of firmware on cell phones:
customers are allowed to sit on a phone or drop it into
a sink, and perhaps have it replaced, but they are not
allowed to install different software. Many still do.
//Jim Klimov