2011-06-10 15:58, Darren J Moffat wrote:

> As I pointed out last time this came up the NDMP service on Solaris 11
> Express and on the Oracle ZFS Storage Appliance uses the 'zfs send'
> stream as what is to be stored on the "tape".


This discussion is getting interesting ;)

Just curious: how do these products work around the stream fragility we
are discussing here - that a single-bit error can/will/should invalidate
the whole zfs send stream, even though the error is probably localized
in a single block? That block ultimately belongs to a file (or a few
files, in the case of dedup or snapshots/clones) whose name "zfs recv"
could report, so that an admin could take action such as rsyncing that
file from the source.

If it is true that, unlike ZFS itself, the replication stream format
carries no redundancy (not even of the ECC/CRC sort), how can it be
used for long-term retention "on tape"?
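The only mitigation that comes to my mind is bolting redundancy on from
the outside, e.g. with a Reed-Solomon tool such as par2 - a sketch, with
the flags from memory:

  # create ~10% of recovery data alongside the stream file
  par2 create -r10 /backup/data-snap1.zfs

  # later, before attempting zfs recv, verify and repair if needed
  par2 verify /backup/data-snap1.zfs.par2
  par2 repair /backup/data-snap1.zfs.par2

But that is an external crutch, not something the stream format itself
provides.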

I can somewhat understand this for online transfers: if the transfer
fails, you still have the original and can retry. But backups are often
needed precisely when the original is no longer alive - that is, after
all, what backups are for ;)

And by Murphy's law that's when this single bit strikes ;)
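To illustrate the retry case, here is roughly what I do now (hostnames
hypothetical, GNU coreutils assumed):

  # save the stream and note its digest
  zfs send tank/data@snap1 > /backup/data-snap1.zfs
  sha256sum /backup/data-snap1.zfs > /backup/data-snap1.zfs.sum

  # copy both, verify on the far side; on a mismatch, just re-run the copy
  rsync /backup/data-snap1.zfs /backup/data-snap1.zfs.sum backuphost:/backup/
  ssh backuphost 'sha256sum -c /backup/data-snap1.zfs.sum'

With a tape written years ago, there is nothing left to retry against.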

Is such "tape" storage only intended for reliable media such as
another ZFS or triple-redundancy tape archive with fancy robotics?
How would it cope with BER in transfers to/from such media?

Also, an argument was posed recently (when I wrote about saving zfs
send streams into files and transferring them by rsync over slow, bad
links) that for most online transfers I would be better off using zfs
send of incremental snapshots. I agree insofar as an incremental stream
is presumably smaller, and so has less chance of suffering corruption
(a network failure) in transit than a huge initial stream - but that
chance is still non-zero. It is just that with online transfers I can
detect the error and retry, at low cost (or at a big one - bandwidth is
not free in many parts of the world).
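To be concrete, by "zfs send of incremental snapshots" I mean the usual
pattern (names hypothetical):

  # the initial full stream, sent once
  zfs send tank/data@snap1 | ssh backuphost zfs recv backup/data

  # thereafter, only the deltas between consecutive snapshots
  zfs snapshot tank/data@snap2
  zfs send -i tank/data@snap1 tank/data@snap2 | ssh backuphost zfs recv backup/data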

Going back to storing many streams (initial + increments) on tape: if
an intermediate incremental stream has a single-bit error, then its
snapshot, and any snapshots that follow it, can no longer be received
into zfs. This holds even if the "broken" block is later freed and
discarded - the equivalent, in classic backup systems where the file is
the unit of backup, of a file being overwritten by a newer version from
a newer increment.
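In other words, with a chain stored as files (names again hypothetical):

  zfs recv backup/data < full-snap1.zfs        # ok
  zfs recv backup/data < incr-snap1-snap2.zfs  # one bad bit -> recv aborts
  zfs recv backup/data < incr-snap2-snap3.zfs  # fails too: the target has no
                                               # @snap2 to apply this delta to

Everything from the bad increment onward is lost, not just one file.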

And since the total size of the initial plus incremental backups is
likely larger than that of a single full dump, the chance of a single
corruption making your (latest) backup useless would also be higher,
right? For example, at a uniform bit error rate, a 1 TB chain of
streams is roughly ten times more likely to contain at least one bad
bit than a 100 GB full dump.

Thanks for any clarifications,
//Jim Klimov
