On Mon, Mar 6, 2017 at 2:23 AM, Kai Krakow <hurikha...@gmail.com> wrote:

> On Tue, 14 Feb 2017 16:14:23 -0500, "Poison BL." <poiso...@gmail.com>
> wrote:
> > I actually see both sides of it... as nice as it is to have a chance
> > to recover the information from between the last backup and the death
> > of the drive, the reduced chance of corrupt data from a silently
> > failing (spinning) disk making it into backups is a bit of a good
> > balancing point for me.
>
> I've seen borgbackup give me good protection against this. First, it
> doesn't back up files which are already in the backup. So if data
> silently changed, it won't make it into the backup. Second, it does
> incremental backups. Even if something broke and made it into the
> backup, you can eventually go back weeks or months to get the file
> back. The algorithm is very efficient. And every incremental backup is
> a full backup at the same time - so you can thin out backup history by
> deleting any backup at any time (so it's not like traditional
> incremental backups, which always need the parent backup).
>
> OTOH, this means that every data block is only stored once. If silent
> data corruption hits here, you lose the complete history of this
> file (and maybe others using the same deduplicated block).
>
> For the numbers, I'm storing my 1.7 TB system into a 3 TB disk which is
> 2.2 TB full now. But the backup history is almost 1 year now (daily
> backups).
>
> As a sort of protection against silent data corruption, you could
> rsync the borgbackup repository to a remote location. The differences
> are usually small, so that should be a fast operation. Maybe to some
> cloud storage or a RAID-protected NAS which can detect and correct
> silent data corruption (like ZFS or btrfs based systems).
>
>
> --
> Regards,
> Kai
>
> Replies to list-only preferred.
>

That's some impressive backup density... I haven't looked into
borgbackup, but it sounds like it runs on the same principles as the
rsync+hardlink based scripts I've seen. Those will back up files that
have silently changed, since the checksums won't match any more, but
they won't blow away previous copies of the file either. I'll have to
give it a try!
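
For reference, the hardlink scripts I've seen boil down to something
like this per run (dates and paths made up for illustration), with
anything unchanged hardlinked against the previous day's tree instead
of copied again:

    rsync -a --link-dest=/backups/2017-03-05 /home/ /backups/2017-03-06/

Every dated directory then looks like a full backup, but unchanged
files only cost a hardlink and a bit of metadata.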

As for protecting the backup set itself against silent corruption, an
rsync to a remote location would help, but you would have to ensure it
only creates new files and never overwrites anything already there that
may have changed. Also, making the initial clone would take ages, I
suspect, since it would have to rebuild the hardlink set for everything
(again, assuming that's the trick borgbackup is using).
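
For the "only create new" part, something along these lines would
probably do (host and paths are just placeholders):

    rsync -aH --ignore-existing /backups/ remote:/backups-mirror/

--ignore-existing skips anything already present on the receiving side,
and -H keeps the hardlink structure intact instead of expanding every
snapshot into full copies.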

One of the best options is to house the base backup set itself on
something like ZFS or btrfs on a system with ECC RAM, and to maintain
checksums of everything on the side (CRC32 would likely suffice, but
SHA-1 is fast enough these days that there's almost no excuse not to
use it). It might be possible to task tripwire with keeping tabs on
that side of it, now that I consider it. While the filesystem itself in
that case is trying its best to prevent issues, there's always the slim
risk that a bug in the filesystem code itself eats something, hence the
added layer of paranoia.
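
The side checksums could be as simple as a sha1sum manifest per run,
audited later (paths invented here):

    find /backups -type f -print0 | xargs -0 sha1sum \
        > /root/manifests/backups-$(date +%F).sha1

    sha1sum -c --quiet /root/manifests/backups-2017-03-06.sha1

--quiet only reports files that fail verification, so a clean audit
stays silent.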

Also, with ZFS for the base data set, you gain in-place compression,
dedup if you're feeling adventurous (not really worth it unless you
have multiple very similar backup sets for different systems),
block-level checksums, redundancy across physical disks, in-place
snapshots, and the ability to use zfs send/receive to do snapshot
backups of the backup set itself.
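
The snapshot/replication side is then roughly this (pool and host names
invented):

    zfs snapshot tank/backups@2017-03-06
    zfs send -i tank/backups@2017-03-05 tank/backups@2017-03-06 | \
        ssh backupnas zfs receive -F tank/backups

The -i form only sends the blocks that changed between the two
snapshots, so the nightly transfer stays small.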

I managed to corrupt some data with zfs (w/ dedup, on gentoo) shared
out over nfs a while back, on a box with way too little ram (nothing
important, throwaway VM images), hence the paranoia of secondary
checksum auditing and still replicating the backup set for things that
might be important.

-- 
Poison [BLX]
Joshua M. Murphy
