On 09/06/14 01:55, Phil Stracchino wrote:
> On 06/08/14 06:09, Steven Haigh wrote:
>> I do believe this is one of the biggest shortcomings of Bacula...
>> The fact it is job based vs file based removes a lot of
>> flexibility.
>>
>> If I understand things properly, a VirtualFull will: 1) require
>> all volumes as stated below; 2) require enough space to write
>> the entire backup out again; and 3) be unable to keep a copy of a
>> file forever if it is never changed.
>>
>> Instead, after the purge date, the file is deleted and
>> retransferred - unless it is done by a VirtualFull - which still
>> has the problems of #1 and #2 above.
>
> I'm not sure I understand your objections here. Given arbitrarily
> large disk space, you can set your job and file retention to fifty
> years if you so choose. Bacula will keep your initial full backup as
> long as you tell it to. But you can't get by on just an initial full
> backup and ten years' worth of daily incrementals. Doing a restore
> would require searching thousands of jobs to build the file tree.
>
> Name me one backup solution which does NOT have to re-transfer a file
> if you delete the original backup.
I think I misled with the terminology here. Coming from the TSM world, everything is file based. The database tracks individual files in volumes that can be from any node (client). If a file is still present on the client, it is never deleted from the backup. If the file is updated, the old copy is marked as a historical version and kept for X revisions or Y days depending on config. As this is done *per file*, files never 'expire' as such unless they are deleted on the client and then age beyond the deleted-file retention.

TSM can do this effectively because it works *per file* and not per job. This means that with a few volumes (I had 40Gb always on, plus 3 x 1Tb eSATA drives) you can do incrementals forever and still be guaranteed a consistent result on restore - as long as you have all the volumes.

To handle volumes that hold a ton of deleted files but only a few current ones, TSM does a reclamation: it moves the current files to another storage pool and then migrates them back to a more recent volume (in the case of tapes), or continues with random access forever in the case of File volumes.

My setup with TSM was that clients transferred all their incremental data to a 40Gb 'temporary' pool, if you like - then when that started to get full, it was migrated to eSATA storage - again, all at the file level, not the job level. The eSATA drives could then be taken offline again and not be required until either another migration or a restore needed them.

> VirtualFull jobs allow you to keep replacing that original Full backup
> without having to re-transfer all of the files from the client again.
> Of *course* you have to re-copy them from the old job; but then the
> old job can be purged. Yes, you need the space for both to create the
> VirtualFull; but you don't have to keep both around after it's completed.

Yeah, this is more or less what I understood. Not a simple task - but workable. Either way, I think I'll have to rethink how I do things.

> If you want to never have to do any job maintenance, never re-copy a
> file that hasn't changed, never use any extra disk space, etc, etc,
> perhaps you should be just using rsync? Of course, de-duplicating and
> creating hard links would become your problem, and then you have to be
> careful not to update all copies of a hard-linked file when one
> original updates...

I'm having a bit of a tinker with just that now... I've attached the script as I've currently written it, for interest. The script runs on a VM and stores all data in /backups/$HOST/[0-7]/ on a compressed btrfs volume. I think I can then also use the eSATA drives to rsync from this host to the eSATA drive and have all 7 days of revisions of the system.

You could easily extend this number of days with very little extra disk space requirement - as everything is hard linked, only the '0' folder has all the files; 1-7 hold only the 'changes':

# du -hs *
7.3G    0
91M     1

The advantage is that EVERY directory can be used as a 'restore point' to give an exact copy of the system at that point in time. You could probably do this weekly instead in some cases and have several months' worth of changes for a small increase in space requirements.

That being said - *my* requirements for backups are rather minimal. The entire setup on my part is less than 200Gb when stored this way.

--
Steven Haigh

Email: net...@crc.id.au
Web: http://www.crc.id.au
Phone: (03) 9001 6090 - 0412 935 897
Fax: (03) 8338 0299
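[For context on the VirtualFull mechanics discussed above: the synthesized Full is directed by the `Next Pool` directive on the source Pool, and is then run without contacting the client. A minimal sketch of the Director config; the resource names (Full-Pool, Consolidated, BackupClient1) are hypothetical examples, not from this thread:

  # Hedged Bacula Director sketch; names are illustrative only.
  Pool {
    Name = Full-Pool
    Pool Type = Backup
    Next Pool = Consolidated   # VirtualFull writes the synthesized job here
  }

  Pool {
    Name = Consolidated
    Pool Type = Backup
  }

Then, from bconsole, a new Full is synthesized from the existing Full plus incrementals, after which the old jobs can be pruned:

  run job=BackupClient1 level=VirtualFull yes

Both pools must be readable/writable by the same Director, which is why the extra scratch space mentioned above is needed while the VirtualFull runs.]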
#!/bin/bash

function log {
    logger "rsync-backup: $1"
}

# Rotate the hard-linked snapshots for a host: drop the oldest (7),
# shuffle 1-6 up one slot, then hard-link-copy 0 into 1 so the next
# rsync into 0 only breaks links for files that actually changed.
function rotate_backup() {
    local hostname="$1"
    mkdir -p "/backups/$hostname/0"
    rm -fR "/backups/$hostname/7"
    mv "/backups/$hostname/6" "/backups/$hostname/7"
    mv "/backups/$hostname/5" "/backups/$hostname/6"
    mv "/backups/$hostname/4" "/backups/$hostname/5"
    mv "/backups/$hostname/3" "/backups/$hostname/4"
    mv "/backups/$hostname/2" "/backups/$hostname/3"
    mv "/backups/$hostname/1" "/backups/$hostname/2"
    cp -al "/backups/$hostname/0" "/backups/$hostname/1"
}

## Start the email log...
cp /root/backup-complete.txt "/tmp/email.$$"

log "$1: System backup drive detected..."

## Run through the local systems with no compression etc...
for HOST in list of local systems; do
    log "$1: Backing up $HOST..."
    echo "`date`: Started $HOST backup..." >> /tmp/email.$$
    rotate_backup "$HOST"
    rsync -axP --delete-before "$HOST:/" "/backups/$HOST/0"
    log "$1: Backing up $HOST complete..."
    echo "`date`: Complete $HOST backup..." >> /tmp/email.$$
done

## Do the remote hosts with max compression...
for HOST in list of remote systems; do
    log "$1: Backing up $HOST..."
    echo "`date`: Started $HOST backup..." >> /tmp/email.$$
    rotate_backup "$HOST"
    rsync -zaxP --compress-level=9 --delete-before "$HOST:/" "/backups/$HOST/0"
    log "$1: Backing up $HOST complete..."
    echo "`date`: Complete $HOST backup..." >> /tmp/email.$$
done

echo "Stats:" >> /tmp/email.$$
df -h | grep "$1" >> /tmp/email.$$

/root/bin/mail-body /tmp/email.$$ net...@crc.id.au "Xenhost: Backup Complete!"
rm -f /tmp/email.$$
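[A couple of hedged examples of working with the snapshot layout the script produces; the hostname 'myhost' and the paths are illustrative only. Each numbered directory is a complete tree, so a restore is just an rsync back out, and the hard-link sharing can be confirmed from inode link counts:

  # Restore /etc as it looked 3 rotations ago (paths are examples):
  rsync -axP /backups/myhost/3/etc/ /mnt/restore/etc/

  # Unchanged files in snapshots 0 and 1 share an inode (link count > 1):
  stat -c '%i %h %n' /backups/myhost/{0,1}/etc/hosts

  # GNU du counts hard-linked data once per invocation, so the combined
  # total stays close to the size of a single full copy:
  du -hsc /backups/myhost/[0-7]

This is the classic cp -al + rsync snapshot pattern: cheap to keep many restore points, at the cost that editing a file in-place inside one snapshot would silently change every snapshot that links to it.]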
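[The script expects the device or mount point of the attached backup drive as $1 - it is logged and grepped out of `df` - but the post doesn't show how it gets invoked. The "backup drive detected" log line suggests a hotplug trigger such as a udev rule handing off to a helper; if the pool is always mounted, a plain cron entry would also work. A hypothetical wiring, assuming the script is saved as /root/bin/rsync-backup and the pool is mounted at /backups:

  # Hypothetical cron entry - path and schedule are assumptions:
  30 2 * * * /root/bin/rsync-backup /backups
]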