Both sites have DSL accounts provided by Arachnet.
At present the files being backed up don't all need to be backed up but, OTOH, we wish to back up lots more files that aren't being backed up now.
First, we create a local backup on our office machine, which happens to be called "mail". We have this directory structure:
drwxr-xr-x 20 root 4096 May 17 23:06 20040517-1500-mon
drwxr-xr-x 20 root 4096 May 18 23:06 20040518-1500-tue
drwxr-xr-x 20 root 4096 May 19 23:09 20040519-1500-wed
drwxr-xr-x 20 root 4096 May 20 23:09 20040520-1500-thu
drwxr-xr-x 20 root 4096 May 21 23:09 20040521-1500-fri
drwxr-xr-x 20 root 4096 May 22 23:10 20040522-1500-sat
drwxr-xr-x 20 root 4096 May 23 23:09 20040523-1500-sun
drwxr-xr-x 20 root 4096 May 24 23:10 20040524-1500-mon
drwxr-xr-x 20 root 4096 May 25 23:10 20040525-1500-tue
drwxr-xr-x 20 root 4096 May 26 23:10 20040526-1500-wed
drwxr-xr-x 20 root 4096 May 27 23:10 20040527-1500-thu
drwxr-xr-x 20 root 4096 May 28 23:11 20040528-1500-fri
drwxr-xr-x 20 root 4096 May 29 23:11 20040529-1500-sat
drwxr-xr-x 20 root 4096 May 30 23:10 20040530-1500-sun
drwxr-xr-x 20 root 4096 May 31 23:11 20040531-1500-mon
drwxr-xr-x 3 root 4096 Jun 1 14:10 20040601-0603-tue
drwxr-xr-x 3 root 4096 Jun 1 23:07 20040601-1500-tue
drwxr-xr-x 3 root 4096 Jun 2 07:42 20040601-2323-tue
drwxr-xr-x 3 root 4096 Jun 2 23:07 20040602-1500-wed
drwxr-xr-x 3 root 4096 Jun 3 14:04 20040603-0555-thu
drwxr-xr-x 3 root 4096 Jun 3 23:06 20040603-1500-thu
drwxr-xr-x 3 root 4096 Jun 4 23:07 20040604-1500-fri
drwxr-xr-x 3 root 4096 Jun 5 23:08 20040605-1500-sat
drwxr-xr-x 3 root 4096 Jun 7 14:19 20040607-0610-mon
drwxr-xr-x 3 root 4096 Jun 8 05:01 20040607-2054-mon
drwxr-xr-x 3 root 4096 Jun 8 05:35 20040607-2128-mon
drwxr-xr-x 20 root 4096 Jun 1 14:06 latest
The timestamps in the directory names are UTC times.
We maintain the contents of latest thus:
+ rsync --recursive --links --hard-links --perms --owner --group --devices --times --sparse --one-file-system --rsh=/usr/bin/ssh --delete --delete-excluded --delete-after --max-delete=80 --relative --stats --numeric-ids --exclude-from=/etc/local/backup/system-backup.excludes /boot/ / /home/ /var/ /var/local/backups/office//latest
and create the backup-du-jour:
+ cp -rl /var/local/backups/office//latest /var/local/backups/office//20040607-2128-mon
That part works well, and the rsync part generally takes about seven minutes.
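In outline, the local step is driven like this (a sketch: the paths and rsync options match the trace above, but the timestamp construction is my shorthand, not the actual script):

#!/bin/sh
# Sketch of the nightly snapshot step.
DEST=/var/local/backups/office
STAMP="$(date -u +%Y%m%d-%H%M)-$(date -u +%a | tr '[:upper:]' '[:lower:]')"

# Bring "latest" up to date in place.
rsync --recursive --links --hard-links --perms --owner --group \
    --devices --times --sparse --one-file-system --rsh=/usr/bin/ssh \
    --delete --delete-excluded --delete-after --max-delete=80 \
    --relative --stats --numeric-ids \
    --exclude-from=/etc/local/backup/system-backup.excludes \
    /boot/ / /home/ /var/ "$DEST/latest"

# Snapshot it: -l makes hard links instead of copying file data, so
# each day's tree costs only directory entries, not another 1.5 Gbytes.
cp -rl "$DEST/latest" "$DEST/$STAMP"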
To copy office to home we try this:
+ rsync --recursive --links --hard-links --perms --owner --group --devices --times --sparse --one-file-system --rsh=/usr/bin/ssh --delete --delete-excluded --delete-after --max-delete=80 --relative --stats --numeric-ids /var/local/backups 192.168.0.1:/var/local/backups/
Prior to the run that is now in progress, we used home's external hostname. I've created a VPN between the two sites (for other reasons) using OpenVPN; all the problems we've had so far occurred while using the external hostname, which we'll say is "home.arach.net.au", as that's the default way Arachnet assigns hostnames.
I'm hoping that OpenVPN will provide a more robust recovery from network problems.
Problems we've had include:
1. ADSL connexion at one end or the other dropping for a while. rsync doesn't notice and mostly hangs. I have seen rsync at home still running but with no relevant files open.
2. rsync uses an enormous amount of virtual memory, with the result that the Linux kernel's out-of-memory killer lashes out at lots of processes, mostly innocent, until it happens to hit rsync. This can cause rsync to terminate without a useful message.
2a. Sometimes the rsync that does this is at home.
I've alleviated this at office by allocating an unreasonable amount of swap: unreasonable because, if it ever gets used, performance will be truly dreadful.
3. rsync does not detect when its partner has vanished. I don't understand why this should be so: it seems to me that, at office, it should be able to detect this from the fact that {r,s}sh has terminated, or by timeout; at home, by timeout.
3a. I'd like to see rsync have the ability to retry in the case where it initiated the transfer. It can take some time to collect the information about what needs to be done: if I retry from its wrapper script, all of that has to be redone whereas, I surmise, rsync doing the retry itself would not need to. (A sketch of the wrapper-level workaround appears after this list.)
4. I've already mentioned this, but as I've had no feedback I'll try again.
As you can see from the above, the source directories for the transfer from office to home are chock-full of hard links. As best I can tell, rsync is transferring each copy afresh instead of recognising the hard link before the transfer and having the destination rsync make a new hard link. It is so that it _can_ do this that I present the backup directory as a whole and not the individual day's backup. That, and I have hopes that today's unfinished work will be done tomorrow.
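Pending a built-in retry, the wrapper-level workaround mentioned in 3a. looks roughly like this (the --timeout value, retry limit, and sleep are arbitrary choices of mine; --timeout is rsync's own I/O timeout, which at least turns a silent hang into an error the script can act on):

# Retry sketch: 600 s I/O timeout, three attempts, then give up.
tries=0
until rsync --timeout=600 --recursive --links --hard-links --perms \
    --owner --group --devices --times --sparse --one-file-system \
    --rsh=/usr/bin/ssh --delete --delete-excluded --delete-after \
    --max-delete=80 --relative --stats --numeric-ids \
    /var/local/backups 192.168.0.1:/var/local/backups/
do
    tries=$((tries + 1))
    [ "$tries" -ge 3 ] && break    # give up after three failures
    sleep 60                       # let the link settle before retrying
done

The downside, as noted, is that every retry pays the full startup cost again.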
This approach seems so far to be problematic, and I am wondering whether I should instead be doing one of these:
A. Create a filesystem image with
dd if=/dev/zero of=backup .... # of suitable size
mke2fs backup
then mount -o loop, put my backups inside that, and use rsync to sync the image offsite (a concrete sketch appears below).
Presumably this will use much less virtual memory. The question is how quickly rsync would sync the two images. I imagine my problem with hard links would vanish.
B. Create a filesystem image as above, then use jigdo to keep the images in sync.
C. Use md5sum and some home-grown scripts to decide what to transfer.
I'm not keen on C., as basically it's implementing what I think rsync should be doing itself.
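To make A. concrete, the setup would be something like this (the 4-Gbyte size, image path, and mount point are arbitrary illustrations; the image should be unmounted, or at least quiescent, before being synced):

# Create and populate a loopback ext2 image.
dd if=/dev/zero of=/var/local/backup.img bs=1M count=4096
mke2fs -F /var/local/backup.img    # -F: target is a file, not a block device
mkdir -p /mnt/backup
mount -o loop /var/local/backup.img /mnt/backup
# ... keep the dated snapshot trees under /mnt/backup ...
umount /mnt/backup

# Ship the image; rsync's delta algorithm sends only changed blocks.
rsync --sparse --times /var/local/backup.img 192.168.0.1:/var/local/backup.img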
btw, the latest directory contains 1.5 Gbytes of data. The system is still calculating that today's backup likewise contains 1.5 Gbytes, so it seems the startup costs are considerable.
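Incidentally, du counts each hard-linked inode only once within a single invocation, so it can show a snapshot's true incremental cost directly:

cd /var/local/backups/office
du -sh latest 20040607-2128-mon    # the second figure is just the new data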