Is -R --link-dest really hard to use, or is it me?
[Yes, I am reviving a thread from 27 months ago. Why? Because I gave up on the problem way back then and didn't move the vault. Now that I'm really trying to do this, it still doesn't make any sense... :) Matt CC'ed directly, since he was the primary respondent and I have no idea whether such an old thread would otherwise be noticed.]

So, having tried your solution 1 and solution 2 (long pause while Matt and/or others page in their state, probably by visiting something like http://www.mail-archive.com/rsync@lists.samba.org/msg23196.html :), I can't make either one work. Here's a transcript of what happens. Clearly I'm missing something.

A, B, and C are hosts; 1, 2, and 3 are ostensibly dates; together they represent a dirvish vault (i.e., extensively hardlinked --link-dest backups). The src tree started out with all files with "foo" in their names hardlinked together. Ideally, the dst tree will end up likewise. If I rsync the entire tree at once, it works fine. But the real use case can't do this, because the trees are enormous and there are dozens of them. Note that each time, not every foo* in dst winds up with the same inode as the rest. Am I just up too late?

    [nsn] 21:51:34 /home/blah# rsync -aviH --stats src/ dst/
    sending incremental file list
    .d..t.. ./
    cd+ a/
    cd+ a/1/
    cd+ a/2/
    cd+ b/
    cd+ b/1/
    cd+ b/2/
    cd+ c/
    cd+ c/1/
    cd+ c/2/
    >f+ c/2/foofoofoo
    hf+ c/1/foofoo => c/2/foofoofoo
    hf+ b/2/b-foo2 => c/2/foofoofoo
    hf+ b/1/b-foo => c/2/foofoofoo
    hf+ a/2/foo => c/2/foofoofoo
    hf+ a/1/foo => c/2/foofoofoo

    Number of files: 16
    Number of files transferred: 1
    Total file size: 24 bytes
    Total transferred file size: 4 bytes
    Literal data: 4 bytes
    Matched data: 0 bytes
    File list size: 255
    File list generation time: 0.001 seconds
    File list transfer time: 0.000 seconds
    Total bytes sent: 459
    Total bytes received: 175

    sent 459 bytes  received 175 bytes  1268.00 bytes/sec
    total size is 24  speedup is 0.04

    [nsn] 21:52:17 /home/blah# find . -ls
    2241331604 drwxr-xr-x 4 root root 4096 Apr 24 21:51 .
    2241331614 drwxr-xr-x 5 root root 4096 Apr 24 21:42 ./src
    2241331664 drwxr-xr-x 4 root root 4096 Apr 24 21:42 ./src/b
    2241331684 drwxr-xr-x 2 root root 4096 Apr 24 21:49 ./src/b/2
    2241331724 -rw-r--r-- 6 root root    4 Apr 24 21:43 ./src/b/2/b-foo2
    2241331674 drwxr-xr-x 2 root root 4096 Apr 24 21:49 ./src/b/1
    2241331724 -rw-r--r-- 6 root root    4 Apr 24 21:43 ./src/b/1/b-foo
    2241331694 drwxr-xr-x 4 root root 4096 Apr 24 21:42 ./src/c
    2241331714 drwxr-xr-x 2 root root 4096 Apr 24 21:49 ./src/c/2
    2241331724 -rw-r--r-- 6 root root    4 Apr 24 21:43 ./src/c/2/foofoofoo
    2241331704 drwxr-xr-x 2 root root 4096 Apr 24 21:49 ./src/c/1
    2241331724 -rw-r--r-- 6 root root    4 Apr 24 21:43 ./src/c/1/foofoo
    2241331634 drwxr-xr-x 4 root root 4096 Apr 24 21:42 ./src/a
    2241331654 drwxr-xr-x 2 root root 4096 Apr 24 21:44 ./src/a/2
    2241331724 -rw-r--r-- 6 root root    4 Apr 24 21:43 ./src/a/2/foo
    2241331644 drwxr-xr-x 2 root root 4096 Apr 24 21:43 ./src/a/1
    2241331724 -rw-r--r-- 6 root root    4 Apr 24 21:43 ./src/a/1/foo
    2241331624 drwxr-xr-x 5 root root 4096 Apr 24 21:42 ./dst
    2241331754 drwxr-xr-x 4 root root 4096 Apr 24 21:42 ./dst/b
    2241331804 drwxr-xr-x 2 root root 4096 Apr 24 21:49 ./dst/b/2
    2241331834 -rw-r--r-- 6 root root    4 Apr 24 21:43 ./dst/b/2/b-foo2
    2241331794 drwxr-xr-x 2 root root 4096 Apr 24 21:49 ./dst/b/1
    2241331834 -rw-r--r-- 6 root root    4 Apr 24 21:43 ./dst/b/1/b-foo
    2241331764 drwxr-xr-x 4 root root 4096 Apr 24 21:42 ./dst/c
    2241331824 drwxr-xr-x 2 root root 4096 Apr 24 21:49 ./dst/c/2
    2241331834 -rw-r--r-- 6 root root    4 Apr 24 21:43 ./dst/c/2/foofoofoo
    2241331814 drwxr-xr-x 2 root root 4096 Apr 24 21:49 ./dst/c/1
    2241331834 -rw-r--r-- 6 root root    4 Apr 24 21:43 ./dst/c/1/foofoo
    2241331744 drwxr-xr-x 4 root root 4096 Apr 24 21:42 ./dst/a
    2241331784 drwxr-xr-x 2 root root 4096 Apr 24 21:44 ./dst/a/2
    2241331834 -rw-r--r-- 6 root root    4 Apr 24 21:43 ./dst/a/2/foo
    2241331774 drwxr-xr-x 2 root root 4096 Apr 24 21:43 ./dst/a/1
    2241331834 -rw-r--r-- 6 root root    4 Apr 24 21:43 ./dst/a/1/foo

    [nsn] 21:52:26 /home/blah# find . -ls | grep foo
    2241331724 -rw-r--r-- 6 root root    4 Apr 24 21:43 ./src/b/2/b-foo2
    2241331724 -rw-r--r-- 6 root root    4 Apr 24 21:43
checksum-xattr.diff [CVS update: rsync/patches]
Date: Mon, 2 Jul 2007 08:43:39 -0400
From: "Matt McCutchen" <[EMAIL PROTECTED]>

> > *Note that "now" for a particular disk may not be the same as time() if
> > the disk is remote, so network filesystems can be rather complicated.
>
> That's easy to fix: get your "now" by touching a file on the filesystem
> and reading the resulting mtime.

Unreliable. Suppose you sync up at the beginning of a run, and then the remote system executes a large clock step (e.g., because it's not running NTP, or NTP is misconfigured, or NTP has bailed due to excessive drift from hardware issues or a bogus driftfile---both of which I've seen*). Then "now" might glitch by a second (or more), which is enough to break your idea of what "now" means---and even a smaller glitch can lead to races based on whose clock ticks first. Sure, it's a low-probability event, but then, with low probability, you have some file that isn't getting updated, which can lead to all kinds of mysterious bugs, etc...

Seems to me the only way around this would be to do the touch before -every- file you handle, which doubles the amount of statting going on, etc. And there are probably still timing windows there.

* [One of several ways I saw this happening was a motherboard that accidentally had FSB spread-spectrum enabled, which caused the clock to run fast. NTP gave up slewing and started making larger and larger steps until it was forced to bail out. It took quite a while for this problem to be noticed ("but the machine's running NTP!"), in part because it took a while to manifest after each boot reset the clock. Then, when the BIOS setting got fixed, the bad driftfile created by NTP's valiant attempts to cope with the situation caused the clock to misbehave in the -other- direction, until the NTP conf stuff was flushed and allowed to regenerate on its own with a working clock.]
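(For concreteness, the touch-and-read-back idea amounts to something like the sketch below. The probe filename and the use of GNU stat's -c %Y are my own illustration, not anything rsync actually does:)

    # Ask the filesystem holding $dir what time it thinks it is,
    # instead of trusting the local clock.
    dir=$1                            # any directory on the filesystem of interest
    probe="$dir/.now-probe.$$"
    touch "$probe"                    # set the probe's mtime on that filesystem
    fs_now=$(stat -c %Y "$probe")     # read the mtime back (seconds since epoch)
    rm -f "$probe"
    echo "filesystem 'now' for $dir is $fs_now"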
checksum-xattr.diff [CVS update: rsync/patches]
Date: Mon, 2 Jul 2007 21:18:57 -0400
From: "Matt McCutchen" <[EMAIL PROTECTED]>

> The technique Wayne and I are discussing assumes only that the clock on
> *each side* never steps backwards. It compares the current mtime and
> ctime on each side to the previous mtime and ctime on that side as
> recorded in the cache. Clock synchronization between the two sides is
> irrelevant.

Okay, but that's still unreliable. Backward clock steps -can- happen; only in Multics is it (mostly) impossible (because a backwards step would destroy the filesystem). But since rsync probably doesn't run on Multics... :)

Consider a much more likely scenario---an NFS server reboots. It's perfectly okay for it to do this at any time, and the client NFS will recover, without informing rsync. It's quite possible for large clock steps to happen upon reboot, especially on machines that run ntpdate at boot but not ntpd during normal operation. In that case, you've got about a 50% chance that the step is backwards, and this could conceivably happen between any two NFS requests...

> It is true that if either side's clock steps backwards, that side could
> be fooled into thinking a file hasn't changed from the cache when it
> really has. There's very little we can do about that except tell the
> sysadmin to delete all the caches when he/she sets the clock backwards.
>
> > Seems to me the only way around this would be to do the touch before
> > -every- file you handle, which doubles the amount of statting going
> > on, etc. And there are probably still timing windows there.
>
> I don't understand this concern. If you'd like a more formal proof that
> the technique never misses a modification assuming each side's clock
> runs forward (actually, just each filesystem's clock), I would be happy
> to provide one.

Working out such a proof would be interesting (because it might reveal a flaw nobody's even thought about yet), but the first order of business might be figuring out how to reliably detect a backwards step, or how to make sure that users understand they might be silently screwed if one happens. I understand that it's a fairly low probability and depends on some questionable configurations, but rsync is well known to be both reliable and deterministic. I'd hate for something like this to start chipping away at that reputation, even if we -are- talking about a corner case in a performance optimization that might not get invoked all that much.

Not that my opinion in this matters a whit to begin with; I just thought I'd point out a possible screw case before it actually screwed someone.
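(To pin down what we're arguing about: the validity check Matt describes boils down to something like the sketch below. This is a bare-bones shell rendition using GNU stat and md5sum; the side-file cache format is my own stand-in for the real xattr-based cache, and the same-second subtleties are ignored:)

    file=$1                               # file being considered
    cachefile=$file.rsum                  # hypothetical side-file cache:
                                          #   "<mtime> <ctime> <checksum>"
    read cached_mtime cached_ctime cached_sum < "$cachefile"
    cur_mtime=$(stat -c %Y "$file")       # current mtime
    cur_ctime=$(stat -c %Z "$file")       # current ctime
    if [ "$cur_mtime" = "$cached_mtime" ] && [ "$cur_ctime" = "$cached_ctime" ]
    then
        sum=$cached_sum                   # attributes unchanged: reuse the cache
    else
        sum=$(md5sum "$file" | awk '{print $1}')    # recompute and re-cache
        echo "$cur_mtime $cur_ctime $sum" > "$cachefile"
    fi
    # The whole argument above is about when this comparison can be fooled,
    # e.g., by a clock that steps backwards on the side doing the caching.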
checksum-xattr.diff [CVS update: rsync/patches]
Date: Mon, 2 Jul 2007 21:18:57 -0400
From: "Matt McCutchen" <[EMAIL PROTECTED]>

> The technique Wayne and I are discussing assumes only that the clock on
> *each side* never steps backwards.

Um, and note, btw, that the pathological FSB-spread-spectrum/NTP interaction I mentioned in my first message was causing a whole -bunch- of backwards steps, over several months, until it was noticed. I don't recall their magnitude, but I think it was a backwards step of at least a second every few tens of minutes, until after quite some time NTP simply exceeded its tolerance and punted, whereupon the clock ran away. But since it was -almost- holding it together, for days or weeks at a time... And the machine was an NFS server.

So this scheme would in fact have been producing a whole bunch of sporadic "why was this cache inaccurate?" failures for a long time, if rsync had been using this strategy and someone had been using it against that server.
--hard-links performance
Date: Wed, 11 Jul 2007 01:26:18 -0400
From: "George Georgalis" <[EMAIL PROTECTED]>

> the program is http://www.ka9q.net/code/dupmerge/ there are 200 lines
> of well commented C; however there may be a bug which allocates too
> much memory (one block per file); so my application runs out. :\ If you
> (anyone) can work it out and/or bring it into rsync as a new feature,
> that would be great. Please keep the author and myself in the loop!

Do a search for "faster-dupemerge"; you'll find mentions of it in the dirvish archives, where I describe how I routinely use it to hardlink together filesystems in the half-terabyte-and-above range without problems, on machines that are fairly low-end these days (a gig of RAM, a gig or so of swap, very little of which actually gets used by the merge).

Dirvish uses rsync's -H to do most of the heavy lifting, but large movements of files from one directory to another between backups won't be caught by rsync*. So I follow dirvish runs with a run of faster-dupemerge across the last two snapshots of every machine being backed up (i.e., one single run that includes two snapshots per backed-up machine). That not only catches file movements within a single machine, but also links backup files together -across- machines, which is quite useful when several machines share a lot of similar files (e.g., the files in the distribution you're running) or when a file moves from one machine to another, and it saves considerable space on the backup host.

[You can also trade off speed for space: since the return on hardlinking zillions of small files is relatively low compared to a few large ones, you can specify "only handle files above 100K" or whatever (or anything else you'd like as an argument to "find") and thus considerably speed up the run while not losing much in the way of space savings; I believe I gave some typical figures in one of my posts to the dirvish lists. Also, since faster-dupemerge starts off by sorting the results of the "find" by size, you can manually abort it at any point and it will have merged the largest files first.]

http://www.furryterror.org/~zblaxell/dupemerge/dupemerge.html is the canonical download site, and it mentions various other approaches and their problems. (Note that workloads such as mine also require at least a gig of space in a temporary directory used by the sort program; fortunately, you can specify that temp directory on the command line, and it's less than 0.2% of the total storage of the filesystem being handled.)

* [Since even fuzzy-match only looks in the current directory, I believe, unless later versions can be told to look elsewhere as well and I've somehow missed that. If I -have- missed that, it'd be a nice addition to be able to specify extra directories (and/or trees) in which fuzzy-match should look, although in the limit that might require a great deal of temporary space and run slowly.]
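[In case it helps to see the shape of such a pass, here's a stripped-down sketch of the idea. This is NOT faster-dupemerge's actual code or command line---see its own documentation for that. The paths, dates, and 100K cutoff are made up, whitespace-free paths are assumed, and a real tool would also verify contents byte-for-byte and preserve ownership/permissions before linking:]

    # Hardlink-merge the two most recent snapshot dates across all hosts.
    vault=/backup                   # assumed layout: /backup/<host>/<date>/...
    prev=20080101  curr=20080102
    find "$vault"/*/"$prev" "$vault"/*/"$curr" -type f -size +100k -exec md5sum {} + |
      sort -T /big/tmp -k1,1 |      # group by checksum; big runs want a roomy temp dir
      while read -r sum path; do
          if [ "$sum" = "$prevsum" ]; then
              ln -f "$keep" "$path"       # same content as the kept copy: hardlink it
          else
              prevsum=$sum; keep=$path    # first file seen with this checksum
          fi
      done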
Rsync shouldn't display a meaningless speedup on a dry run
> Date: Mon, 05 Nov 2007 13:17:32 -0500
> From: Matt McCutchen <[EMAIL PROTECTED]>
>
> I think rsync should omit the speedup on a dry run. The attached patch
> makes it do so.

I worry about those trying to write things that parse rsync's output; if -n changes the output format, such things will have to be tested on live data.

Is it possible (e.g., without ridiculous amounts of code-massaging) to have -n output the speedup (or some more-reasonable estimate) anyway? Sure, all kinds of differences haven't been computed, but...

Or maybe just have it report a speedup of 1.00 instead? Still misleading, but it preserves the output format and is trivial to write (though still, alas, confusing for the user, so this doesn't fill me with glee).

Or we can just assume that any such parsers are looking at the file list, and that it's dubious any application cares about the speedup data---such a parser would be throwing those lines away anyway (and would not break if the line doesn't appear).
Rsync shouldn't display a meaningless speedup on a dry run
Date: Tue, 06 Nov 2007 23:18:08 -0500
From: Matt McCutchen <[EMAIL PROTECTED]>

> On Tue, 2007-11-06 at 22:22 -0500, [EMAIL PROTECTED] wrote:
> > I worry about those trying to write things that parse rsync's output;
> > if -n changes the output format, such things will have to be tested on
> > live data.
>
> No, just run rsync's output through a sed script that adds the desired
> speedup to the last line.

That changes the test setup quite a lot with and without -n.

> > Is it possible (e.g., without ridiculous amounts of code-massaging) to
> > have -n output the speedup (or some more-reasonable estimate) anyway?
> > Sure, all kinds of differences haven't been computed, but...
>
> Rsync could estimate an upper bound on how much a real run might send by
> adding the size of the data that wasn't transferred (regular file data
> and abbreviated xattrs) to the amount the dry run sent, but I'm not sure
> the resulting value would be useful enough to make this worthwhile. I
> could go either way.
>
> > Or maybe just have it report a speedup of 1.00 instead? Still
> > misleading, but it preserves the output format and is trivial to write
> > (but still, alas, confusing for the user, so this doesn't fill me with
> > glee).
>
> That lie would be no improvement over the current one.

Then how about this: if your patch winds up in rsync, it requires a patch to the manpage entry for -n that says, essentially, "You can't trust the actual information emitted when running with -n to match what gets emitted if you haven't specified -n. Therefore, if you're writing things that parse rsync's output, you must ensure that your script works both with and without -n. Here is an itemization of the things that might differ: (a) with -n, the speedup line will be omitted; (b) ?" Etc.

At least that way, someone writing such a tool will be warned without having to find out the hard way. (I don't write such tools, but I've certainly seen some, and read some chatter about them on this list.)
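(For the record, I take it the sed workaround you have in mind is something along these lines---the appended value is obviously a fake, which is rather the point:)

    # Hypothetical illustration only: give dry-run output a speedup field so a
    # parser sees the same last-line shape with and without -n.
    rsync -avin src/ dst/ | sed '$ s/$/  speedup is 1.00/'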
remote logging non-daemon mode
> Date: Wed, 5 Dec 2007 23:21:27 -0500
> From: "Doug Lochart" <[EMAIL PROTECTED]>
>
> Each module needs to be protected from the others so if a user logs in
> with their credentials they should not have access to any other module.
> It would take a user knowing the name of another client to effect the
> security breach. I admit I am no whiz at securing the rsync server.
> Once we had it set up to run in daemon mode we assumed the ssh tunnels
> would provide all that we need. We overlooked this one issue however.

Are users supposed to be running any arbitrary rsync command they like when they connect, or is there a canonical one for doing the backup? If the latter, can you use ssh's "forced command" mode, with a different command associated with each user?

Hmm. I just did a search and found this, from two months ago:

http://www.mail-archive.com/rsync@lists.samba.org/msg19657.html

Relevant?
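(If there is a canonical command, the forced-command setup might look roughly like this on the server. A sketch only: the rrsync helper ships in rsync's support/ directory, but the install path, module directories, and keys below are placeholders:)

    # ~backupuser/.ssh/authorized_keys -- each client's key is locked down to one
    # command that confines it to that client's own directory.
    command="/usr/local/bin/rrsync /srv/backups/clientA",no-pty,no-agent-forwarding,no-port-forwarding,no-X11-forwarding ssh-rsa AAAA...key-for-clientA
    command="/usr/local/bin/rrsync /srv/backups/clientB",no-pty,no-agent-forwarding,no-port-forwarding,no-X11-forwarding ssh-rsa AAAA...key-for-clientB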
Is -R --link-dest really hard to use, or is it me?
I've got a problem for which the combination of -R and --link-dest doesn't seem to be quite enough---and I may have discovered a few small bugs as well; test cases are below. [And if someone has a scheme for doing this that doesn't involve rsync at all, but works okay, I'm all ears as well---I'm not the first with this problem.]

Here's my problem: I unfortunately need to move a large dirvish vault. This is a directory tree consisting of -many- hardlinked files, which means that moving it in pieces will copy many times more data than is actually there, while trying to move the entire thing in one shot consumes more RAM than is available.

[rsync on the toplevel dir blew up almost immediately, as I expected. cp -a was consuming at least 130 meg per snapshot and therefore looked likely to consume at least 10G of RAM to finish; for other reasons it might actually have been closer to 20G. It thus got slower and slower as it became more and more page-bound, and I eventually got tired of it thrashing itself to death; the ETA might have been a few weeks at that rate. I can't just move the underlying blocks (e.g., copy the partition as a partition), because the whole reason I'm moving this filesystem in the first place is that it has errors that fsck is having trouble fixing---whether that's a bug or bad hardware isn't established yet. And I don't know whether dump/restore works well on ext3 filesystems, is well-tested these days, will work for ext4 when I finally migrate to that, or produces good data if the filesystem I'm starting with has errors that fsck complains about (or whether it, too, will consume enormous amounts of RAM---but I'm assuming it's not trying to cache every inode it dumps, so maybe that might work if I trusted it. Opinions, anyone?)]

So---rsync to the rescue, except not. A normal dirvish backup just uses --link-dest against the previous host/date combo, and works fine. I could copy the entire set of snapshots to a new filesystem the same way, EXCEPT for a problem: I took pains to hardlink files -across- hosts' backups that were also the same, so I didn't have a zillion copies of the same files that are shared by most releases and by any Linux anyway. E.g., in this sort of arrangement:

    hostA/20080101 hostB/20080101 ... hostF/20080101 ...
    hostA/20080102 hostB/20080102 ... hostF/20080102 ...

dirvish (well, rsync) itself hardlinked files between hostA/20080101 and hostA/20080102 on successive runs, and then -I- ran a tool (faster-dupemerge) that hardlinked identical files between hostA/20080101 and hostB/20080101 (etc). Once this is done across the very first set of dumps (e.g., 20080101 in this example), then even though rsync does --link-dest only from hostA to hostA on successive runs, everything stays hardlinked together across hosts, because the same inode is being reused everywhere. (I also run faster-dupemerge across all hosts for the most recent pair of backups, to catch files that have been -copied or moved-, either from one dir to another on the same host or across hosts. Works great.)

Unfortunately, I can't get rsync to do the right thing when I'm trying to copy this structure. What I'd -like- to do is to take all of hostA..hostF---for a single date---and copy them all at once, using --link-dest to point back at the previous date's set of hosts as the basis. But because of the way the directories are structured, I need to use -R so that the same structure gets recreated, and that seems to break --link-dest, unless there's some syntax issue in what I'm doing.
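[For concreteness, the run I'd -like- to be able to do is shaped roughly like this; the /old-vault and /new-vault paths and the host* glob are made up, and this is the command that misbehaves, not a working recipe:]

    # What I want: copy every host's 20080102 snapshot in one run, preserving the
    # hostX/20080102 structure with -R and basing unchanged files on the 20080101
    # copies already present on the new vault.
    cd /old-vault
    rsync -aviHR --link-dest=/new-vault \
        host*/20080102 /new-vault/
    # The catch: with -R in effect, rsync looks up each file under the --link-dest
    # dir by its full relative path (hostX/20080102/...), and nothing on the new
    # vault holds the previous date's files at those paths, so the basis is never
    # found no matter what --link-dest points at.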
Small test case: Imagine that "src" is my original filesystem, and "dst" is where I'm trying to move things. (Here they share a superior directory, but of course in real life they're different filesystems.) "foo" is my test file; there are multiple copies of it in src that are all hardlinked together. I've already done the push of the first vault's contents from src to dst, so --link-dest has something to work with; note that the inode numbers for foo in src and dst are different (since, again, in real life they're on different filesystems), but that all copies of foo in either src or dst (so far) share the same inode. The a, b, and c directories correspond to individual hosts.

    18:45:42 ~/H$ find . -name "foo" -ls
    844204 -rw-r--r-- 2 blahblah 4 Jan 11 18:43 ./src/a/1/foo
    844204 -rw-r--r-- 2 blahblah 4 Jan 11 18:43 ./src/a/2/foo
    844264 -rw-r--r-- 1 blahblah 4 Jan 11 18:43 ./dst/a/1/foo
    18:45:46 ~/H$ ~/rsync-3.0.5/rsync -aviH --link-dest=../1 src/a/2/ dst/a/2/
    sending incremental file list
    created directory dst/a/2
    cd..t.. ./

    sent 61 bytes  received 15 bytes  152.00 bytes/sec
    total size is 4  speedup is 0.05
    18:46:11 ~/H$ find . -name "foo" -ls
    844204 -rw-r--r-- 2 blahblah 4 Jan 11 18:43 ./src/a/1/foo
    844204 -rw-r--r-- 2 blahblah 4 Jan 11 18:43 ./src/a/2/foo
    844264 -rw-r--r-- 2 blahblah
Is -R --link-dest really hard to use, or is it me?
> Date: Sun, 25 Jan 2009 01:02:15 -0500
> From: Matt McCutchen
>
> I regret the slow response. I was interested in your problem, but I
> knew it would take me a while to respond thoughtfully, so I put the
> message aside and didn't get back to it until now. I hope this is still
> useful.

Yes, it is. Thanks. [The immediate need to move the filesystem is gone, because the underlying hardware problem has been solved, but eventually I'm going to want to migrate this ext3 to ext4, and the problem will recur at that point. Besides, I'm not the only one who might need to move such extensively-hardlinked filesystems.]

> > Okay, so the above shows that --link-dest without -R appears to work,
> > BUT---how come there was no actual output from rsync when it created
> > dst/a/2/foo? Correct side-effect (foo created, with correct inode),
> > but incorrect output.
>
> The lack of output here is by design. That's not to say that I think
> the design is a good one.

I have to confess that I don't, either. (...but see below.)

> [ . . . ]
> However, the more recently added --copy-dest and --link-dest:
> [ . . . ]
> have the IMHO more useful interpretation that the basis dir is to be
> used as an optimization (of network traffic and/or destination disk
> usage), without affecting either the itemization or the final contents
> of the destination. I entered an enhancement request for this to be
> supported properly:
> https://bugzilla.samba.org/show_bug.cgi?id=5645

I see where you're going with that; I assume that such an enhancement would, as fallout, cause itemization of created hardlinks when using a --dest arg. (Right now, they're itemized in a "normal" run with -H but without a --dest, but don't appear if a --dest is added, which looks like a bug to someone who hasn't followed the entire history---and makes the output less useful, too.)

...though, on the other hand, would this dramatically clutter up the output of a "normal" --link-dest run where, typically, one is looking to see which -new- files got transferred, as opposed to seeing the creation of a zillion files that were in the basis dirs? (Since you seem to advocate two different options, I guess that would allow users to decide either way.)

> [ . . . ]
> Right. To recap the problem: In order to transfer both b/2/ and c/2/ to
> the proper places under dst/ in a single run, you needed to include the
> "b/2/" and "c/2/" path information in the file list by using -R. But
> consequently, rsync is going to look for b/2/foo and c/2/foo under
> whatever --link-dest dir you specify, and there's no directory on the
> destination side that contains files at those paths (yet).

So you're saying that there appears to be no way to tell rsync what I want to do in this case---I haven't missed something, and it's either a limitation or a design goal that it works this way. Correct? [Err, except that perhaps you have a solution below; it's just that -R is pretty much useless with any of the --*-dests.]

> Tilde expansion is the shell's job.

Right, I realized what was going on just after I sent the mail. (I was concentrating on the real problem at hand, of course, and missed that I'd put an = in there, defeating the shell; attributing tilde expansion to anything but the shell must have meant I'd been awake too long. :)

> I think using a separate rsync run for each hostX/DATE dir is the way to
> go since it's easy to specify an appropriate --link-dest dir, or more
> than one. With this approach, you don't need -H unless you want to
> preserve hard links among a single host's files on a single day.

I do need -H for that reason (there are many crosslinked files within any individual source host---not just in the dirvish vault), but unfortunately doing a separate run for each hostX/DATE combination isn't enough either, which is how I got into this problem: there are crosslinks -across- the hosts that I -also- want to preserve. Although perhaps your suggestion below is the solution.

(How did this happen? Because after each date's backups, I run faster-dupemerge across all hosts (and across the previous date's run), all at once, e.g., 6 hosts times 2 dates in my example. This merges files that are the same across hosts [distribution-related stuff, mostly] and also catches files that moved across directories or across hosts---oh, whoops, I just realized I mentioned this the first time, but it bears repeating 'cause it's why this is an unusual case. Not having rsync catch this when I'm copying this giant hierarchy to a new filesystem would undo that work, unless I ran f-d on the copy as it was being created, which would increase the time to move everything by quite a lot.)

> In recent months, several rsnapshot users have posted about migration
> problems similar to yours but one-dimensional (dates only), and I wrote
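[To make the separate-run-per-hostX/DATE suggestion concrete, here's roughly what I have in mind: the second --link-dest points at an already-copied host for the -same- date, so cross-host links at identical relative paths come out sharing one inode on the destination. Names, dates, and layout are illustrative, whitespace-free paths are assumed, and duplicates that live at -different- relative paths would still need a faster-dupemerge pass over the copy:]

    #!/bin/sh
    # Copy one date's snapshots host by host, offering rsync two bases per host:
    #   1. the same host's previous snapshot, already present on the new vault;
    #   2. the first host already copied for this date (cross-host links at the
    #      same relative path then reuse that host's inodes).
    old=/old-vault  new=/new-vault
    prev=20080101   curr=20080102
    first=""
    for host in hostA hostB hostC hostD hostE hostF; do
        extra=""
        [ -n "$first" ] && extra="--link-dest=$new/$first/$curr"
        mkdir -p "$new/$host"
        rsync -aH --link-dest="$new/$host/$prev" $extra \
              "$old/$host/$curr/" "$new/$host/$curr/"
        [ -z "$first" ] && first=$host
    done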
Malformed Address and Private IP issue
Date: Wed, 8 Mar 2006 17:15:36 -0800
From: Wayne Davison <[EMAIL PROTECTED]>

> On Wed, Mar 08, 2006 at 01:48:37PM -0800, Jonathan Chen wrote:
> > 2006/03/08 11:25:12 [16976] malformed address localhost.localdomain
>
> That can't be 2.6.6 because 2.6.6 doesn't have an error message of that
> format. In 2.6.6, the old "malformed address" error now outputs as
> "error matching address" and includes the gai_strerror() text after a
> colon. A new error in 2.6.6 that does include the string "malformed
> address" would have also included the gai_strerror() text after a colon.
> Thus, that's still the old rsync running. Perhaps it didn't really get
> stopped? Or perhaps it is running via inetd?

Given how often rsync versions change and how much functionality goes into each new one (yay!), I wonder if it might not be such a bad idea to have the rsync version embedded in every error message? With most programs, it's likely that the user knows at least something about the version they're running, but since rsync is almost always run with one of its instantiations remote, it might make debugging easier if the message were explicit...
Data Encryption
Date: Mon, 12 Jun 2006 14:18:00 -0400
From: Matt McCutchen <[EMAIL PROTECTED]>

> On Mon, 2006-06-12 at 10:58 -0700, Chuck Wolber wrote:
> > On Mon, 12 Jun 2006, Brad Farrell wrote:
> > > Is there a way with rsync to encrypt data at the source before
> > > transmitting? Not talking about the actual transmission, but the
> > > data itself. I've got a few department heads that want their data
> > > secured before it leaves their computer so that no one in the office
> > > can access the data except for them.
>
> Then again, the data is saved decrypted on the destination disk. Maybe
> the files should be individually encrypted with a symmetric algorithm
> on the source before transmission. This could be done with either a
> script or the --source-filter option provided by an experimental rsync
> patch. Note that typical encryption algorithms prevent incremental
> transfer from identifying unchanged portions of a file; rsyncrypto does
> not, but I'm not sure I trust its security.

The right solution is probably to run an encrypted filesystem on the machine that holds the backups, and of course to use ssh getting the files there. That way, rsync's incremental algorithm is actually of some use. While you're at it, put an encrypted filesystem on the machines the data is coming -from-, too, especially if they're laptops. Machines do get stolen.
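(For concreteness, the "script" route Matt mentions might look roughly like the sketch below, with gpg standing in for "a symmetric algorithm". Host names, paths, and the passphrase handling are placeholders, and note that re-encrypting on every run produces new ciphertext and so defeats rsync's incremental transfer, which is exactly the drawback being weighed here:)

    # Mirror the tree into a staging area as individually encrypted files,
    # then push only the ciphertext off the machine.
    src=/home/dept  staging=/var/tmp/dept-encrypted
    dest=backuphost:/backups/dept
    (cd "$src" && find . -type f -print) | while read -r f; do
        mkdir -p "$staging/$(dirname "$f")"
        # newer gpg versions may also want --pinentry-mode loopback here
        gpg --batch --yes --symmetric --passphrase-file /root/dept.pass \
            --output "$staging/$f.gpg" "$src/$f"
    done
    rsync -a "$staging"/ "$dest"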
Data Encryption
Date: Mon, 12 Jun 2006 18:01:34 -0400
From: Matt McCutchen <[EMAIL PROTECTED]>

> On Mon, 2006-06-12 at 17:51 -0400, [EMAIL PROTECTED] wrote:
> > The right solution is probably to run an encrypted filesystem on the
> > machine that holds the backups, and of course to use ssh getting the
> > files there.
>
> That isn't enough if the department heads don't trust the backup machine
> to transfer the data to the encrypted volume without peeking at it in
> the process.

True. In that case, they have no choice but to encrypt locally---or pick a different backup organization that they -do- trust.
Problem with shared xls file. Could it be blamed on rsync?
Date: Fri, 16 Mar 2007 02:30:33 -0700 (PDT)
From: syncro <[EMAIL PROTECTED]>

> Thanks alot! That's what I wanted to hear ;) We want to have an
> always-up-to-date copy, thus rsync every minute and not just at night.
> However my preventive measure will be a forbiddance of sharing xls
> files or the like.

Rather than forbidding sharing, maybe you could ask rsync (via --files-from and a filter or something) to only back up files that haven't been modified in the last 10 minutes? I don't know exactly when Windows updates the file's timestamp versus when data actually starts getting written to it---and there will always be a tiny timing race anyway, since the scan of the filesystem and the start of the update aren't simultaneous---but it would mean the file still gets backed up, every few minutes, whenever people aren't actively working on it.

The other solution might be to have Windows copy the file to a temporary location (since Windows might respect its own locks), and then back up the temporary copy.
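(A minimal sketch of the "skip recently modified files" idea, assuming GNU find's -mmin and run from cron on the sending side; the paths and host name are placeholders:)

    # Back up only files that have sat unmodified for more than 10 minutes.
    cd /data/share || exit 1
    find . -type f -mmin +10 > /tmp/stable-files.txt
    rsync -a --files-from=/tmp/stable-files.txt . backuphost:/backups/share/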