2001-01-03-15:20:28 Dave Dykstra:
> In versions 2.3.2 and earlier, rsync had an optimization that I
> put in such that if the end of the list was --exclude '*' and
> the earlier includes didn't have any wildcards, it would skip
> the recursive traversal of the directories and just directly
> open all the included files.  Andrew Tridgell, the author of
> rsync, didn't like the fact that its semantics weren't exactly the
> same as without the optimization though (it didn't require the
> parent directories to be explicitly included) and he took out the
> optimization in 2.4.0.

Bummer. I wish this could be re-added, since the incompatible
semantics are precisely what is wanted in cases where some other
process generates a scattered handful of files that must be synced.

At some performance cost and additional complexity you can of course
add all the parent directories to the --include list to get the
desired behavior of simply syncing a list of files whose names are
specified in another file, but that doesn't make the extra
processing step desirable, even if the performance loss isn't too
severe in many cases.
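
For what it's worth, that extra processing step might look roughly
like the sketch below. It is only an illustration: the function name
include_args is made up, and it assumes the listed paths are relative
to the top of the transfer and contain no rsync wildcard characters.
It relies only on the documented pattern rules that a leading "/"
anchors a pattern at the top of the transfer and a trailing "/"
matches directories only.

```python
from pathlib import PurePosixPath


def include_args(paths):
    """Build rsync filter arguments for a list of files (hypothetical helper).

    Each file gets an --include for itself and for every parent
    directory, followed by a final --exclude=* for everything else.
    """
    seen = set()
    args = []
    for raw in paths:
        path = PurePosixPath(raw.strip().lstrip("/"))
        if not path.parts:
            continue  # blank line or bare "/"
        # Parents come innermost first, so reverse them to list
        # directories before the things they contain.
        for parent in list(path.parents)[::-1]:
            if str(parent) == ".":
                continue
            pattern = f"/{parent}/"
            if pattern not in seen:
                seen.add(pattern)
                args.append(f"--include={pattern}")
        pattern = f"/{path}"
        if pattern not in seen:
            seen.add(pattern)
            args.append(f"--include={pattern}")
    args.append("--exclude=*")
    return args
```

The result would be spliced into an ordinary recursive rsync command
line, so rsync still walks the included directories; this only
recovers the old semantics, not the old speed.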

Even nicer, in my opinion, would be a mode where rsync could be told
to take a src dir and a dst dir as cmdline args, then simply read
paths from stdin and, as each path is read, sync that file under
the src dir to the corresponding file under the dst dir, repeating
until eof on stdin. That'd make it easy for a process that
periodically modifies one file or another in a potentially large
tree to simply send notifications to a persistent rsyncer that
takes care of efficiently replicating those changes over to the
other side.
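
The interface described above can be approximated today with a small
wrapper, sketched here under some loud assumptions: the names
sync_command and serve are invented, dst must be an absolute local
path or a remote spec because rsync is run from inside the src dir,
and one rsync process is started per path. So this imitates the
stdin-driven interface only, not the efficiency of a truly persistent
rsyncer.

```python
import subprocess
import sys
from pathlib import PurePosixPath


def sync_command(rel_path, dst):
    """Return the rsync argv for one path (hypothetical helper).

    --relative makes rsync recreate the path's directories under dst.
    """
    path = PurePosixPath(rel_path)
    if (
        not path.parts
        or path.is_absolute()
        or ".." in path.parts
        or rel_path.startswith("-")
    ):
        raise ValueError(f"refusing suspicious path: {rel_path!r}")
    return ["rsync", "-a", "--relative", str(path), dst]


def serve(src, dst, stream=sys.stdin, run=subprocess.run):
    """Read one path per line from stream and sync each until EOF."""
    for line in stream:
        rel_path = line.rstrip("\n")
        if not rel_path:
            continue
        try:
            cmd = sync_command(rel_path, dst)
        except ValueError as err:
            print(err, file=sys.stderr)
            continue
        # Run from src so the relative path resolves there.
        run(cmd, cwd=src, check=False)
```

A producer process would then just write changed paths, one per line,
to the wrapper's stdin.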

I guess the fact that this isn't here now makes a case that it's
not widely needed, so it probably makes sense to keep it in mind
as a feature that can be easily done after the Big Rewrite, when
rsync mutates into a beautiful clean reusable library with a nice
scripting language.

-Bennett
