Is there any chance this could be added to the distribution? It sounds really nifty.

Another suggestion, unless I have misread the following: would it be useful
to have a command option in rsync to generate the file list by doing the
"find" and outputting into a standard format? (This would make it less
OS-specific and less kludgy.)

Cheers,

Lachlan.

At 16:06 19/11/01 -0500, you wrote:
>I have attached a patch that adds 4 options to rsync that have helped
>me to speed up my mirroring. I hope this is useful to someone else,
>but I fear that my relative inexperience with rsync has caused me to
>miss a way to do what I want without having to patch the code. So please
>let me know if I'm all wet.
>
>Here's my story: I have a large filesystem (around 20 gigabytes of data)
>that I'm mirroring over a T1 link to a backup site. Each night,
>about 600 megabytes of data needs to be transferred to the backup site.
>Much of this data has been appended to the end of various existing files,
>so a tool like rsync that sends partial updates instead of the whole
>file is appropriate.
>
>Normally, one could just use rsync with the --recursive and --delete features
>to do this. However, this takes a lot more time than necessary, basically
>because rsync spends a lot of time walking through the directory tree
>(which contains over 300,000 files).
>
>One can speed this up by caching a listing of the directory tree. I maintain
>an additional state file at the backup site that contains a listing
>of the state of the tree after the last backup operation. This is essentially
>equivalent to saving the output of "find . -ls" in a file.
>
>Then, the next night, one generates the updated directory tree for the source
>file system and does a diff with the directory listing on the backup file
>system to find out what has changed. This seems to be much faster than
>using rsync's recursive and delete features.
>
>I have my own script and programs to delete any files that have been removed,
>and then I just need to update the files that have been added or changed.
>One could use cpio for this, but it's too slow when only partial files
>have changed.
>
>So I added the following options to rsync:
>
>     --source-list       SRC arg will be a (local) file name containing a
>                         list of files, or - to read file names from stdin
>     --null              used with --source-list to indicate that the file
>                         names will be separated by null (zero) bytes instead
>                         of linefeed characters; useful with gfind -print0
>     --send-dirs         send directory entries even though not in
>                         recursive mode
>     --no-implicit-dirs  do not send implicit directories (parents of the
>                         file being sent)
>
>The --source-list option allows me to supply an explicit list of filenames
>to transport without using the --recursive feature and without playing
>around with include and exclude files. I'm not really clear on whether
>the include and exclude files could have gotten me the same place, but it
>seems to me that they work hand-in-hand with the --recursive feature that
>I don't want to use.
>
>The --null flag allows me to handle files with embedded linefeeds. This
>is in the style of gnu find's -print0 operator.
>
>The --send-dirs overcomes a problem where rsync refuses to send directories
>unless it's in recursive mode. One needs this to make sure that even
>empty directories get mirrored.
>
>And the --no-implicit-dirs option turns off the default behavior in which
>all the parent directories of a file are transmitted before sending the
>file. That default behavior is very inefficient in my scenario where I
>am taking the responsibility for sending those directories myself.
>
>So, the patch is attached. If you think it's an abomination, please let
>me know what the better solution is. If you would like some elaboration
>on how this stuff really works, please let me know.
>
>Cheers,
>Andy
>
>Attachment Converted: C:\Eudora\Attach\rsync-2.4.6-srclist.patch
>

-----------------------
Lachlan M. D. Cranswick
Collaborative Computational Project No 14 (CCP14)
    for Single Crystal and Powder Diffraction
Birkbeck University of London and Daresbury Laboratory
Postal Address: CCP14 - School of Crystallography,
                Birkbeck College,
                Malet Street, Bloomsbury,
                WC1E 7HX, London, UK
Tel: (+44) 020 7631 6849   Fax: (+44) 020 7631 6803
E-mail: [EMAIL PROTECTED]
WWW: http://www.ccp14.ac.uk/
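
P.S. To make the "generate the file list in a standard format" idea a bit more concrete, here is a rough Python sketch of the cached-listing approach from the quoted message: snapshot the tree, diff it against the previous snapshot, and emit the changed names separated by null bytes. This is only an illustration under my own assumptions. The function names (snapshot, diff_listings, to_null_separated) and the tuple layout are hypothetical, and none of this is taken from Andy's patch or scripts.

```python
import os


def snapshot(root):
    """Walk root once and record every entry, keyed by path relative to root.

    Files are stored as ("f", size, mtime); directories as ("d", 0, 0) so
    that empty directories are still listed but never look "changed".
    """
    listing = {}
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames:
            full = os.path.join(dirpath, name)
            listing[os.path.relpath(full, root)] = ("d", 0, 0)
        for name in filenames:
            full = os.path.join(dirpath, name)
            st = os.lstat(full)
            listing[os.path.relpath(full, root)] = ("f", st.st_size, int(st.st_mtime))
    return listing


def diff_listings(old, new):
    """Return (changed, removed): paths that are new or differ, and paths that vanished."""
    changed = sorted(p for p, meta in new.items() if old.get(p) != meta)
    removed = sorted(p for p in old if p not in new)
    return changed, removed


def to_null_separated(paths):
    """Encode paths with a trailing null byte each, so embedded linefeeds are safe."""
    return b"".join(p.encode() + b"\0" for p in paths)
```

The "changed" list is what one would hand to a transfer tool that accepts a null-separated file list; the "removed" list would go to whatever script handles deletions at the backup site.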