On Wed, Feb 12, 2003 at 01:13:45AM -0600, Adam Herbert wrote:
> I need some suggestions. Here's my setup:
>
> 800GB of Data
> 14,000,000+ Files
> No changes, just additions
> Files range in size from 30k - 190k
>
> The files are laid out in a tree fashion like:
>
> BASE
>   \-Directory ( Numerical Directory name from 0 - 1023 )
>     \-Directory ( Numerical Directory name from 0 - 1023 )
>       \- Files ( Up to 1024 files in each directory )
>
> This allows for a maximum of about a billion files. I need to limit the
> amount of memory usage and processor / io time it takes to build the
> list of files to transmit. Is there a better solution than rsync? Are
> there patches that would help rsync in my particular situation?
Rsync's real advantage is when files change. In this case that is moot.
Certainly using rsync on the whole thing at once will probably use more
memory than you want. You could loop through the second-level directories
with rsync (a rough sketch follows below).

My inclination here would be to roll your own. Something as simple as

    touch $newstamp
    cd $BASE
    find . -newer $laststamp | cpio -oH crc | ssh $dest "cd $BASE; cpio -idum"
    mv $newstamp $laststamp

may be sufficient. Building the file list by running comm -23 on the sorted
outputs of "find . -type f -print" on source and destination may be more
reliable (also sketched below).

For that matter, it might be worthwhile to build the infrastructure to
replicate the files at creation time. The structure you describe suggests
the files are created by an automated process; build the replication into
that.

-- 
________________________________________________________________
    J.W. Schultz            Pegasystems Technologies
    email address:          [EMAIL PROTECTED]

        Remember Cernan and Schmitt
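For illustration, a minimal sketch of the per-directory rsync loop described
above. The paths, host name, and rsync options are assumptions, not part of
the original post; the only thing taken from it is the layout of 0-1023
numeric directories under a base directory.

    #!/bin/sh
    # Hypothetical locations; adjust to the real tree and destination.
    BASE=/data/base
    DEST=remotehost:/data/base

    # One rsync per top-level directory keeps each file list to at most
    # ~1024*1024 entries instead of 14 million.
    for d in $(seq 0 1023); do
        [ -d "$BASE/$d" ] || continue
        rsync -a "$BASE/$d/" "$DEST/$d/"
    done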
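And a minimal sketch of the comm -23 file-list approach, again with assumed
names ($BASE, $DESTHOST, the /tmp list files); it pairs the comm-generated
list with the same cpio-over-ssh transfer shown in the reply.

    #!/bin/sh
    BASE=/data/base
    DESTHOST=remotehost

    cd "$BASE" || exit 1

    # Sorted file lists on the source and on the destination.
    find . -type f -print | sort > /tmp/src.list
    ssh "$DESTHOST" "cd $BASE && find . -type f -print | sort" > /tmp/dst.list

    # comm -23 keeps lines only in the first file: files missing on the dest.
    comm -23 /tmp/src.list /tmp/dst.list > /tmp/new.list

    # Ship just those files, preserving the directory structure.
    cpio -oH crc < /tmp/new.list | ssh "$DESTHOST" "cd $BASE && cpio -idum"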