I've filed several pull requests, with no response, so I figured I'd
advertise the features here for comment.  These are capabilities
needed by Fedora Infrastructure to mirror content out to the (now 4)
mirrors hosted in S3 for in-region EC2 consumers.

These are all available in my GitHub repo:
https://github.com/mdomsch/s3cmd

1) Apply excludes during os.walk() so you don't beat up on the local
   file system for directories and files you know you'll never
   transfer. [sync]

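2) Add --delete-after [sync], which defers deleting remote files until
   after the uploads finish.  S3 is effectively unlimited in size, so
   there is no need to free space first, and once hardlinks are
   handled (item 4), deleting files prematurely will hurt our ability
   to simply COPY them from one place to another.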

3) Add --delay-updates [sync].  This transfers updated files after new
   files (useful for the yum repomd.xml file, which you want to change
   only after the new content it references is in place).

4) Recognize hardlinks and use remote COPY commands instead.  This
   significantly reduces the number of files transferred in trees that
   use hardlinks extensively, such as a Fedora yum repository. [sync]
   It also helps with files that simply move from one directory to
   another but don't change in the process (e.g. from Fedora
   updates-testing to updates-released).

5) Sync to multiple S3 buckets in parallel.  After the initial local
   tree walk to figure out what is available to transfer, fork into
   separate processes for each remote S3 bucket, allowing transfers in
   parallel. [sync]

6) Cache local MD5 values on disk.  This avoids needing to read each
   local file to calculate its MD5 value again to compare against what
   S3 reports.
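
As a rough illustration of item 1 (a sketch only, not the actual patch),
excluded directories can be pruned while os.walk() runs so they are never
descended into.  The function name walk_with_excludes and the use of
fnmatch-style patterns are assumptions made for this example.

```python
import fnmatch
import os


def walk_with_excludes(root, exclude_patterns):
    """Yield file paths relative to root, skipping excluded entries.

    exclude_patterns are fnmatch-style globs matched against the
    relative path (an assumption made for this sketch).
    """
    def excluded(rel_path):
        return any(fnmatch.fnmatch(rel_path, p) for p in exclude_patterns)

    for dirpath, dirnames, filenames in os.walk(root):
        rel_dir = os.path.relpath(dirpath, root)
        if rel_dir == ".":
            rel_dir = ""
        # Assigning to dirnames[:] edits the list in place, which is what
        # tells os.walk() not to descend into the removed directories.
        dirnames[:] = sorted(
            d for d in dirnames
            if not excluded(os.path.join(rel_dir, d) + "/")
        )
        for name in sorted(filenames):
            rel = os.path.join(rel_dir, name)
            if not excluded(rel):
                yield rel
```

The point of the in-place prune is that an excluded subtree costs no
readdir or stat calls at all, rather than being walked and filtered later.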

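Item 6 could be sketched along the following lines.  This is a minimal
example under stated assumptions, not the implementation in the branch:
the class name Md5Cache, the JSON file format, and the choice to treat a
cached value as valid while the file's mtime and size are unchanged are
all hypothetical.

```python
import hashlib
import json
import os


class Md5Cache:
    """On-disk cache of local MD5 sums, validated by (mtime, size)."""

    def __init__(self, cache_file):
        self.cache_file = cache_file
        self.entries = {}
        if os.path.exists(cache_file):
            with open(cache_file) as fh:
                self.entries = json.load(fh)

    def md5(self, path):
        st = os.stat(path)
        stamp = [st.st_mtime_ns, st.st_size]
        entry = self.entries.get(path)
        if entry and entry["stamp"] == stamp:
            # Cache hit: the file contents are not read at all.
            return entry["md5"]
        # Cache miss or stale entry: hash the file in 1 MiB chunks.
        digest = hashlib.md5()
        with open(path, "rb") as fh:
            for chunk in iter(lambda: fh.read(1 << 20), b""):
                digest.update(chunk)
        self.entries[path] = {"stamp": stamp, "md5": digest.hexdigest()}
        return self.entries[path]["md5"]

    def save(self):
        with open(self.cache_file, "w") as fh:
            json.dump(self.entries, fh)
```

With something like this, a sync run over an unchanged tree only needs
one stat() per file, and save() is called once at the end of the walk.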

I'd be happy to post a branch that is all of these combined, as that's
what Fedora Infrastructure is using now.


Thanks,
Matt


-- 
Matt Domsch
Technology Strategist
Dell | Office of the CTO

_______________________________________________
S3tools-general mailing list
S3tools-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/s3tools-general