For the S3 side of the comparison, Amazon stores an MD5SUM value for each
file, and returns it in the directory listing as the ETag.
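
As a rough illustration of what that comparison amounts to, here is a
minimal Python sketch.  It is not s3cmd's actual code; the function names
local_md5 and matches_etag are made up for this example, and it assumes the
ETag arrives wrapped in double quotes, as it does in S3 listings.

```python
import hashlib


def local_md5(path, chunk_size=1024 * 1024):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large files are never held in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_etag(path, etag):
    """Compare a local file against an ETag string from a bucket listing."""
    # Assumption: the ETag is quoted, so strip the quotes before comparing.
    return local_md5(path) == etag.strip('"').lower()
```

Note that the local half of this is the expensive part: every byte of
every file has to be read to compute the digest.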

For the local side of the comparison, if you are using code from the GitHub
master branch (1.5.0-alpha3+), you can specify --cache-file=(some local
file), where MD5SUM values are stored the first time each file is synced to
S3, so the files don't have to be read again on future runs to compare.  If
a file's inode information (modification time, size) changes, its MD5SUM is
recalculated on the next sync.  This greatly speeds up the process, since
only the cache file needs to be read rather than every local file, while
still preserving the MD5SUM comparison.  You can also skip the MD5SUM
comparison altogether with --no-check-md5, if you know that
already-uploaded files haven't changed.
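
The caching idea can be sketched roughly as follows.  This is only an
illustration under stated assumptions, not s3cmd's implementation or its
cache file format: the Md5Cache class, the JSON storage, and the choice of
(inode number, mtime, size) as the change signature are all hypothetical.

```python
import hashlib
import json
import os


class Md5Cache:
    """Hypothetical MD5 cache, invalidated when a file's stat data changes."""

    def __init__(self, cache_path):
        self.cache_path = cache_path
        self.entries = {}
        if os.path.exists(cache_path):
            with open(cache_path) as handle:
                self.entries = json.load(handle)

    def md5(self, path):
        st = os.stat(path)
        # Assumed signature: inode number, mtime (ns) and size.
        signature = [st.st_ino, st.st_mtime_ns, st.st_size]
        entry = self.entries.get(path)
        if entry and entry["signature"] == signature:
            return entry["md5"]  # cache hit: the file is not read at all
        digest = hashlib.md5()
        with open(path, "rb") as handle:
            for chunk in iter(lambda: handle.read(1024 * 1024), b""):
                digest.update(chunk)
        self.entries[path] = {"signature": signature,
                              "md5": digest.hexdigest()}
        return self.entries[path]["md5"]

    def save(self):
        with open(self.cache_path, "w") as handle:
            json.dump(self.entries, handle)
```

On a second run only the stat() call and the cache lookup happen for
unchanged files, which is where the time savings come from.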

Older versions of s3cmd don't have the --cache-file option.

There are some very new patches pending for the master branch that
correctly handle the S3-side MD5SUM value even for files uploaded as
multipart (for most s3cmd invocations, files > 5MB).  Without those
patches, as on older versions, no MD5SUM comparison is done for such
multipart files.
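
To show why multipart files are a special case, here is a small Python
sketch of how a tool might decide whether an ETag is usable as an MD5SUM.
It rests on an assumption about observed S3 behavior (multipart ETags
carry a "-N" suffix and are not a plain MD5 of the content), and the
function name etag_is_md5 is made up; it is not taken from s3cmd.

```python
import re


def etag_is_md5(etag):
    """Return True if the ETag looks like a plain 32-hex-digit MD5 digest."""
    # Assumption: multipart ETags look like "<32 hex digits>-<part count>",
    # so anything that is not exactly 32 hex digits is treated as unusable.
    value = etag.strip('"').lower()
    return re.fullmatch(r"[0-9a-f]{32}", value) is not None
```

A sync tool could fall back to some other check for any object where this
returns False, rather than comparing against a value that will never match.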

Thanks,
Matt


On Sun, Aug 18, 2013 at 2:30 PM, Phill Coxon <phillco...@gmail.com> wrote:

> Hi there.
>
> I've been using s3cmd for years but have only just started using s3cmd
> sync to keep a 5 GB folder with 19,000 images synced with S3.
>
> Would someone clarify how the sync option works?
>
> On each run does it calculate a md5 of every local file and then pull
> the corresponding md5 from the S3 metadata to compare?
>
> Or does it store the local md5s somewhere and only recalculates when a
> local file changes?
>
> The reason I ask is that running the sync can take a long time (40+
> minutes) before any changes are listed and start to sync.  The hosting
> company my client is with charges horrendously for international
> bandwidth (we're based in New Zealand) so I want to make sure that doing
> a sync only uses a small amount of bandwidth (pulling md5s etc) to make
> the comparison before uploading any changes.
>
> Thanks!
>
>
>
>
> ------------------------------------------------------------------------------
> Get 100% visibility into Java/.NET code with AppDynamics Lite!
> It's a free troubleshooting tool designed for production.
> Get down to code-level detail for bottlenecks, with <2% overhead.
> Download for free and get started troubleshooting in minutes.
> http://pubads.g.doubleclick.net/gampad/clk?id=48897031&iu=/4140/ostg.clktrk
> _______________________________________________
> S3tools-general mailing list
> S3tools-general@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/s3tools-general
>
>