[S3tools-general] Out of memory: Kill process s3cmd - v1.5.0-beta1

2014-03-06 Thread WagnerOne
Hi, I was recently charged with moving a lot of data (TBs) into s3 and discovered the great tool that is s3cmd. It's working well and I like the familiar rsync-like interactions. I'm attempting to use s3cmd to copy a directory with tons of small files amounting to about 700GB to s3. During my

Re: [S3tools-general] Out of memory: Kill process s3cmd - v1.5.0-beta1

2014-03-10 Thread WagnerOne
es. One option would be > to add in a sqlite on-disk or in-memory database for transient use in storing > and comparing the local and remote file lists, but that's a fairly heavy > undertaking and not one anyone has chosen to develop. > > Thanks, > Matt > > >

Re: [S3tools-general] Out of memory: Kill process s3cmd - v1.5.0-beta1

2014-03-10 Thread WagnerOne
self, given how python operates. One option would be > to add in a sqlite on-disk or in-memory database for transient use in storing > and comparing the local and remote file lists, but that's a fairly heavy > undertaking and not one anyone has chosen to develop. > > Thanks
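
For readers wondering what the sqlite idea quoted above might look like, here is a minimal sketch. It is not s3cmd code: the function name `diff_listings`, the table names, and the `(key, size, md5)` tuple layout are all hypothetical, chosen only to show how the list comparison could be pushed out of Python dictionaries and into a database.

```python
import sqlite3

def diff_listings(local_entries, remote_entries, db_path=":memory:"):
    """Compare two file listings inside SQLite instead of Python dicts.

    Each entry is a (key, size, md5) tuple. Hypothetical helper, not part
    of s3cmd. For on-disk use, db_path should point at a fresh temp file.
    Returns (keys_to_upload, keys_to_delete), both sorted by key.
    """
    conn = sqlite3.connect(db_path)
    try:
        conn.execute(
            "CREATE TABLE local_files (key TEXT PRIMARY KEY, size INTEGER, md5 TEXT)")
        conn.execute(
            "CREATE TABLE remote_files (key TEXT PRIMARY KEY, size INTEGER, md5 TEXT)")
        # executemany accepts any iterable, so listings can be streamed in
        conn.executemany("INSERT INTO local_files VALUES (?, ?, ?)", local_entries)
        conn.executemany("INSERT INTO remote_files VALUES (?, ?, ?)", remote_entries)

        # Upload anything missing remotely or differing in size or md5
        to_upload = [row[0] for row in conn.execute(
            "SELECT l.key FROM local_files l "
            "LEFT JOIN remote_files r ON l.key = r.key "
            "WHERE r.key IS NULL OR l.size != r.size OR l.md5 != r.md5 "
            "ORDER BY l.key")]

        # Delete candidates: remote keys with no local counterpart
        to_delete = [row[0] for row in conn.execute(
            "SELECT r.key FROM remote_files r "
            "LEFT JOIN local_files l ON r.key = l.key "
            "WHERE l.key IS NULL ORDER BY r.key")]
        return to_upload, to_delete
    finally:
        conn.close()
```

With an on-disk path the working set would live mostly in the database file rather than in the Python process, which is the point of the suggestion; whether that is fast enough for millions of keys would need measuring.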

[S3tools-general] empty directories not syncing - known issue

2014-03-11 Thread WagnerOne
Hi, I noticed empty directories on my source (not s3) side aren't making it to my target, s3 side. I searched the archives and saw this is a known issue. Is there a timeline for when this may be resolved? Also, what is the current behavior if I create the required empty directories in s3 manu

Re: [S3tools-general] Out of memory: Kill process s3cmd - v1.5.0-beta1

2014-03-11 Thread WagnerOne
Mar 10, 2014 at 6:07 PM, WagnerOne wrote: > I've identified the subdir in my content to be transferred with the huge file > count that I need to systematically transfer. > > Will --exclude allow me to sync everything but said directory, so I can then > work within that s

Re: [S3tools-general] empty directories not syncing - known issue

2014-03-12 Thread WagnerOne
t s3 handles names that end with / > differently than names that are the same without the /. Could you have the > entry above and also store data in > > bucket.s3.com/four > > ? Only by experimenting could you determine that. > > > J. Merrill > > -Original

Re: [S3tools-general] empty directories not syncing - known issue

2014-03-12 Thread WagnerOne
reate them on > upload and create them on download? Sure, but that would require changing > how S3 thinks about objects to include directories as well as files. That's > not a trivial undertaking, for what is really a corner case added by S3 > later. I'd be happy to re

[S3tools-general] how to disable "remote copy" feature

2014-03-12 Thread WagnerOne
While this feature is fantastic, I can't find a lot of detail on it in general. I wonder how to disable it? During initial uploads at least, our DirectConnect link seems to be faster in copying the files themselves than s3cmd is at telling S3 to "remote copy" objects. Would that simply be usin

Re: [S3tools-general] how to disable "remote copy" feature

2014-04-03 Thread WagnerOne
so means --no-check-md5 won't, as you might expect, disable remote > copying. As no one has asked to be able to disable remote copying, I never > coded for it. > > I'll think about this a bit. There's probably a cleaner way to solve both > problems. > >

Re: [S3tools-general] how to disable "remote copy" feature

2014-04-03 Thread WagnerOne
appreciate the "INFO: " addition! Mike On Apr 3, 2014, at 1:14 PM, WagnerOne wrote: > Hi Matt, > > Sorry for the delay in testing and responding. I appreciate the effort you > put in this to date. > > When I run a s3cmd sync like so: > > s3cmd sync --no

[S3tools-general] IAM role vs user key/secret key - possible bug

2014-04-03 Thread WagnerOne
Hi, I believe I may have uncovered a bug regarding using an IAM role vs. a user key/secret key combination. The instance I'm using s3cmd on has an IAM role allowing it write access to an s3 bucket. At some point after having that IAM role assigned, it was deprecated in favor of using a user

[S3tools-general] bug encountered possibly related to "empty object" in S3

2014-04-10 Thread WagnerOne
Hi, Encountered what appears to be a bug today. I am syncing a local directory and an s3 prefix that I have not been in control of (unlike the many other s3cmd syncs I have done successfully). When trying to sync existing local directories with prefixes in this bucket, I am encountering 2 thi

[S3tools-general] An unexpected error has occurred - while syncing bucket to bucket

2014-04-10 Thread WagnerOne
Hi, When attempting to s3cmd sync 2 buckets today, I encountered this "An unexpected error has occurred." output from s3cmd itself. I am using the latest available s3cmd master branch code via a git clone (updated today). I tried this several times within the 2 same buckets on different prefi

Re: [S3tools-general] bug encountered possibly related to "empty object" in S3

2014-04-10 Thread WagnerOne
Running this same sync in debug, I see additional detail following the "INFO: Summary: ..." line. I'm not sure what I should anonymize in that output, so I'd prefer to share it with a dev only. I can produce that on request. Mike On Apr 10, 2014, at 3:21 PM, Wag

[S3tools-general] s3cmd --dry-run ctrl-c hangs

2014-04-10 Thread WagnerOne
Mostly just an observation to report... occasionally when I stop s3cmd with ctrl-c when also running under --dry-run, I'll see a "Cleaning up. Please wait..." "Completed parts..." And then it hangs indefinitely until I kill the process. This seems to happen more often with dry-run than it

Re: [S3tools-general] s3cmd --dry-run ctrl-c hangs

2014-04-10 Thread WagnerOne
Please ignore this. I was observing aws cli - not s3cmd when I saw this. Mike On Apr 10, 2014, at 5:11 PM, WagnerOne wrote: > Mostly just an observation to report... occasionally when I stop s3cmd with > ctrl-c when also running under --dry-run, I'll see a > > "Cl

Re: [S3tools-general] bug encountered possibly related to "empty object" in S3

2014-04-11 Thread WagnerOne
ome extra test coverage. > > Thanks, > Matt > > > On Thu, Apr 10, 2014 at 4:48 PM, WagnerOne wrote: > Running this same sync in debug, I see additional detail following the "INFO: > Summary: ..." line. > > I'm not sure what I should anonymize in that ou

[S3tools-general] question regarding --no-check-md5

2014-04-13 Thread WagnerOne
Hi, The man page states the following:
--no-check-md5 Do not check MD5 sums when comparing files for [sync]. Only size will be compared. May significantly speed up transfer but may also miss some changed files.
When this says "only size will be compared", I'm taking it to mean only "size
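
To make the man page wording concrete, here is a rough sketch of the two comparison modes it describes. This is an illustration only, not s3cmd's implementation: `needs_upload`, `file_md5`, and the `check_md5` parameter are hypothetical names, and the remote size and MD5 are assumed to have been fetched already.

```python
import hashlib
import os

def file_md5(path, chunk_size=1024 * 1024):
    """Hex MD5 of a local file, read in chunks to keep memory use flat."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def needs_upload(local_path, remote_size, remote_md5, check_md5=True):
    """Decide whether a local file differs from its remote counterpart.

    Hypothetical helper. With check_md5=False only sizes are compared,
    so a same-size content change goes unnoticed.
    """
    if os.path.getsize(local_path) != remote_size:
        return True
    if not check_md5:
        return False
    return file_md5(local_path) != remote_md5
```

The size-only path never opens the file, which is where the speed-up would come from, and it is also why a change that keeps the byte count identical is missed.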

Re: [S3tools-general] question regarding --no-check-md5

2014-04-14 Thread WagnerOne
y read the local file once > and then read its md5 out of the cache until it changes, so the HEAD isn't > cheaper in general. > > To detect a change to a file whose size hasn't changed, but its content has, > we have to do the HEAD call, and calculate the MD5 of the

[S3tools-general] aws and s3cmd - huge object count syncs, mod dates

2014-04-14 Thread WagnerOne
I've struggled with some huge object count transfers and went back and forth between aws s3 cli and s3cmd. aws s3 cli seems to edge s3cmd out on speed and RAM consumption when doing huge object count transfers. However, aws s3 cli seems to choke on large object counts and s3cmd offers so much m

Re: [S3tools-general] An unexpected error has occurred - while syncing bucket to bucket

2014-04-21 Thread WagnerOne
n reliably reproduce, > please send me results with --debug enabled, privately. > > Thanks, > Matt > > > On Thu, Apr 10, 2014 at 3:30 PM, WagnerOne wrote: > Hi, > > When attempting to s3cmd sync 2 buckets today, I encountered this "An > unexpected error ha

[S3tools-general] s3cmd object count to sync error

2014-04-21 Thread WagnerOne
example s3cmd for the below
/usr/bin/s3cmd -c /home/ec2-user/.s3cfg sync --delete-removed --no-preserve --verbose --progress /localdirectory/ s3://mybucket/directory/
syncs were running build
commit d52d5edcc916512e979917f04abcea19d3a25af7
Date: Sat Apr 12 20:40:16 2014 -0500
s3cmd, when sy

Re: [S3tools-general] question regarding --no-check-md5

2014-04-23 Thread WagnerOne
Thank you, Matt, for inspecting that and for the continued explanation. If I have a local and S3 file pair and the local file is modified such that size is not modified, but its date is, a sync with aws cli would copy that modified local file over the existing s3 counterpart (due to the source f

Re: [S3tools-general] s3cmd object count to sync error

2014-04-23 Thread WagnerOne
> -n, total_size = _upload(update_list, n, local_count, total_size)
> +n, total_size = _upload(local_list, 0, upload_count, total_size)
> +n, total_size = _upload(update_list, n, upload_count, total_size)
> n_copies, saved_bytes, failed_copy_files = re

Re: [S3tools-general] question regarding --no-check-md5

2014-04-23 Thread WagnerOne
t follow > suit and compare local mtime with S3 , and upload if mtime is > newer. > > it's been that way since Michal first wrote the initial sync code back in > September 2007. Doesn't mean it has to stay that way. > > > On Wed, Apr 23, 2014 at 8:29 AM, Wagner

[S3tools-general] multipart Etag during sync

2014-05-18 Thread WagnerOne
Hi, During some of my initial S3 uploads using s3cmd, I didn't realize S3 behaved differently in terms of the hash it stores for multipart uploads compared to normal uploads. I since bumped my s3cfg setting for multipart threshold well beyond any of my file sizes so multipart uploading doesn't
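
For context on why the stored hash differs: the multipart ETag is widely observed (though, as far as I know, not formally guaranteed by AWS) to be the MD5 of the concatenated binary MD5 digests of each part, followed by a dash and the part count. The sketch below reproduces that convention locally. The function name `multipart_etag` is hypothetical, and `part_size` has to match whatever chunk size was used for the original upload, otherwise the result will not line up.

```python
import hashlib

def multipart_etag(path, part_size):
    """Compute the commonly observed S3 multipart ETag for a local file.

    Assumption: ETag = md5(md5(part1) + md5(part2) + ...) + "-" + part count,
    using the binary digest of each part. This does not hold for objects
    encrypted with KMS keys, and part_size must equal the upload chunk size.
    """
    part_digests = []
    with open(path, "rb") as handle:
        for part in iter(lambda: handle.read(part_size), b""):
            part_digests.append(hashlib.md5(part).digest())
    combined = hashlib.md5(b"".join(part_digests)).hexdigest()
    return "%s-%d" % (combined, len(part_digests))
```

An ETag containing a dash is a quick sign that the object was uploaded in parts, so a plain MD5 comparison against it cannot match.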

Re: [S3tools-general] question regarding --no-check-md5

2014-05-19 Thread WagnerOne
change to a file whose size hasn't changed, but its content has, > we have to do the HEAD call, and calculate the MD5 of the local file (and use > --cache-file to record that for posterity), and compare. > > > > > On Sun, Apr 13, 2014 at 5:08 PM, WagnerOne wrote: >

[S3tools-general] please verify my understanding of how s3cmd mime type assignment works

2014-05-21 Thread WagnerOne
Hello, I am attempting to verify my understanding of how mime type assignment works in s3cmd. I'm hoping this post will make this detail easier to find for others seeking it too. I looked at the S3.py code and surmised that the "python-magic" module will attempt to be used if present. If it is
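
As a rough picture of the "use python-magic if present" pattern described above, here is a minimal sketch. It is not a copy of S3.py: the function name `guess_mime_type`, the `use_magic` switch, the fallback to the standard library's `mimetypes` module, and the `binary/octet-stream` default are all assumptions made for illustration.

```python
import mimetypes

try:
    import magic  # python-magic, optional
except ImportError:
    magic = None

def guess_mime_type(path, default="binary/octet-stream", use_magic=True):
    """Guess a MIME type, preferring python-magic when it is importable.

    Hypothetical helper. The extension-based fallback and the default
    value are assumptions, not confirmed s3cmd behaviour.
    """
    # Only the python-magic flavour exposing from_file() is handled here
    if use_magic and magic is not None and hasattr(magic, "from_file"):
        try:
            return magic.from_file(path, mime=True)
        except Exception:
            pass  # fall through to the extension-based guess
    guessed, _encoding = mimetypes.guess_type(path)
    return guessed or default
```

Content sniffing and extension lookup can disagree for the same file, so which path is taken matters when checking the Content-Type that ends up on the object.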