OK, so it appears the 1000-item limit in the REST API's list facility isn't
a problem for me if I use s3cmd, since s3cmd isn't restricted by it.
Regarding multipart upload, I won't need more than 1000 parts, as I will
leave the 15MB chunk size alone and I don't have any files anywhere near as
big as 15GB.
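(For the record, the arithmetic behind that 15GB figure, as a quick Python
sketch; the 15MB figure is s3cmd's default --multipart-chunk-size-mb:)

    # largest object that stays within 1000 parts at s3cmd's default
    # 15MB multipart chunk size, so the part listing never paginates
    chunk_mb = 15
    max_parts = 1000
    print(chunk_mb * max_parts, "MB")  # 15000 MB, i.e. roughly 15GB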
Thanks for the responses. There's one remaining question about MD5s which
you might be able to answer; please see my other thread (the one with the
missing subject line).
Russell
On 22 March 2015 at 23:45, Matt Domsch <m...@domsch.com> wrote:
> The 1000-per-request limit doesn't limit the number of objects in a bucket
> or their names. It exists solely so the REST API doesn't get bogged down
> trying to return, say, a million objects in a single list response. When a
> bucket has more than 1000 files, S3 returns the first 1000 along with an
> <IsTruncated/> tag and a marker indicating where a subsequent call should
> resume; you just have to issue that second call. s3cmd does this
> automatically for 'ls' and most list operations. s3cmd is, however, broken
> when listing the parts of a multipart upload with more than 1000 parts for
> a given object: it doesn't issue the subsequent calls.
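> To make that concrete, the client side of that loop looks roughly like
> this (a Python sketch using boto3 rather than s3cmd's internals; the
> bucket name is just the example one from elsewhere in this thread):
>
>     import boto3
>
>     s3 = boto3.client("s3")
>     keys, marker = [], ""
>     while True:
>         kwargs = {"Bucket": "xyztestbucket"}
>         if marker:
>             kwargs["Marker"] = marker      # resume after the last key seen
>         resp = s3.list_objects(**kwargs)   # server caps each page at 1000
>         contents = resp.get("Contents", [])
>         keys.extend(obj["Key"] for obj in contents)
>         if not resp.get("IsTruncated"):    # no more pages
>             break
>         marker = contents[-1]["Key"]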
>
>
> On Sun, Mar 22, 2015 at 6:39 PM, Russell Gadd <rust...@gmail.com> wrote:
>
>> Thanks Matt. I suspected --files-from wouldn't work with ls.
>>
>> But your mention of a limit of 1000 on list operations worries me, as my
>> plan was to put about 25000 files into one folder, each named as just its
>> 32-hex-character MD5 with no subfolder hierarchy. It seemed I would then
>> not be able to get a list of all the objects. I was totally unaware of
>> this arbitrary limit.
>>
>> (I'm thinking aloud now.)
>> One method would perhaps be to issue 256 requests, one for each
>> two-character prefix from 00 to ff. I have about 30000 files, so that
>> would average about 120 files per prefix, although an outlier prefix
>> could still have more than 1000 files. In that case would I get just the
>> first 1000 responses, with no obvious way to get the remaining files? If
>> they come back in alphabetical order, though, perhaps I'd only have to
>> issue a few more requests, say for files beginning xyd to xyf, assuming I
>> already had everything from xy0 to xyc. Sounds like an exercise in
>> recursive algorithms; done right, it should take far fewer than 256
>> requests (a sketch of what I mean follows).
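>> Something like this is the recursive shape I have in mind (a rough Python
>> sketch using boto3, assuming all keys are lowercase hex MD5 names; this
>> is my own thinking, not how s3cmd works):
>>
>>     import boto3
>>
>>     s3 = boto3.client("s3")
>>
>>     def list_md5_keys(bucket, prefix=""):
>>         # one capped list request for this prefix
>>         resp = s3.list_objects(Bucket=bucket, Prefix=prefix, MaxKeys=1000)
>>         contents = resp.get("Contents", [])
>>         if not resp.get("IsTruncated"):
>>             return [obj["Key"] for obj in contents]
>>         # more than 1000 keys under this prefix: refine by one hex digit
>>         keys = []
>>         for c in "0123456789abcdef":
>>             keys.extend(list_md5_keys(bucket, prefix + c))
>>         return keys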
>>
>> Maybe I need to rethink, but I won't be looking for a complete list very
>> often, so perhaps it will be OK.
>>
>> Russell
>>
>> On 22 March 2015 at 21:22, Matt Domsch <m...@domsch.com> wrote:
>>
>>> [ls] doesn't honor the --files-from option. [ls] simply asks S3 for all
>>> the files in a bucket, possibly recursively, starting from a given prefix.
>>>
>>> Jeremy is correct: whether a request returns 0 bytes or a list of 1000
>>> objects, it's counted as one request. Most operations have a limit on
>>> the number of items they can operate on (e.g. list bucket and
>>> multi-object delete are each limited to 1000 objects per
>>> operation/request). If, though, given a list of 1000 objects, we then
>>> issue a metadata HEAD request for each object, you'll have made 1001
>>> requests. (s3cmd no longer fetches metadata for every object, only when
>>> it needs it.)
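>>> To put numbers on it (a rough count for, say, a 30000-object bucket; the
>>> figures are just for illustration):
>>>
>>>     # listing 30000 objects takes ceil(30000 / 1000) = 30 list requests;
>>>     # a HEAD per object would add another 30000 on top of that
>>>     objects = 30000
>>>     list_reqs = (objects + 999) // 1000    # 30
>>>     print(list_reqs, list_reqs + objects)  # 30 vs 30030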
>>>
>>> On Sun, Mar 22, 2015 at 3:58 PM, Russell Gadd <rust...@gmail.com> wrote:
>>>
>>>> I'm wondering if someone could help explain:
>>>>
>>>> 1. Is --files-from an available option for the ls command? I've
>>>> experimented without success (example: s3cmd -r
>>>> --files-from=testlist.txt ls s3://xyztestbucket). Probably not, but I
>>>> wanted to check; the documentation isn't clear, though I suspect most
>>>> people have no use for it. So please either confirm that --files-from
>>>> doesn't apply to ls, or tell me how to specify the command and the list
>>>> of files (i.e. is s3://bucket-name required at the front of each
>>>> file?). In my proposed usage it would be useful for verifying that
>>>> specific files exist; if it's not available, I will have to issue one
>>>> command per file, unless I list the whole lot, since I'm not using
>>>> folders.
>>>>
>>>> 2. I'm not sure what "requests" means in the pricing of GET or LIST
>>>> requests, which for EU-West is $0.004 per 1000 requests.
>>>> Does this mean $0.004 for a single request that returns 1000 file
>>>> names, or for literally 1000 list requests, each of which could return
>>>> any number of filenames? It's probably small beer for my usage, but it
>>>> would be nice to know.
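>>>> (Working it through on the per-request reading: listing, say, 30000
>>>> files at 1000 names per request is about 30 requests, i.e.
>>>> 30 × $0.004 / 1000 = $0.00012, so small beer indeed.)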
>>>>
>>>> Russell