The main problem with efficiently listing many-sharded buckets is the
requirement to provide entries in sorted order. This means that each
HTTP request has to fetch ~1000 entries from every shard, combine them
into sorted order, and discard the leftovers. The next request to
continue the listing advances its position slightly, but still ends
up fetching many of the same entries from each shard. As the number of
shards increases, these shard listings overlap more and more, and
performance falls off.
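As a rough illustration (a toy Python sketch, not radosgw's actual implementation), each listing request behaves like a k-way merge that throws away most of what it fetched:

```python
import heapq
from itertools import islice

def list_page(shards, max_entries=1000):
    # Each shard's index is already sorted, so a globally sorted page
    # requires fetching up to max_entries from *every* shard...
    fetched = [shard[:max_entries] for shard in shards]
    merged = heapq.merge(*fetched)
    # ...but only the first max_entries of the merge are returned.
    # The rest is discarded and will be re-fetched by the next request.
    return list(islice(merged, max_entries))

# Toy data: 4 "shards", each holding a sorted slice of the keyspace.
shards = [[f"key-{i:05d}" for i in range(s, 4000, 4)] for s in range(4)]
print(list_page(shards, max_entries=5))
```

With N shards, each page of ~1000 results costs roughly N x 1000 index entries fetched, which is why the waste grows with the shard count.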
Eric Ivancich recently added S3 and Swift extensions for unordered
bucket listing in https://github.com/ceph/ceph/pull/21026 (for Mimic).
That allows radosgw to list each shard separately and avoid the step
that throws away extra entries. If your application can tolerate
unsorted listings, that could be a big help without having to resort to
indexless buckets.
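aws-cli has no switch for this, but with boto3 one can inject RGW's non-standard allow-unordered query parameter through a signing hook. A minimal sketch, where the endpoint and bucket names are placeholders and the hook-based injection is my own workaround rather than an official API (check the PR / Ceph docs for the exact parameter spelling your version supports):

```python
import boto3

# Placeholder endpoint and bucket; adjust for your RGW deployment.
s3 = boto3.client('s3', endpoint_url='http://rgw.example.com')

def allow_unordered(request, **kwargs):
    # Append RGW's non-standard query parameter before the request is
    # signed, so the signature stays valid.
    request.url += ('&' if '?' in request.url else '?') + 'allow-unordered=true'

# Only list-objects requests need the extension.
s3.meta.events.register('before-sign.s3.ListObjects', allow_unordered)

paginator = s3.get_paginator('list_objects')
for page in paginator.paginate(Bucket='bucketname'):
    for obj in page.get('Contents', []):
        print(obj['Key'])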
On 05/01/2018 11:09 AM, Robert Stanford wrote:
I second the indexless bucket suggestion. The downside is that
you can't use features like object expiration in that case.
On Tue, May 1, 2018 at 10:02 AM, David Turner <drakonst...@gmail.com> wrote:
Any time I'm using shared storage like S3 or CephFS/NFS/Gluster/etc.,
the absolute rule that I refuse to break is to never rely on a
directory listing to know where objects/files are. You should be
maintaining a database of some sort or a deterministic naming
scheme. The only time a full listing of a directory should be
required is if you feel like your tooling is orphaning files and
you want to clean them up. If I had someone with a bucket with 2B
objects, I would force them to use an indexless bucket.

That's me, though. I'm sure there are other ways to manage a bucket,
but it sounds awful.
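To make the "deterministic naming scheme" idea concrete, here is a minimal sketch (the key layout and prefix length are hypothetical choices, not anything RGW prescribes):

```python
import hashlib

def object_key(record_id: str) -> str:
    # Derive the key from the record's own ID: the same ID always maps
    # to the same key, so objects can be fetched directly, without any
    # bucket listing.
    digest = hashlib.sha256(record_id.encode("utf-8")).hexdigest()
    # A short hash prefix also spreads keys evenly across index shards.
    return f"{digest[:4]}/{record_id}"

print(object_key("invoice-2018-000123"))  # -> "<4 hex chars>/invoice-2018-000123"
```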
On Tue, May 1, 2018 at 10:10 AM Robert Stanford
<rstanford8...@gmail.com> wrote:
Listing will always take forever when using a high shard
number, AFAIK. That's the tradeoff for sharding. Are those
2B objects in one bucket? How's your read and write
performance compared to a bucket with a lower number
(thousands) of objects, with that shard number?
On Tue, May 1, 2018 at 7:59 AM, Katie Holly <8ld3j...@meo.ws> wrote:
One of our radosgw buckets has grown a lot in size;
`radosgw-admin bucket stats --bucket=$bucketname` reports a total of
2,110,269,538 objects with the bucket index sharded across
32768 shards. Listing the root context of the bucket with
`aws s3 ls s3://$bucketname` takes more than an hour, which is
the hard first-byte limit on our nginx reverse proxy,
and the aws-cli times out long before that limit
is hit.
The software we use supports sharding the data across
multiple S3 buckets, but before I go ahead and enable this:
has anyone ever had that many objects in a single RGW
bucket, and if so, how did you solve the problem of
RGW taking a long time to read the full index?
--
Best regards
Katie Holly
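For reference, the multi-bucket sharding Katie describes usually boils down to hashing each key to pick a bucket. A minimal sketch, with a hypothetical shard count and bucket naming convention:

```python
import hashlib

NUM_BUCKETS = 16  # hypothetical; pick a count that keeps each index manageable

def bucket_for(key: str) -> str:
    # Hash the key and map it to one of NUM_BUCKETS buckets. The mapping
    # is deterministic, so reads go straight to the right bucket, and no
    # single bucket index ever holds all the objects.
    shard = int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16) % NUM_BUCKETS
    return f"bucketname-{shard:02d}"

print(bucket_for("some/object/key"))
```

The catch with this kind of hash sharding is that the bucket count is hard to change later without rehashing every key, so it is worth over-provisioning the shard count up front.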
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com