Hi again! We have run some tests on the limits of storing very large numbers of buckets in Ceph through the RADOS Gateway.
Our test used a single user for which we removed the default max-buckets
limit. The test then continuously created containers, some empty and some
containing 10 objects of around 100 KB of random data each. (A simplified
sketch of the creation loop is included as a PS below my signature.)

With 3 parallel processes we saw fairly consistent times of about
500-700 ms per container. This stayed steady until we reached roughly
3 million containers, after which the time per insert rose sharply to
around 1600 ms and kept climbing. Due to some hiccups with network
equipment the tests were aborted a few times and then resumed without
deleting the containers created in previous runs, so the actual number
might be 2.8 or 3.2 million, but it is in that ballpark. We stopped the
test at this point.

Based on the advice given earlier (see the quoted mail below) that we
might be hitting a limit in some per-user data structure, we created
another user account, removed its max-buckets limit as well, and
restarted the benchmark with that one, _expecting_ the times to drop
back to the original 500-700 ms range. However, what we are seeing is
that the times stay at 1600 ms and above even for that fresh account.

Here is the output of `rados df`, reformatted to fit the email. The
clones, degraded, and unfound columns were 0 in all cases and have been
left out for clarity:

.rgw
=========================
KB:       1,966,932
objects:  9,094,552
rd:       195,747,645
rd KB:    153,585,472
wr:       30,191,844
wr KB:    10,751,065

.rgw.buckets
=========================
KB:       2,038,313,855
objects:  22,088,103
rd:       5,455,123
rd KB:    408,416,317
wr:       149,377,728
wr KB:    1,882,517,472

.rgw.buckets.index
=========================
KB:       0
objects:  5,374,376
rd:       267,996,778
rd KB:    262,626,106
wr:       107,142,891
wr KB:    0

.rgw.control
=========================
KB:       0
objects:  8
rd:       0
rd KB:    0
wr:       0
wr KB:    0

.rgw.gc
=========================
KB:       0
objects:  32
rd:       5,554,407
rd KB:    5,713,942
wr:       8,355,934
wr KB:    0

.rgw.root
=========================
KB:       1
objects:  3
rd:       524
rd KB:    346
wr:       3
wr KB:    3

We would very much like to understand what is going on here in order to
decide whether the RADOS Gateway is a viable basis for our production
system (where we expect counts similar to the benchmark), or whether we
need to look into using librados directly, which we would like to avoid
if possible. Any advice on which configuration parameters to check, or
which additional information we could provide for the analysis, would be
very welcome.

Cheers,
Daniel

--
Daniel Schneller
Mobile Development Lead

CenterDevice GmbH | Merscheider Straße 1 | 42699 Solingen
tel: +49 1754155711 | Deutschland
daniel.schnel...@centerdevice.com | www.centerdevice.com
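PS: In case it helps with the analysis, this is roughly what each of the
benchmark processes does. It is an illustrative sketch only, not the exact
script we run; the endpoint and credentials are placeholders, and it is
shown here using boto's S3 API:

import os
import time
import uuid

import boto
import boto.s3.connection

# Placeholders -- the real endpoint and keys are different.
conn = boto.connect_s3(
    aws_access_key_id='<access key>',
    aws_secret_access_key='<secret key>',
    host='rgw.example.com',
    is_secure=False,
    calling_format=boto.s3.connection.OrdinaryCallingFormat(),
)

OBJECTS_PER_CONTAINER = 10   # the real test also creates empty containers
OBJECT_SIZE = 100 * 1024     # ~100 KB of random data per object

while True:
    name = 'bench-%s' % uuid.uuid4()
    start = time.time()
    bucket = conn.create_bucket(name)
    for i in range(OBJECTS_PER_CONTAINER):
        key = bucket.new_key('obj-%d' % i)
        key.set_contents_from_string(os.urandom(OBJECT_SIZE))
    print('%s: %.0f ms' % (name, (time.time() - start) * 1000))

We run three instances of this loop in parallel and record the per-container
times printed at the end of each iteration.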
> On 10 Sep 2014, at 19:42, Gregory Farnum <g...@inktank.com> wrote:
>
> On Wednesday, September 10, 2014, Daniel Schneller
> <daniel.schnel...@centerdevice.com> wrote:
> On 09 Sep 2014, at 21:43, Gregory Farnum <g...@inktank.com> wrote:
>
>> Yehuda can talk about this with more expertise than I can, but I think
>> it should be basically fine. By creating so many buckets you're
>> decreasing the effectiveness of RGW's metadata caching, which means
>> the initial lookup in a particular bucket might take longer.
>
> Thanks for your thoughts. With “initial lookup in a particular bucket”
> do you mean accessing any of the objects in a bucket? If we directly
> access the object (not enumerating the buckets content), would that
> still be an issue?
> Just trying to understand the inner workings a bit better to make
> more educated guesses :)
>
> When doing an object lookup, the gateway combines the "bucket ID" with a
> mangled version of the object name to try and do a read out of RADOS. It
> first needs to get that bucket ID though -- it will cache the bucket
> name->ID mapping, but if you have a ton of buckets there could be enough
> entries to degrade the cache's effectiveness. (So, you're more likely to pay
> that extra disk access lookup.)
>
>> The big concern is that we do maintain a per-user list of all their
>> buckets — which is stored in a single RADOS object — so if you have an
>> extreme number of buckets that RADOS object could get pretty big and
>> become a bottleneck when creating/removing/listing the buckets. You
>
> Alright. Listing buckets is no problem, that we don’t do. Can you
> say what “pretty big” would be in terms of MB? How much space does a
> bucket record consume in there? Based on that I could run a few numbers.
>
> Uh, a kilobyte per bucket? You could look it up in the source (I'm on my
> phone) but I *believe* the bucket name is allowed to be larger than the rest
> combined...
> More particularly, though, if you've got a single user uploading documents,
> each creating a new bucket, then those bucket creates are going to serialize
> on this one object.
> -Greg
>
>> should run your own experiments to figure out what the limits are
>> there; perhaps you have an easy way of sharding up documents into
>> different users.
>
> Good advice. We can do that per distributor (an org unit in our
> software) to at least compartmentalize any potential locking issues
> in this area to that single entity. Still, there would be quite
> a lot of buckets/objects per distributor, so some more detail on
> the above items would be great.
>
> Thanks a lot!
>
> Daniel
>
> --
> Software Engineer #42 @ http://inktank.com | http://ceph.com
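PPS: Regarding the sharding suggestion quoted above: the rough idea we are
considering is to spread each distributor's buckets over a small, fixed pool
of gateway users instead of a single account, so that no single per-user
bucket list grows without bound. Illustrative sketch only; the user naming
and the number of users per distributor are made up:

import hashlib

USERS_PER_DISTRIBUTOR = 4  # assumption, to be tuned after more testing

def rgw_user_for(distributor_id, bucket_name):
    # Deterministically pick one of the distributor's pre-created RGW
    # users for a given bucket, so bucket creations do not all
    # serialize on a single user's bucket-list object.
    h = hashlib.sha1(('%s/%s' % (distributor_id, bucket_name)).encode('utf-8'))
    shard = int(h.hexdigest(), 16) % USERS_PER_DISTRIBUTOR
    return 'dist-%s-%02d' % (distributor_id, shard)

# rgw_user_for('distributor-42', 'some-bucket') returns one of
# 'dist-distributor-42-00' ... 'dist-distributor-42-03'.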