Hi Ben,

That was beyond helpful; thank you so much for the thoughtful and
detailed explanation. It should definitely be added to the
documentation, until/unless dynamic resharding handles this case with
versioned objects (if there is even a desire to do so).

Respectfully,
David

On Tue, Mar 30, 2021 at 12:21 AM Benoît Knecht <bkne...@protonmail.ch> wrote:
>
> Hi David,
>
> On Tuesday, March 30th, 2021 at 00:50, David Orman <orma...@corenode.com> 
> wrote:
> > Sure enough, it is more than 200,000, just as the alert indicates.
> > However, why did it not reshard further? Here's the kicker - we only
> > see this with versioned buckets/objects. I don't see anything in the
> > documentation that indicates this is a known issue with sharding, but
> > perhaps there is something going on with versioned buckets/objects. Is
> > there any clarity here, or suggestions on how to deal with this? It
> > sounds like you expect this behavior with versioned buckets, so we
> > must be missing something.
>
> The issue with versioned buckets is that each object is associated with at
> least 4 index entries, plus 2 additional index entries for each extra
> version of the object. Dynamic resharding is based on the number of
> objects, not the number of index entries, and it counts each version of an
> object as an object. The biggest discrepancy between the number of objects
> and the number of index entries therefore occurs when there's only one
> version of each object (a factor of 4), and it tends to a factor of 2 as
> the number of versions per object increases to infinity. But there's one
> more special case: when you delete a versioned object, two more index
> entries are created, and those are not taken into account by dynamic
> resharding. The absolute worst case is therefore a bucket where each
> object had a single version and all the objects have been deleted: in that
> case, there are 6 index entries for each object counted by dynamic
> resharding, i.e. a factor of 6.
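>
> To make the arithmetic concrete, here's a rough Python sketch; the entry
> counts are taken from the description above rather than from the RGW
> source, so treat it as illustrative:
>
>     # Illustrative model of index entries for a versioned object,
>     # based on the counts described above.
>     def index_entries(versions, deleted=False):
>         # The first version costs 4 index entries; each additional
>         # version costs 2 more.
>         entries = 4 + 2 * (versions - 1)
>         # Deleting a versioned object creates 2 more index entries,
>         # which dynamic resharding does not count.
>         if deleted:
>             entries += 2
>         return entries
>
>     # Dynamic resharding counts each version as one object, so the
>     # entries-per-counted-object ratio is:
>     for v in (1, 2, 10, 100):
>         print(v, index_entries(v) / v)         # 4.0, 3.0, 2.2, 2.02
>
>     print(index_entries(1, deleted=True) / 1)  # 6.0: the worst case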
>
> So one way to "solve" this issue is to set
> `osd_deep_scrub_large_omap_object_key_threshold=600000`, which (with the
> default `rgw_max_objs_per_shard=100000`) will guarantee that dynamic
> resharding kicks in before you get a large omap object warning, even in
> the worst-case scenario for versioned buckets. If you're not comfortable
> having that many keys per omap object, you could instead decrease
> `rgw_max_objs_per_shard`.
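>
> For reference, the arithmetic behind that number: resharding triggers at
> `rgw_max_objs_per_shard=100000` counted objects per shard, and the worst
> case is 6 index entries per counted object, so a shard can accumulate up
> to 6 * 100000 = 600000 omap keys before resharding is guaranteed to kick
> in. Assuming a release with the `ceph config` interface, you could set it
> cluster-wide with:
>
>     ceph config set osd osd_deep_scrub_large_omap_object_key_threshold 600000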
>
> Cheers,
>
> --
> Ben