Hi Niklaus,

Your assumption is correct.

Versioned objects are not accounted for by RGW dynamic resharding, so that is 
likely the root cause. This is described here: 
https://tracker.ceph.com/issues/68206.
The fix is being backported to Squid and Reef, but it will not reach Pacific or 
Quincy, as both are EOL.

The best you can do for now is to lower rgw_max_objs_per_shard to 25,000.
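If you want to apply that and check the effect, something along these lines should work. The bucket name, index pool name, and shard count below are placeholders/examples, not values from your cluster; the omap key count per shard is the number the LARGE_OMAP_OBJECTS warning compares against 200,000:

```shell
# Lower the dynamic resharding threshold for all RGW daemons.
ceph config set client.rgw rgw_max_objs_per_shard 25000

# Optionally trigger a reshard right away instead of waiting for
# dynamic resharding. 331 is just an example target; on Pacific,
# resharding a versioned bucket may still miscount entries (see the
# tracker issue above), so treat the shard count as a starting point.
radosgw-admin bucket reshard --bucket=<bucket1_name> --num-shards=331

# Count the omap entries in one index shard to see how many entries
# each object really produces. The index objects are named
# .dir.<bucket marker>.<shard number>; the marker below is the one
# from your metadata output, shard 0 as an example.
rados -p <index_pool> listomapkeys ".dir.<pool_name>.7988881492.1.0" | wc -l
```

Comparing that omap key count against num_objects divided by num_shards will tell you how many index entries a versioned object actually generates on your cluster.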

Best regards,
Frédéric.

----- On 19 Jun 25, at 14:36, Niklaus Hofer niklaus.ho...@stepping-stone.ch 
wrote:

> Dear all
> 
> We are running Ceph Pacific (16.2.15) with RadosGW and have been getting
> "large omap objects" health warnings on the RadosGW index pool. Indeed
> we had one bucket in particular that was positively huge with 8'127'198
> objects that had just a single shard. But we have been seeing the
> message on some other buckets, too.
> 
> Eventually, we activated automatic resharding (rgw_dynamic_resharding =
> true) and indeed this bucket was resharded to now 167 shards. However, I
> am now getting even more large omap object warnings. On that same
> bucket, too. The other buckets have not been resharded at all. They are
> not in the queue, either:
> 
>| radosgw-admin reshard list
>> []
> 
>| grep 'Large omap object found' /var/log/ceph/ceph* | grep 'PG: ' | cut
> -d: -f 10 | cut -d. -f 4-5 | sort | uniq -c
>       2 7936686773.215
>      12 7937604172.149
>      10 7955243979.1209
>       9 7955243979.2480
>      13 7955243979.2481
>      12 7968198782.110
>      13 7968913553.67
>      11 7968913553.68
>      10 7968913553.69
>      11 7981210604.1
>      74 7981624399.1
>     217 7988881492.1
> 
> 
>| radosgw-admin metadata list --metadata-key bucket.instance | grep -i
> 7988881492
>>     "<bucket1_name>:<pool_name>.7988881492.1",
> 
>| radosgw-admin bucket stats --bucket <bucket1_name>
>> {
>>     "bucket": "<bucket1_name>",
>>     "num_shards": 167,
>> [...]
>>     "usage": {
>>         "rgw.main": {
>>             "size": 9669928611955,
>>             "size_actual": 9692804734976,
>>             "size_utilized": 9669928611955,
>>             "size_kb": 9443289661,
>>             "size_kb_actual": 9465629624,
>>             "size_kb_utilized": 9443289661,
>>             "num_objects": 8134437
>>         }
>>     },
> 
> Let's check another one, too:
> 
>| radosgw-admin metadata list --metadata-key bucket.instance | grep -i
> 7968198782.110
>>     "<bucket2_name>:<pool_name>.7968198782.110",
> 
>| radosgw-admin bucket stats --bucket <bucket2_name>
>> [...]
>>             "num_objects": 38690
>> [...]
> 
> According to the documentation in [0], buckets are resharded at a
> threshold of 100'000 objects per shard. For both of these, that applies
> nicely, so it makes sense that they are not getting resharded any further.
> 
> But why then am I getting these warnings?
> 
> Reading the documentation in [1], I can see that warnings are printed at
> 200'000 entries per omap object. Can I assume that one object in an RGW
> bucket corresponds to one entry in an omap object? Or is that a misconception?
> 
> Now, here is my working theory. Please let me know if that has any merit
> or if I'm completely off:
> 
> The affected buckets have versioning activated. Plus object locking too.
> They get used by a backup software (Kopia) that uses these features to
> provide ransomware protection. So my thinking is that maybe with
> versioning active, each object in a bucket could result in multiple omap
> entries, maybe one per version or something?
> 
> If that is the case, then maybe I should reduce `rgw_max_objs_per_shard`
> from 100'000 to something like 10'000 to have the buckets resharded more
> aggressively?
> 
> But then again, that assumes a lot. For example, it assumes that the
> num_objects counter in the bucket stats does not count versioned
> objects separately. So my assumption could be completely wrong.
> 
> What do you think? What can I do to get rid of the large omap objects?
> Is more resharding going to help? What else could I check?
> 
> Sincerely
> 
> Niklaus Hofer
> 
> Links:
> [0]
> https://docs.ceph.com/en/latest/radosgw/dynamicresharding/#confval-rgw_max_objs_per_shard
> [1]
> https://docs.ceph.com/en/latest/rados/operations/health-checks/#large-omap-objects
> --
> stepping stone AG
> Wasserwerkgasse 7
> CH-3011 Bern
> 
> Telefon: +41 31 332 53 63
> www.stepping-stone.ch
> niklaus.ho...@stepping-stone.ch
> _______________________________________________
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io