Later releases do improve dynamic resharding, FWIW.

Also, do you have buckets using versioned objects? If so, I would suggest 
lowering rgw_max_objs_per_shard to, say, 50000. Upcoming releases will 
improve that behavior.
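A minimal sketch of how that could be applied cluster-wide via the MON config store (the 50000 value is the suggestion above, not a tested tuning; adjust for your workload):

```shell
# Lower the per-shard object threshold so dynamic resharding
# kicks in earlier on versioned buckets (illustrative value):
ceph config set client.rgw rgw_max_objs_per_shard 50000

# Verify the setting took effect:
ceph config get client.rgw rgw_max_objs_per_shard
```

Running RGW daemons pick the new threshold up on their next resharding-queue scan; no restart should be needed for this option.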


> On Jun 19, 2025, at 9:07 AM, Niklaus Hofer <niklaus.ho...@stepping-stone.ch> 
> wrote:
> 
> Dear Eugen
> 
> Thank you for your input. What you say is true, of course; I've read that 
> somewhere too. I already ran a deep-scrub on all the involved PGs, and that's 
> when the number of warnings increased drastically. I assume that before, I 
> just had one positively huge omap object for that pool. Now I have 167 omap 
> objects that are not quite as big, but still too large.
> 
> Sincerely
> 
> Niklaus Hofer
> 
> On 19/06/2025 14.48, Eugen Block wrote:
>> Hi,
>> the warnings about large omap objects are reported when deep-scrubs happen. 
>> So if you resharded the bucket (or Ceph did that for you), you'll either 
>> have to wait for the deep-scrub schedule to scrub the affected PGs, or you 
>> issue a manual deep-scrub on that PG or the entire pool.
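The manual deep-scrub Eugen mentions could look like this (the PG id and pool name are placeholders, not taken from the thread):

```shell
# Deep-scrub a single PG (substitute a PG id from the warnings):
ceph pg deep-scrub 7.1a

# Or deep-scrub every PG in the index pool (hypothetical pool name):
ceph osd pool deep-scrub default.rgw.buckets.index
```

Either way the LARGE_OMAP_OBJECTS warning only clears once the affected PGs have been re-scrubbed and the large objects are no longer found.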
>> Regards,
>> Eugen
>> Zitat von Niklaus Hofer <niklaus.ho...@stepping-stone.ch>:
>>> Dear all
>>> 
>>> We are running Ceph Pacific (16.2.15) with RadosGW and have been getting 
>>> "large omap objects" health warnings on the RadosGW index pool. Indeed we 
>>> had one bucket in particular that was positively huge with 8'127'198 
>>> objects that had just a single shard. But we have been seeing the message 
>>> on some other buckets, too.
>>> 
>>> Eventually, we activated automatic resharding (rgw_dynamic_resharding = 
>>> true), and this bucket was indeed resharded into 167 shards. However, I am 
>>> now getting even more large omap object warnings, on that same bucket too. 
>>> The other buckets have not been resharded at all, and they are not in the 
>>> queue either:
>>> 
>>> | radosgw-admin reshard list
>>>> []
>>> 
>>> | grep 'Large omap object found' /var/log/ceph/ceph* | grep 'PG: ' | cut 
>>> -d: -f 10 | cut -d. -f 4-5 | sort | uniq -c
>>>       2 7936686773.215
>>>      12 7937604172.149
>>>      10 7955243979.1209
>>>       9 7955243979.2480
>>>      13 7955243979.2481
>>>      12 7968198782.110
>>>      13 7968913553.67
>>>      11 7968913553.68
>>>      10 7968913553.69
>>>      11 7981210604.1
>>>      74 7981624399.1
>>>     217 7988881492.1
>>> 
>>> 
>>> | radosgw-admin metadata list --metadata-key bucket.instance | grep -i 
>>> 7988881492
>>>>    "<bucket1_name>:<pool_name>.7988881492.1",
>>> 
>>> | radosgw-admin bucket stats --bucket <bucket1_name>
>>>> {
>>>>    "bucket": "<bucket1_name>",
>>>>    "num_shards": 167,
>>>> [...]
>>>>    "usage": {
>>>>        "rgw.main": {
>>>>            "size": 9669928611955,
>>>>            "size_actual": 9692804734976,
>>>>            "size_utilized": 9669928611955,
>>>>            "size_kb": 9443289661,
>>>>            "size_kb_actual": 9465629624,
>>>>            "size_kb_utilized": 9443289661,
>>>>            "num_objects": 8134437
>>>>        }
>>>>    },
>>> 
>>> Let's check another one, too:
>>> 
>>> | radosgw-admin metadata list --metadata-key bucket.instance | grep -i 
>>> 7968198782.110
>>>>    "<bucket2_name>:<pool_name>.7968198782.110",
>>> 
>>> | radosgw-admin bucket stats --bucket <bucket2_name>
>>>> [...]
>>>>            "num_objects": 38690
>>>> [...]
>>> 
>>> According to the documentation in [0], buckets are resharded at a threshold 
>>> of 100'000 objects per shard. For both of these, that applies nicely, so it 
>>> makes sense that they are not getting resharded any further.
>>> 
>>> But why then am I getting these warnings?
>>> 
>>> Reading the documentation in [1], I can see that warnings are printed at 
>>> 200'000 entries per omap object. Can I assume that one object in an RGW 
>>> bucket corresponds to one entry in an omap object? Or is that a misconception?
>>> 
>>> Now, here is my working theory. Please let me know if that has any merit or 
>>> if I'm completely off:
>>> 
>>> The affected buckets have versioning activated. Plus object locking too. 
>>> They get used by a backup software (Kopia) that uses these features to 
>>> provide ransomware protection. So my thinking is that maybe with versioning 
>>> active, each object in a bucket could result in multiple omap entries, 
>>> maybe one per version or something?
>>> 
>>> If that is the case, then maybe I should reduce `rgw_max_objs_per_shard` 
>>> from 100'000 to something like 10'000 to have the buckets resharded more 
>>> aggressively?
>>> 
>>> But then again, that assumes a lot. For example, that assumes that the 
>>> num_objects counter in the bucket stats does not count up on versioned 
>>> objects. So my assumption could be completely whack.
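The arithmetic behind this theory can be sketched with the numbers from the bucket stats above (the entries-per-object multiplier is the unknown being estimated, not a confirmed fact):

```python
# Rough estimate of omap entries per index shard, assuming each
# versioned object contributes multiple index entries.
num_objects = 8_134_437    # from `radosgw-admin bucket stats`
num_shards = 167           # from `radosgw-admin bucket stats`
warn_threshold = 200_000   # large-omap warning threshold per [1]

objects_per_shard = num_objects / num_shards
print(round(objects_per_shard))  # prints 48709, well under the 100k reshard threshold

# If warnings still fire at 200k entries, each object must account for
# at least this many omap entries on average:
entries_per_object = warn_threshold / objects_per_shard
print(round(entries_per_object, 1))  # prints 4.1
```

So if the theory holds, each object would need to average roughly four index entries (e.g. versions plus olh/lock metadata) for the warnings to persist at 167 shards.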
>>> 
>>> What do you think? What can I do to get rid of the large omap objects? Is 
>>> more resharding going to help? What else could I check?
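If more resharding does turn out to be the answer, it can also be forced manually rather than waiting for the dynamic threshold. A sketch (the shard count of 401 is purely illustrative, not a value from the thread):

```shell
# Manually reshard the large bucket to more shards than dynamic
# resharding would choose; an odd/prime-ish count is commonly used:
radosgw-admin bucket reshard --bucket=<bucket1_name> --num-shards=401
```

Note that on Pacific a manual reshard rewrites the index, so it is worth scheduling outside peak backup windows.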
>>> 
>>> Sincerely
>>> 
>>> Niklaus Hofer
>>> 
>>> Links:
>>> [0] https://docs.ceph.com/en/latest/radosgw/dynamicresharding/ 
>>> #confval-rgw_max_objs_per_shard
>>> [1] https://docs.ceph.com/en/latest/rados/operations/health-checks/ 
>>> #large-omap-objects
>>> -- 
>>> stepping stone AG
>>> Wasserwerkgasse 7
>>> CH-3011 Bern
>>> 
>>> Telefon: +41 31 332 53 63
>>> www.stepping-stone.ch
>>> niklaus.ho...@stepping-stone.ch
>>> _______________________________________________
>>> ceph-users mailing list -- ceph-users@ceph.io
>>> To unsubscribe send an email to ceph-users-le...@ceph.io
> 