Dear all

We are running Ceph Pacific (16.2.15) with RadosGW and have been getting "large omap objects" health warnings on the RadosGW index pool. Indeed, one bucket in particular was positively huge: 8'127'198 objects in just a single shard. But we have been seeing the message on some other buckets, too.

Eventually, we activated automatic resharding (rgw_dynamic_resharding = true), and that bucket was indeed resharded to 167 shards. However, I am now getting even more large omap object warnings, including on that same bucket. The other buckets have not been resharded at all, and they are not in the queue, either:

| radosgw-admin reshard list
[]

| grep 'Large omap object found' /var/log/ceph/ceph* | grep 'PG: ' | cut -d: -f 10 | cut -d. -f 4-5 | sort | uniq -c
      2 7936686773.215
     12 7937604172.149
     10 7955243979.1209
      9 7955243979.2480
     13 7955243979.2481
     12 7968198782.110
     13 7968913553.67
     11 7968913553.68
     10 7968913553.69
     11 7981210604.1
     74 7981624399.1
    217 7988881492.1
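To see where those entries actually are, the omap keys per index shard object can also be counted directly. This is only a sketch: the index pool name is an assumption you would replace with the real one, and the marker is taken from the log output above (RGW index objects are named `.dir.<marker>.<shard>` in the index pool):

```shell
# Count omap keys for every index shard of the bucket with marker
# <pool_name>.7988881492.1 (needs a live cluster; <index_pool> is a placeholder).
for obj in $(rados -p <index_pool> ls | grep "^\.dir\.<pool_name>\.7988881492\.1\."); do
    printf '%s: %s\n' "$obj" "$(rados -p <index_pool> listomapkeys "$obj" | wc -l)"
done
```

Any shard reporting more than 200'000 keys here is a shard that deep scrub will flag.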


| radosgw-admin metadata list --metadata-key bucket.instance | grep -i 7988881492
    "<bucket1_name>:<pool_name>.7988881492.1",

| radosgw-admin bucket stats --bucket <bucket1_name>
{
    "bucket": "<bucket1_name>",
    "num_shards": 167,
[...]
    "usage": {
        "rgw.main": {
            "size": 9669928611955,
            "size_actual": 9692804734976,
            "size_utilized": 9669928611955,
            "size_kb": 9443289661,
            "size_kb_actual": 9465629624,
            "size_kb_utilized": 9443289661,
            "num_objects": 8134437
        }
    },

Let's check another one, too:

| radosgw-admin metadata list --metadata-key bucket.instance | grep -i 7968198782.110
    "<bucket2_name>:<pool_name>.7968198782.110",

| radosgw-admin bucket stats --bucket <bucket2_name>
[...]
            "num_objects": 38690
[...]

According to the documentation in [0], buckets are resharded once they exceed a threshold of 100'000 objects per shard. Both of these buckets are well below that, so it makes sense that they are not getting resharded any further.

But why then am I getting these warnings?

Reading the documentation in [1], I can see that warnings are printed at 200'000 entries per omap object. Can I assume that one object in an RGW bucket corresponds to one entry in an omap object? Or is that a misconception?
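If that one-entry-per-object assumption held, the numbers above would not add up to a warning at all. A quick back-of-the-envelope check (figures taken from the bucket stats above, assuming entries are spread roughly evenly across shards, which resharding aims for):

```python
# Would 167 shards be enough if one RGW object were exactly one omap entry?
num_objects = 8_134_437      # "num_objects" from bucket stats for bucket1
num_shards = 167             # "num_shards" from bucket stats for bucket1
warn_threshold = 200_000     # large-omap warning threshold per omap object [1]

entries_per_shard = num_objects / num_shards
print(f"{entries_per_shard:.0f} entries per shard")   # ~48'709, far below 200'000

# To trip the warning anyway, each object would have to contribute several entries:
factor = warn_threshold / entries_per_shard
print(f"warning needs > {factor:.1f} omap entries per object")   # ~4.1
```

So the warnings on this bucket only make sense if each object produces, on average, more than four omap entries, which would fit the versioning theory below.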

Now, here is my working theory. Please let me know if that has any merit or if I'm completely off:

The affected buckets have versioning activated, plus object locking. They are used by a backup tool (Kopia) that relies on these features for ransomware protection. So my thinking is that with versioning active, each object in a bucket could result in multiple omap entries, perhaps one per version or something like that?

If that is the case, then maybe I should reduce `rgw_max_objs_per_shard` from 100'000 to something like 10'000 to have the buckets resharded more aggressively?
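For reference, lowering that threshold cluster-wide would look something like the sketch below (assuming the option is applied to all RGW daemons via the `client.rgw` section; existing buckets would only be picked up by the resharding thread afterwards):

```shell
# Lower the per-shard object threshold so dynamic resharding kicks in earlier.
ceph config set client.rgw rgw_max_objs_per_shard 10000

# Verify the new value:
ceph config get client.rgw rgw_max_objs_per_shard
```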

But then again, that assumes a lot. For example, it assumes that the num_objects counter in the bucket stats does not increase for each version of an object. So my assumption could be completely off.

What do you think? What can I do to get rid of the large omap objects? Is more resharding going to help? What else could I check?

Sincerely

Niklaus Hofer

Links:
[0] https://docs.ceph.com/en/latest/radosgw/dynamicresharding/#confval-rgw_max_objs_per_shard
[1] https://docs.ceph.com/en/latest/rados/operations/health-checks/#large-omap-objects
--
stepping stone AG
Wasserwerkgasse 7
CH-3011 Bern

Telefon: +41 31 332 53 63
www.stepping-stone.ch
niklaus.ho...@stepping-stone.ch
_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io