Dear Anthony

> Later releases do improve dynamic resharding FWIW.

We are hoping to upgrade to Reef at the end of the summer, but until then we would really love a solution. The huge omap object we had before caused a lot of headaches (slow ops, hung OSDs, PGs that never completed backfilling).

> Also, do you have buckets using versioned objects? If so I would suggest lowering rgw_max_objs_per_shard to, say, 50000.

Yes, we do indeed have versioning (and object locking, too) active on these buckets.

Decreasing rgw_max_objs_per_shard is in line with what I was thinking, too. Glad to hear others are thinking along the same lines.

I think I saw an omap object with 1.3M entries, so I guess 50'000 might still be too high. But we'll probably start with 50'000 anyway and see whether it helps at all.
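For the record, this is roughly what we plan to run to lower the threshold and then verify the value the daemons pick up (a sketch only; the client.rgw config section is an assumption and may need adjusting depending on how the RGW daemons are named in the deployment):

| ceph config set client.rgw rgw_max_objs_per_shard 50000
| ceph config get client.rgw rgw_max_objs_per_shard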

I'll definitely let you know how it's going!

Sincerely

Niklaus Hofer

On 19/06/2025 15.26, Anthony D'Atri wrote:
Later releases do improve dynamic resharding FWIW.

Also, do you have buckets using versioned objects? If so I would suggest lowering rgw_max_objs_per_shard to, say, 50000. Upcoming releases will improve that dynamic.


On Jun 19, 2025, at 9:07 AM, Niklaus Hofer <niklaus.ho...@stepping-stone.ch> wrote:

Dear Eugen

Thank you for your input. What you say is true, of course; I've read that somewhere too. I already ran a deep-scrub on all the involved PGs, and that's when the number of warnings increased drastically. I assume that before, I just had one positively huge omap object for that pool. Now I have 167 omap objects that are not quite as big, but still too large.

Sincerely

Niklaus Hofer

On 19/06/2025 14.48, Eugen Block wrote:
Hi,
The warnings about large omap objects are reported when deep-scrubs happen. So if you resharded the bucket (or Ceph did that for you), you'll either have to wait for the deep-scrub schedule to reach the affected PGs, or issue a manual deep-scrub on those PGs or on the entire pool.
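For example (the PG ID and pool name here are placeholders for your own values):

| ceph pg deep-scrub <pg_id>
| ceph osd pool deep-scrub <index_pool_name>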
Regards,
Eugen
Quoting Niklaus Hofer <niklaus.ho...@stepping-stone.ch>:
Dear all

We are running Ceph Pacific (16.2.15) with RadosGW and have been getting "large omap objects" health warnings on the RadosGW index pool. Indeed, we had one bucket in particular that was positively huge: 8'127'198 objects in just a single shard. But we have been seeing the message on some other buckets, too.

Eventually, we activated automatic resharding (rgw_dynamic_resharding = true), and this bucket was indeed resharded, to 167 shards. However, I am now getting even more large omap object warnings, on that same bucket, too. The other buckets have not been resharded at all, and they are not in the reshard queue either:

| radosgw-admin reshard list
[]

| grep 'Large omap object found' /var/log/ceph/ceph* | grep 'PG: ' | cut -d: -f 10 | cut -d. -f 4-5 | sort | uniq -c
       2 7936686773.215
      12 7937604172.149
      10 7955243979.1209
       9 7955243979.2480
      13 7955243979.2481
      12 7968198782.110
      13 7968913553.67
      11 7968913553.68
      10 7968913553.69
      11 7981210604.1
      74 7981624399.1
     217 7988881492.1


| radosgw-admin metadata list --metadata-key bucket.instance | grep -i 7988881492
    "<bucket1_name>:<pool_name>.7988881492.1",

| radosgw-admin bucket stats --bucket <bucket1_name>
{
    "bucket": "<bucket1_name>",
    "num_shards": 167,
[...]
    "usage": {
        "rgw.main": {
            "size": 9669928611955,
            "size_actual": 9692804734976,
            "size_utilized": 9669928611955,
            "size_kb": 9443289661,
            "size_kb_actual": 9465629624,
            "size_kb_utilized": 9443289661,
            "num_objects": 8134437
        }
    },

Let's check another one, too:

| radosgw-admin metadata list --metadata-key bucket.instance | grep -i 7968198782.110
    "<bucket2_name>:<pool_name>.7968198782.110",

| radosgw-admin bucket stats --bucket <bucket2_name>
[...]
            "num_objects": 38690
[...]

According to the documentation in [0], buckets are resharded at a threshold of 100'000 objects per shard. Both of these buckets are below that threshold, so it makes sense that they are not getting resharded any further.
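(For the big bucket, that works out to 8'134'437 objects / 167 shards ≈ 48'700 objects per shard, comfortably below the 100'000 threshold.)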

But why then am I getting these warnings?

Reading the documentation in [1], I can see that warnings are printed at 200'000 entries per omap object. Can I assume that one object in an RGW bucket means one entry in an omap object? Or is that a misconception?

Now, here is my working theory. Please let me know if that has any merit or if 
I'm completely off:

The affected buckets have versioning activated, plus object locking. They are used by a backup software (Kopia) that relies on these features to provide ransomware protection. So my thinking is that maybe, with versioning active, each object in a bucket could result in multiple omap entries, perhaps one per version or something like that?

If that is the case, then maybe I should reduce `rgw_max_objs_per_shard` from 
100'000 to something like 10'000 to have the buckets resharded more 
aggressively?

But then again, that assumes a lot. For example, it assumes that the num_objects counter in the bucket stats does not increase for every version of an object. So my assumption could be completely whack.
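One other option we are considering is to reshard the worst bucket manually instead of lowering the global threshold. Something like this, where the shard count is only an example (as I understand it, a prime number of shards is often recommended):

| radosgw-admin bucket reshard --bucket <bucket1_name> --num-shards 401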

What do you think? What can I do to get rid of the large omap objects? Is more 
resharding going to help? What else could I check?

Sincerely

Niklaus Hofer

Links:
[0] https://docs.ceph.com/en/latest/radosgw/dynamicresharding/#confval-rgw_max_objs_per_shard
[1] https://docs.ceph.com/en/latest/rados/operations/health-checks/#large-omap-objects


--
Office hours: Tuesday to Friday

stepping stone AG
Wasserwerkgasse 7
CH-3011 Bern

Telephone: +41 31 332 53 63
www.stepping-stone.ch
niklaus.ho...@stepping-stone.ch
_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
