[ceph-users] Re: check bucket index on large bucket leads to laggy index PGs

Boris Fri, 09 May 2025 10:50:57 -0700

>
> resharding the bucket is indeed the solution. while resharding does
> have to read all of the keys from the source index objects, it doesn't
> read all of them at once. writing these keys to the target bucket
> index objects is the more expensive part, but those are different
> objects/pgs and should be better distributed


Ah great. Will try that.
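
For a rough sense of scale, here is a minimal sketch of the shard math for the numbers in this thread (a bit over 20M index entries on 11 shards). It assumes the default rgw_max_objs_per_shard of 100000 as the per-shard target, so check the value on your own cluster. The helper names are made up for illustration, and rounding up to a prime is only an optional convention assumed here, not a requirement.

```python
# Assumption: default rgw_max_objs_per_shard; verify against your cluster config.
TARGET_OBJS_PER_SHARD = 100_000


def ceil_div(a: int, b: int) -> int:
    """Integer ceiling division without going through floats."""
    return -(-a // b)


def entries_per_shard(index_entries: int, shards: int) -> int:
    """Approximate index entries per shard, assuming an even spread."""
    return ceil_div(index_entries, shards)


def is_prime(n: int) -> bool:
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True


def suggested_shards(index_entries: int,
                     target: int = TARGET_OBJS_PER_SHARD,
                     prefer_prime: bool = True) -> int:
    """Smallest shard count that keeps each shard at or below `target` entries."""
    shards = max(1, ceil_div(index_entries, target))
    if prefer_prime:
        while not is_prime(shards):
            shards += 1
    return shards


if __name__ == "__main__":
    print(entries_per_shard(20_000_000, 11))   # current load per shard
    print(suggested_shards(20_000_000))        # candidate --num-shards value
```

With these assumptions each of the 11 shards holds roughly 1.8M entries, far above the per-shard target, which fits the laggy index PGs described in the original mail.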

> Ensure the index pool is really on only SSDs.   I’ve seen crush rules not
> specifying device class.
>
Yes, it is. It sits on dedicated NVMe drives that we use for the meta pools
(the listing I sent in Slack).
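
One way to double-check the device class point is to look at the CRUSH rule the index pool uses. The sketch below assumes the JSON printed by `ceph osd crush rule dump <rule-name>`, where a class-restricted rule shows a take step whose item_name looks like `default~ssd`. The sample rules and function names are hypothetical and only illustrate the check.

```python
import json


def take_device_classes(rule: dict) -> list:
    """Return the device class of every 'take' step (None if no class is set)."""
    classes = []
    for step in rule.get("steps", []):
        if step.get("op") == "take":
            name = step.get("item_name", "")
            _, sep, dev_class = name.partition("~")
            classes.append(dev_class if sep else None)
    return classes


def rule_is_restricted_to(rule: dict, wanted: set) -> bool:
    """True only if every take step names one of the wanted device classes."""
    classes = take_device_classes(rule)
    return bool(classes) and all(c in wanted for c in classes)


# Hypothetical sample output, not taken from the cluster in this thread.
SAMPLE_WITH_CLASS = json.loads("""
{"rule_id": 1, "rule_name": "replicated_ssd", "type": 1,
 "steps": [{"op": "take", "item": -2, "item_name": "default~ssd"},
           {"op": "chooseleaf_firstn", "num": 0, "type": "host"},
           {"op": "emit"}]}
""")

SAMPLE_WITHOUT_CLASS = json.loads("""
{"rule_id": 0, "rule_name": "replicated_rule", "type": 1,
 "steps": [{"op": "take", "item": -1, "item_name": "default"},
           {"op": "chooseleaf_firstn", "num": 0, "type": "host"},
           {"op": "emit"}]}
""")

if __name__ == "__main__":
    fast = {"ssd", "nvme"}
    print(rule_is_restricted_to(SAMPLE_WITH_CLASS, fast))
    print(rule_is_restricted_to(SAMPLE_WITHOUT_CLASS, fast))
```

A rule without a class suffix on its take step can place the pool on any device type, which is the situation Anthony warns about.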

> Do you have autoresharding disabled?  Versioned objects?  Can you do a
> bilog trim?  Could you preshard a new bucket and move the objects?
>
Autoresharding is not enabled. We wanted to test sharding in the multisite
setting before we enable it (and reshard all the buckets that need it in a
controlled way).
Yes, the bucket has versioned objects. We are in the process of deleting
it, but the deletion has now been running for two days and the bucket index
is still very large.
I can try to do a bilog trim. I need to read up on what it does and how to
do it.
I could move the data to a new presharded bucket, but it feels like the
bucket is somehow broken, because deleting is not working as I expect it to.

I will try to reshard the bucket tonight and hope it will work out. The
explanation from Casey sounds promising.
As I have a lot more buckets with a lot more objects (according to the
bucket index) this needs to be done anyway.
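
For planning the maintenance window, here is a small sketch that only builds the radosgw-admin command lines involved (manual reshard, reshard status, reshard cancel, bilog trim) and prints them for review, without executing anything. The bucket name and shard count are placeholders. Whether a cancel cleanly interrupts a reshard that is already running is not confirmed anywhere in this thread, and multisite setups may need extra care.

```python
import shlex


def reshard_cmd(bucket: str, num_shards: int) -> list:
    """Manual reshard of one bucket to a new shard count."""
    return ["radosgw-admin", "bucket", "reshard",
            f"--bucket={bucket}", f"--num-shards={num_shards}"]


def reshard_status_cmd(bucket: str) -> list:
    """Show the resharding status of the bucket's index shards."""
    return ["radosgw-admin", "reshard", "status", f"--bucket={bucket}"]


def reshard_cancel_cmd(bucket: str) -> list:
    """Cancel a pending reshard for the bucket."""
    return ["radosgw-admin", "reshard", "cancel", f"--bucket={bucket}"]


def bilog_trim_cmd(bucket: str) -> list:
    """Trim the bucket index log for the bucket."""
    return ["radosgw-admin", "bilog", "trim", f"--bucket={bucket}"]


if __name__ == "__main__":
    bucket = "mybucket"   # hypothetical bucket name
    shards = 211          # hypothetical target shard count
    for cmd in (reshard_cmd(bucket, shards),
                reshard_status_cmd(bucket),
                reshard_cancel_cmd(bucket),
                bilog_trim_cmd(bucket)):
        # Print only; run the commands by hand once they look right.
        print(shlex.join(cmd))
```

Printing instead of running keeps the sketch safe to try on any machine, and the same argument lists could later be passed to subprocess.run.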

Cheers
 Boris

On Fri, 9 May 2025 at 19:21, Anthony D'Atri <
anthony.da...@gmail.com> wrote:

> Ensure the index pool is really on only SSDs.   I’ve seen crush rules not
> specifying device class.
>
> Do you have autoresharding disabled?  Versioned objects?  Can you do a
> bilog trim?  Could you preshard a new bucket and move the objects?
>
> > On May 9, 2025, at 12:54 PM, Boris <b...@kervyn.de> wrote:
> >
> > Hi,
> >
> > I have a bucket that has >20M index entries but only 11 shards.
> >
> > When I try to run a radosgw-admin bucket check, the PGs that hold the
> > index start to become laggy after a couple of seconds. I need to stop it
> > because it kills the whole object storage.
> >
> > This cluster runs the latest Reef release and is the master of a
> > multisite setup which only replicates the metadata (1 realm, multiple
> > zonegroups, one zone per zonegroup).
> >
> > Any ideas what I can do?
> > I am afraid to reshard the bucket, because I am not sure if I can stop
> > the resharding if the PGs become laggy.
> >
> > Cheers
> > Boris
> > _______________________________________________
> > ceph-users mailing list -- ceph-users@ceph.io
> > To unsubscribe send an email to ceph-users-le...@ceph.io
>


-- 
Die Selbsthilfegruppe "UTF-8-Probleme" trifft sich diesmal abweichend im
groÃƒ¼en Saal.
