Hi, Thanks.

norebalance/nobackfill are useful to pause ongoing backfilling, but
aren't the best option now to get the PGs to go active+clean and let
the mon db come back under control. Unset those before continuing.
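That is:
    ceph osd unset nobackfill
    ceph osd unset norebalance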

I think you need to set the pg_num for pool1 to something close to but
less than its current pgp_num of 926 (or whatever the pgp_num is when
you run the command below).
The idea is to let a few more merges complete successfully, and then,
once all PGs are active+clean, decide on the other interventions you
want to carry out.
So this ought to be good:
    ceph osd pool set pool1 pg_num 920

Then for pool7 it looks like splitting is ongoing. You should be
able to pause that by setting the pg_num to something just above the
current pgp_num of 883.
I would do:
    ceph osd pool set pool7 pg_num 890

It may even be fastest to just set those pg_num values to exactly what
the current pgp_num is. You can try it.
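Using the values from your `ceph osd pool ls detail` output (re-check
them first, since pgp_num keeps moving while merges/splits are in
progress), that would be:
    ceph osd pool set pool1 pg_num 926
    ceph osd pool set pool7 pg_num 883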

Once your cluster is stable again, you should set those pg_num values
to the nearest power of two.
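For both pools that would be 1024, if they're paused near 926 and 883:
    ceph osd pool set pool1 pg_num 1024
    ceph osd pool set pool7 pg_num 1024
But that's your call, and worth deferring per the note below.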
Personally I would wait for #53729 to be fixed before embarking on
future pg_num changes.
(You'll have to mute a warning in the meantime -- check the docs after
the warning appears).
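If I recall correctly, the relevant health code is
POOL_PG_NUM_NOT_POWER_OF_TWO, in which case something like this would
mute it for a month:
    ceph health mute POOL_PG_NUM_NOT_POWER_OF_TWO 30d
Confirm the exact code from `ceph health detail` once it shows up.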

Cheers, dan

On Wed, Apr 13, 2022 at 5:16 PM Ray Cunningham
<ray.cunning...@keepertech.com> wrote:
>
> Perfect timing, I was just about to reply. We have disabled autoscaler on all 
> pools now.
>
> Unfortunately, I can't just copy and paste from this system...
>
> From `ceph osd pool ls detail`, only 2 pools have any difference:
> pool1:  pg_num 940, pg_num_target 256, pgp_num 926, pgp_num_target 256
> pool7:  pg_num 2048, pg_num_target 2048, pgp_num 883, pgp_num_target 2048
>
> `ceph osd pool autoscale-status`:
> Size is defined
> target size is empty
> Rate is 7 for all pools except pool7, which is 1.3333333730697632
> Raw capacity is defined
> Ratio for pool1 is .0177, pool7 is .4200 and all others are 0
> Target and Effective Ratio is empty
> Bias is 1.0 for all
> PG_NUM: pool1 is 256, pool7 is 2048 and all others are 32.
> New PG_NUM is empty
> Autoscale is now off for all
> Profile is scale-up
>
>
> We have set norebalance and nobackfill and are watching to see what happens.
>
> Thank you,
> Ray
>
> -----Original Message-----
> From: Dan van der Ster <dvand...@gmail.com>
> Sent: Wednesday, April 13, 2022 10:00 AM
> To: Ray Cunningham <ray.cunning...@keepertech.com>
> Cc: ceph-users@ceph.io
> Subject: Re: [ceph-users] Stop Rebalancing
>
> One more thing, could you please also share the `ceph osd pool 
> autoscale-status` ?
>
>
> On Tue, Apr 12, 2022 at 9:50 PM Ray Cunningham 
> <ray.cunning...@keepertech.com> wrote:
> >
> > Thank you Dan! I will definitely disable autoscaler on the rest of our 
> > pools. I can't get the PG numbers today, but I will try to get them 
> > tomorrow. We definitely want to get this under control.
> >
> > Thank you,
> > Ray
> >
> >
> > -----Original Message-----
> > From: Dan van der Ster <dvand...@gmail.com>
> > Sent: Tuesday, April 12, 2022 2:46 PM
> > To: Ray Cunningham <ray.cunning...@keepertech.com>
> > Cc: ceph-users@ceph.io
> > Subject: Re: [ceph-users] Stop Rebalancing
> >
> > Hi Ray,
> >
> > Disabling the autoscaler on all pools is probably a good idea. At least 
> > until https://tracker.ceph.com/issues/53729 is fixed. (You are likely not 
> > susceptible to that -- but better safe than sorry).
> >
> > To pause the ongoing PG merges, you can indeed set the pg_num to the 
> > current value. This will allow the ongoing merge to complete and prevent
> > further merges from starting.
> > From `ceph osd pool ls detail` you'll see pg_num, pgp_num, pg_num_target,
> > pgp_num_target... If you share the current values of those, we can help
> > advise what to set the pg_num to in order to effectively pause things where
> > they are.
> >
> > BTW -- I'm going to create a request in the tracker that we improve the pg 
> > autoscaler heuristic. IMHO the autoscaler should estimate the time to carry 
> > out a split/merge operation and avoid taking one-way decisions without 
> > permission from the administrator. The autoscaler is meant to be helpful, 
> > not degrade a cluster for 100 days!
> >
> > Cheers, Dan
> >
> >
> >
> > On Tue, Apr 12, 2022 at 9:04 PM Ray Cunningham 
> > <ray.cunning...@keepertech.com> wrote:
> > >
> > > Hi Everyone,
> > >
> > > We just upgraded our 640 OSD cluster to Ceph 16.2.7 and the resulting 
> > > rebalancing of misplaced objects is overwhelming the cluster and 
> > > impacting MON DB compaction, deep scrub repairs, and our upgrade of legacy
> > > bluestore OSDs. We have to pause the rebalancing of misplaced objects or
> > > we're going to fall over.
> > >
> > > Autoscaler-status tells us that we are reducing our PGs by 700-ish, which
> > > will take us over 100 days to complete at our current recovery speed. We 
> > > disabled autoscaler on our biggest pool, but I'm concerned that it's 
> > > already on the path to the lower PG count and won't stop adding to our 
> > > misplaced count after we drop below 5%. What can we do to stop the cluster
> > > from finding more misplaced objects to rebalance? Should we set the PG 
> > > num manually to what our current count is? Or will that cause even more 
> > > havoc?
> > >
> > > Any other thoughts or ideas? My goals are to stop the rebalancing 
> > > temporarily so we can deep scrub and repair inconsistencies, upgrade 
> > > legacy bluestore OSDs and compact our MON DBs (supposedly MON DBs don't 
> > > compact when you aren't 100% active+clean).
> > >
> > > Thank you,
> > > Ray
> > >
_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
