Hi everyone. Yes, all the tips definitely helped! I now have more free space
in the pools, the number of misplaced PGs has decreased a lot, and the
standard deviation of OSD usage is lower. The storage looks way healthier
now. Thanks a bunch!
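In case it helps anyone following along, the numbers I am referring to are
the ones from the built-in per-OSD report:

```
# Per-OSD utilisation laid out along the CRUSH tree; the summary line at
# the bottom reports MIN/MAX VAR and STDDEV across all OSDs.
ceph osd df tree
```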
I'm only confused by the number of misplaced PGs, which never goes below 5%.
Every time it hits 5% it bounces up and down again, as shown in this quite
interesting graph:

[image: image.png]

Any idea why that might be? I had the impression that it might be related to
the autobalancer kicking in, after which PGs become misplaced again. Or am I
missing something?
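For context, the two balancer-side checks I keep coming back to (assuming
the default threshold applies here) are:

```
# Is the balancer enabled, and is it currently executing a plan?
ceph balancer status

# The misplaced-ratio threshold the balancer throttles itself against
# (default 0.05, i.e. the 5% the graph keeps bouncing off).
ceph config get mgr target_max_misplaced_ratio
```

If that ratio really is at the 0.05 default, it would at least explain why
the graph levels off right at 5%.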
Bruno

On Mon, 6 Jan 2025 at 16:00, Bruno Gomes Pessanha <bruno.pessa...@gmail.com> wrote:

>> So you might set the full ratio to .98, backfillfull to .96. Nearfull is
>> only cosmetic.
>
> Thanks for the advice. It seems to be working with 0.92 for now. If it
> gets stuck I'll increase it.
>
> On Mon, 6 Jan 2025 at 00:24, Anthony D'Atri <anthony.da...@gmail.com> wrote:
>
>> Very solid advice here - that's the beauty of the Ceph community.
>>
>> Just adding to what Anthony mentioned: a reweight from 1 to 0.2 (and
>> back) is quite extreme and the cluster won't like it.
>>
>> And these days with the balancer, pg-upmap entries to the same effect
>> are a better idea.
>>
>> From the clients' perspective, your main concern now is to keep the
>> pools "alive" with enough space while the backfilling takes place.
>>
>> To that end, you can *temporarily* give yourself a bit more margin:
>>
>> ceph osd set-nearfull-ratio .85
>> ceph osd set-backfillfull-ratio .90
>> ceph osd set-full-ratio .95
>>
>> Those are the default values, and Ceph (now) enforces that the values
>> are >= (or maybe >) in that order.
>>
>> So you might set the full ratio to .98, backfillfull to .96. Nearfull is
>> only cosmetic.
>>
>> But absolutely do not forget to revert to the default values once the
>> cluster is balanced, or to other values that you choose as an educated
>> decision.
>>
>> Even with plenty of OSDs that are not filled, you might hit a single
>> overfilled OSD and the whole pool will stop accepting new data.
>>
>> Yep, see above. Not immediately clear to me why that data pool is so
>> full unless the CRUSH rule / device classes are wonky.
>>
>> Clients will start getting "No more space available" errors. That
>> happened to us with CephFS recently in a very similar scenario, where
>> the cluster got much more data than expected in a short amount of time.
>> Not fun.
>> With the balancer not working due to too many misplaced objects, that
>> risk is increased, so just a heads up and keep that in mind. To get
>> things working we simply balanced the OSDs manually with upmaps, moving
>> data from the most full ones to the least full ones (our built-in
>> balancer sadly does not work).
>>
>> One small observation:
>> I've noticed that 'ceph osd pool ls detail | grep cephfs.cephfs01.data'
>> has pg_num increased but pgp_num still the same.
>> You will need to set it as well for data migration to the new PGs to
>> happen:
>> https://docs.ceph.com/en/mimic/rados/operations/placement-groups/#set-the-number-of-placement-groups
>>
>> The mgr usually does that for recent Ceph releases. With older releases
>> we had to increment pg_num and pgp_num in lockstep, which was kind of a
>> pain.
>>
>> Best,
>>
>> *Laimis J.*
>>
>> On 5 Jan 2025, at 16:11, Anthony D'Atri <anthony.da...@gmail.com> wrote:
>>
>> What reweights have been set for the top OSDs (ceph osd df tree)?
>>
>> Right now they are all at 1.0. I had to lower them to something close to
>> 0.2 in order to free up space, but I changed them back to 1.0. Should I
>> lower them while the backfill is happening?
>>
>> Old-style legacy override reweights don't mesh well with the balancer.
>> Best to leave them at 1.00.
>>
>> 0.2 is pretty extreme; back in the day I rarely went below 0.8.
>>
>> ```
>> "optimize_result": "Too many objects (0.355160 > 0.050000) are
>> misplaced; try again later"
>> ```
>>
>> That should clear. The balancer doesn't want to stir up trouble if the
>> cluster already has a bunch of backfill / recovery going on. Patience!
>>
>> default.rgw.buckets.data     10  1024  197 TiB  133.75M  592 TiB  93.69  13 TiB
>> default.rgw.buckets.non-ec   11    32   78 MiB    1.43M   17 GiB
>>
>> That's odd that the data pool is that full but the others aren't.
>>
>> Please send `ceph osd crush rule dump`. And `ceph osd dump | grep pool`.
>>
>> I also tried changing the following but it does not seem to persist:
>>
>> Could be an mclock thing.
>>
>> 1. Why did I end up with so many misplaced PGs, since there were no
>> changes on the cluster: number of OSDs, hosts, etc.?
>>
>> Probably a result of the autoscaler splitting PGs, or of some change to
>> CRUSH rules such that some data can't be placed.
>>
>> 2. Is it OK to change target_max_misplaced_ratio to something higher
>> than .05 so the autobalancer would work and I wouldn't have to
>> constantly rebalance the OSDs manually?
>>
>> I wouldn't; that's a symptom, not the disease.
>>
>> Bruno
>>
>> --
>> Bruno Gomes Pessanha
>
> --
> Bruno Gomes Pessanha

--
Bruno Gomes Pessanha
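P.S. Regarding the manual balancing with upmaps that Laimis mentioned above:
for anyone finding this thread in the archives, a single manual move looks
roughly like the sketch below. The PG id and OSD ids are made up for
illustration; on a real cluster they would come from `ceph pg ls` and
`ceph osd df`.

```
# Upmap entries require clients to be Luminous or newer.
ceph osd set-require-min-compat-client luminous

# Map one PG's replica off an overfull OSD onto an underfull one.
# "10.7f" and the OSD ids 121 -> 45 are only examples.
ceph osd pg-upmap-items 10.7f 121 45
```

A mapping can be removed again with `ceph osd rm-pg-upmap-items <pgid>`.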
_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io