Very solid advice here - that's the beauty of the Ceph community. Just adding to what Anthony mentioned: a reweight from 1 to 0.2 (and back) is quite extreme and the cluster won't like it. We never go above increments/decrements of 0.02-0.04. If you have to reweight, go from 1 to 0.98, then to 0.96, and so on, leaving enough time for the cluster to settle in between. How did backfilling look while you were changing it? Did you see a decrease in backfills after reverting back to 1?
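For illustration, a minimal sketch of that incremental approach. The OSD id (12), the step values, and the crude grep on `ceph status` output are assumptions, not taken from this thread - adjust to your own cluster and watch the backfill yourself rather than trusting the loop blindly:

```
# Hypothetical example: step the override reweight of osd.12 down gradually
# instead of jumping straight from 1.0 to 0.2.
for w in 0.98 0.96 0.94 0.92; do
    ceph osd reweight 12 "$w"
    # crude check: wait until backfill/misplaced PGs have drained before the next step
    while ceph status | grep -Eq 'backfill|misplaced'; do
        sleep 300
    done
done
```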
From the clients' perspective, your main concern now is to keep the pools "alive" with enough space while the backfilling takes place. Even with plenty of OSDs that are not full, you might hit a single overfilled OSD and the whole pool will stop accepting new data; clients will start getting "No more space available" errors. That happened to us with CephFS recently, in a very similar scenario where the cluster got much more data than expected in a short amount of time - not fun. With the balancer not working due to too many misplaced objects, that risk is increased, so just a heads up: keep it in mind. To get things working we simply balanced the OSDs manually with upmaps, moving data from the most full ones to the least full ones (our built-in balancer sadly does not work). Rough example commands are at the bottom of this mail, below the quoted thread.

One small observation: I've noticed that 'ceph osd pool ls detail | grep cephfs.cephfs01.data' has pg_num increased but pgp_num is still the same. You will need to set pgp_num as well for data to migrate to the new PGs: https://docs.ceph.com/en/mimic/rados/operations/placement-groups/#set-the-number-of-placement-groups

Best,
Laimis J.

> On 5 Jan 2025, at 16:11, Anthony D'Atri <anthony.da...@gmail.com> wrote:
>
>
>>> What reweights have been set for the top OSDs (ceph osd df tree)?
>>>
>> Right now they are all at 1.0. I had to lower them to something close to 0.2 in order to free up space but I changed them back to 1.0. Should I lower them while the backfill is happening?
>
> Old-style legacy override reweights don't mesh well with the balancer. Best to leave them at 1.00.
>
> 0.2 is pretty extreme, back in the day I rarely went below 0.8.
>
>>> ```
>>> "optimize_result": "Too many objects (0.355160 > 0.050000) are misplaced; try again later"
>>> ```
>
> That should clear. The balancer doesn't want to stir up trouble if the cluster already has a bunch of backfill / recovery going on. Patience!
>
>>> default.rgw.buckets.data     10  1024  197 TiB  133.75M  592 TiB  93.69  13 TiB
>>> default.rgw.buckets.non-ec   11    32   78 MiB    1.43M   17 GiB
>
> That's odd that the data pool is that full but the others aren't.
>
> Please send `ceph osd crush rule dump`. And `ceph osd dump | grep pool`
>
>
>>>
>>> I also tried changing the following but it does not seem to persist:
>
> Could be an mclock thing.
>
>>> 1. Why I ended up with so many misplaced PGs since there were no changes on the cluster: number of OSDs, hosts, etc.
>
> Probably a result of the autoscaler splitting PGs or of some change to CRUSH rules such that some data can't be placed.
>
>>> 2. Is it ok to change the target_max_misplaced_ratio to something higher than .05 so the autobalancer would work and I wouldn't have to constantly rebalance the OSDs manually?
>
> I wouldn't, that's a symptom not the disease.
>
>>> Bruno
>>
>> --
>> Bruno Gomes Pessanha
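As mentioned above, here is a rough sketch of the pgp_num change and the manual upmap balancing. The pool name comes from the thread; the pgp_num value, the PG id 10.3f, osd.41 (full) and osd.7 (emptier) are made-up placeholders - check your own `ceph osd df tree` and `ceph pg ls-by-osd` output before running anything like this:

```
# pgp_num must follow pg_num, otherwise the newly split PGs never actually move.
ceph osd pool get cephfs.cephfs01.data pg_num
ceph osd pool set cephfs.cephfs01.data pgp_num 1024   # use your actual pg_num value here

# Manual upmap balancing: remap one PG from a very full OSD to an emptier one.
ceph osd df tree                      # find the most and least full OSDs (%USE column)
ceph pg ls-by-osd 41                  # list PGs currently on the full OSD
ceph osd pg-upmap-items 10.3f 41 7    # in PG 10.3f, replace osd.41 with osd.7
ceph osd rm-pg-upmap-items 10.3f      # undo the mapping later if needed
```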