I can believe the month timeframe for a cluster with multiple large spinners behind each HBA; I've seen exactly that happen personally.
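
For reference, the knobs Michel mentions below can be set at runtime. A minimal sketch, assuming a cephadm-managed cluster and that 17.2.6 is the Quincy point release you want (substitute whatever release is current):

    # Pacific and earlier: raise the per-OSD backfill limit (default is 1)
    ceph config set osd osd_max_backfills 2

    # cephadm upgrade to Quincy
    ceph orch upgrade start --ceph-version 17.2.6
    ceph orch upgrade status    # watch progress

Note that on Quincy the mClock scheduler takes over recovery/backfill throttling, so osd_max_backfills matters much less there.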
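As for Niklas's question below about whether the two rules map the same: rather than reasoning it out from the paper, you can compare the mappings offline with crushtool before touching the cluster. A rough sketch (the rule id 0 and replica count 3 are assumptions, substitute your own):

    # dump the current CRUSH map and decompile it
    ceph osd getcrushmap -o crushmap.bin
    crushtool -d crushmap.bin -o crushmap.txt

    # edit crushmap.txt (rename buckets / change the rule), then recompile
    crushtool -c crushmap.txt -o crushmap-new.bin

    # enumerate mappings for the rule before and after, and diff them
    crushtool -i crushmap.bin     --test --show-mappings --rule 0 --num-rep 3 > before.txt
    crushtool -i crushmap-new.bin --test --show-mappings --rule 0 --num-rep 3 > after.txt
    diff before.txt after.txt

If the diff is empty (or tiny), the change won't trigger a big reshuffle; if it is large, you know the rebalance is unavoidable.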
> On Jul 20, 2023, at 4:16 PM, Michel Jouvin <michel.jou...@ijclab.in2p3.fr> wrote:
>
> Hi Niklas,
>
> As I said, Ceph placement is based on more than fulfilling the failure domain
> constraint. This is a core feature of Ceph's design. There is no reason for a
> rebalancing on a cluster with a few hundred OSDs to last a month. Just note that
> before 17 you have to adjust the max backfills parameter, whose default of 1 is
> a very conservative value. Using 2 should already reduce the rebalancing to a
> few days. But my experience shows that, if it is an option, upgrading to Quincy
> first may be the better choice, due to the autotuning of the number of
> backfills based on the real load of the cluster.
>
> If your cluster is using cephadm, upgrading to Quincy is very straightforward
> and should be complete in a couple of hours for the cluster size I mentioned.
>
> Cheers,
>
> Michel
> Sent from my mobile
>
> On 20 July 2023 at 20:15:54, Niklas Hambüchen <m...@nh2.me> wrote:
>
>> Thank you both, Michel and Christian.
>>
>> Looks like I will have to do the rebalancing eventually.
>> From past experience with Ceph 16, the rebalance will likely take at least a
>> month with my 500 M objects.
>>
>> It seems like a good idea to upgrade to Ceph 17 first, as Michel suggests.
>>
>> Unless:
>>
>> I was hoping that Ceph might have a way to reduce the rebalancing, given
>> that all constraints about failure domains are already fulfilled.
>>
>> In particular, I was wondering whether I could play with the names of the
>> "datacenter"s, to bring them into the same (alphabetical?) order that the hosts
>> were in so far.
>> I suspect that this is what avoided the reshuffling on my mini test cluster.
>> I think it would be in alignment with Table 1 from the CRUSH paper:
>> https://ceph.com/assets/pdfs/weil-crush-sc06.pdf
>>
>> E.g. perhaps
>>
>> take(root)
>> select(1, row)
>> select(3, cabinet)
>> emit
>>
>> yields the same result as
>>
>> take(root)
>> select(3, row)
>> select(1, cabinet)
>> emit
>>
>> ?
>>
>> Niklas

_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io