I can believe the month timeframe for a cluster with multiple large spinners 
behind each HBA.  I’ve witnessed such personally.

> On Jul 20, 2023, at 4:16 PM, Michel Jouvin <michel.jou...@ijclab.in2p3.fr> 
> wrote:
> 
> Hi Niklas,
> 
> As I said, Ceph placement is based on more than fulfilling the failure domain 
> constraint; this is a core feature of Ceph's design. There is no reason for a 
> rebalancing on a cluster with a few hundred OSDs to last a month. On releases 
> before 17 (Quincy) you have to adjust the max backfills parameter, whose 
> default of 1 is very conservative. Raising it to 2 should already reduce the 
> rebalancing to a few days. But in my experience, if it is an option, upgrading 
> to Quincy first may be the better choice, since it autotunes the number of 
> backfills based on the real load of the cluster.
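> 
> For example, something like this (a sketch; 2 is just a starting point, the 
> right value depends on your hardware):
> 
>   ceph config set osd osd_max_backfills 2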
> 
> If your cluster is managed by cephadm, upgrading to Quincy is very 
> straightforward and should complete in a couple of hours for a cluster of the 
> size I mentioned.
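> 
> For instance (17.2.6 here is only an example target release, pick the 
> current Quincy point release):
> 
>   ceph orch upgrade start --ceph-version 17.2.6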
> 
> Cheers,
> 
> Michel
> Sent from my mobile
> On 20 July 2023 at 20:15:54, Niklas Hambüchen <m...@nh2.me> wrote:
> 
>> Thank you both Michel and Christian.
>> 
>> Looks like I will have to do the rebalancing eventually.
>> From past experience with Ceph 16, the rebalance will likely take at least a 
>> month with my 500 M objects.
>> 
>> It seems like a good idea to upgrade to Ceph 17 first as Michel suggests.
>> 
>> Unless:
>> 
>> I was hoping that Ceph might have a way to reduce the rebalancing, given 
>> that all constraints about failure domains are already fulfilled.
>> 
>> In particular, I was wondering whether I could play with the names of the 
>> "datacenter"s, to bring them into the same (alphabetical?) order that the 
>> hosts were in so far.
>> I suspect that this is what avoided the reshuffling on my mini test 
>> cluster.
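>> 
>> If so, the renaming itself could presumably be done with the CRUSH bucket 
>> rename command (the bucket names below are made up for illustration):
>> 
>>   ceph osd crush rename-bucket datacenter-b datacenter-a2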
>> I think it would be in alignment with Table 1 from the CRUSH paper: 
>> https://ceph.com/assets/pdfs/weil-crush-sc06.pdf
>> 
>> E.g. perhaps
>> 
>> take(root)
>> select(1, row)
>> select(3, cabinet)
>> emit
>> 
>> yields the same result as
>> 
>> take(root)
>> select(3, row)
>> select(1, cabinet)
>> emit
>> 
>> ?
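>> 
>> In actual CRUSH rule syntax, the first variant would look roughly like this 
>> sketch (rule name and id are made up, and I descend to OSDs with 
>> chooseleaf):
>> 
>>   rule one_row_three_cabinets {
>>       id 1
>>       type replicated
>>       step take root
>>       step choose firstn 1 type row
>>       step chooseleaf firstn 3 type cabinet
>>       step emit
>>   }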
>> 
>> 
>> Niklas
> 
_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
