Hi Francois,

What is the output of `ceph balancer status`?
Also, can you increase debug_mgr to 4/5 and then share the log file of
the active mgr?
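
If it helps, something like this should do it (a rough sketch, assuming the
centralized config store and the default log location; replace <name> with
your active mgr):

    ceph balancer status
    ceph mgr stat                        # shows the name of the active mgr
    ceph config set mgr debug_mgr 4/5    # raise the mgr debug level
    # the log should then be at /var/log/ceph/ceph-mgr.<name>.log on that host

Afterwards you can drop the override again with `ceph config rm mgr debug_mgr`.
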
Best,

Dan

On Fri, Jan 29, 2021 at 10:54 AM Francois Legrand <f...@lpnhe.in2p3.fr> wrote:
>
> Thanks for your suggestion. I will have a look !
>
> But I am a bit surprised that the "official" balancer seems so inefficient!
>
> F.
>
> > On 28/01/2021 at 12:00, Jonas Jelten wrote:
> > Hi!
> >
> > We also suffer heavily from this, so I wrote a custom balancer which yields 
> > much better results:
> > https://github.com/TheJJ/ceph-balancer
> >
> > After you run it, it echoes the PG movements it suggests. You can then just 
> > run those commands (see the example below) and the cluster will balance 
> > further.
> > It's still kind of a work in progress, so I'm glad about any feedback.
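> > For illustration, the suggested movements are plain upmap commands, roughly
> > of the form (the PG and OSD ids here are made up):
> >
> >     ceph osd pg-upmap-items 7.2a 31 45
> >
> > i.e. "remap PG 7.2a from OSD 31 to OSD 45", which you can paste straight
> > into a shell.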
> >
> > Maybe it helps you :)
> >
> > -- Jonas
> >
> > On 27/01/2021 17.15, Francois Legrand wrote:
> >> Hi all,
> >> I have a cluster with 116 disks (24 new 16TB disks added in December, 
> >> the rest being 8TB) running Nautilus 14.2.16.
> >> I moved (8 months ago) from crush-compat to upmap balancing.
> >> But the cluster does not seem well balanced: the number of PGs on the 8TB 
> >> disks varies from 26 to 52, and their usage from 35 to 69%.
> >> The recent 16TB disks are more homogeneous, with 48 to 61 PGs and usage 
> >> between 30 and 43%.
> >> Last week, I realized that some OSDs were maybe not using upmap, because 
> >> ceph osd crush weight-set ls returned (compat).
> >> So I ran ceph osd crush weight-set rm-compat, which triggered some 
> >> rebalancing. There has been no recovery for 2 days now, but the cluster is 
> >> still unbalanced.
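> >> (I suppose the upmap exceptions that are actually in place can be listed 
> >> with something like ceph osd dump | grep pg_upmap.)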
> >> As far as I understand, upmap is supposed to reach an equal number of PGs 
> >> on all the disks (weighted by their capacity, I guess).
> >> Thus I would expect roughly 30 PGs on the 8TB disks and 60 on the 16TB 
> >> ones, with around 50% usage everywhere, which is not the case (by far).
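> >> (The per-OSD PG counts and usage can be read from the PGS and %USE columns 
> >> of ceph osd df tree.)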
> >> The problem is that this impacts the free space reported for the pools 
> >> (264Ti, while there is more than 578Ti free in the cluster), because the 
> >> free space seems to be based on the space available before the first OSD 
> >> gets full.
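> >> (Presumably this is the MAX AVAIL column of ceph df detail; as far as I 
> >> understand, it is projected from the fullest OSD under the pool's CRUSH 
> >> root and then divided by the pool's replication factor, so a single OSD at 
> >> 69% full drags down the reported free space of the whole pool.)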
> >> Is that normal? Did I miss something? What could I do?
> >>
> >> F.
_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
