[ceph-users] Ceph Balancer code

EDH - Manuel Rios Fernandez Sat, 17 Aug 2019 15:07:52 -0700

 

Hi,


 

What is the reason the balancer is not allowed to optimize PGs when PGs are
inactive or objects are misplaced, at least in Nautilus 14.2.2?

 
https://github.com/ceph/ceph/blob/master/src/pybind/mgr/balancer/module.py#L874

 


        if unknown > 0.0:
            detail = 'Some PGs (%f) are unknown; try again later' % unknown
            self.log.info(detail)
            return -errno.EAGAIN, detail
        elif degraded > 0.0:
            detail = 'Some objects (%f) are degraded; try again later' % degraded
            self.log.info(detail)
            return -errno.EAGAIN, detail
        elif inactive > 0.0:
            detail = 'Some PGs (%f) are inactive; try again later' % inactive
            self.log.info(detail)
            return -errno.EAGAIN, detail
        elif misplaced >= max_misplaced:
            detail = 'Too many objects (%f > %f) are misplaced; ' \
                     'try again later' % (misplaced, max_misplaced)
            self.log.info(detail)
            return -errno.EAGAIN, detail
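
To see which of these conditions blocks a run, the checks can be reproduced outside the mgr module. This is only a minimal sketch: the function name `balancer_gate` and its plain float arguments are assumptions for illustration and are not part of the Ceph balancer module. Only the order of the checks and the message texts come from the snippet above.

```python
import errno


def balancer_gate(unknown, degraded, inactive, misplaced, max_misplaced):
    """Mirror the quoted checks; return (0, '') when nothing blocks.

    All arguments are ratios between 0.0 and 1.0. Hypothetical standalone
    helper, not the module's real interface.
    """
    if unknown > 0.0:
        detail = 'Some PGs (%f) are unknown; try again later' % unknown
        return -errno.EAGAIN, detail
    elif degraded > 0.0:
        detail = 'Some objects (%f) are degraded; try again later' % degraded
        return -errno.EAGAIN, detail
    elif inactive > 0.0:
        detail = 'Some PGs (%f) are inactive; try again later' % inactive
        return -errno.EAGAIN, detail
    elif misplaced >= max_misplaced:
        detail = 'Too many objects (%f > %f) are misplaced; ' \
                 'try again later' % (misplaced, max_misplaced)
        return -errno.EAGAIN, detail
    return 0, ''


# Example: 1% degraded objects is enough to refuse the run, even though
# nothing is unknown or inactive. The 0.05 limit is an illustrative value.
code, detail = balancer_gate(0.0, 0.01, 0.0, 0.0, 0.05)
print(code == -errno.EAGAIN, detail)
```

Because the branches are evaluated in order, any non-zero degraded ratio stops the balancer before the misplaced limit is even considered.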

 

Much of the time objects end up misplaced and degraded precisely because the
balancer only runs during healthy periods. From my point of view, there are
"misplaced" and "degraded" states in which the balancer becomes a must. As it
stands, the Ceph admin has to do a manual ceph reweight to do the balancer's
job and get the cluster healthy again, just so that the balancer is allowed
to start working.

 

We can understand that the balancer can't work with PGs in unknown or
inactive states. But why missing (degraded) and misplaced objects?
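
To make the request concrete, here is a purely hypothetical sketch of a gate that refuses only on unknown or inactive PGs and lets degraded or misplaced objects through. The name `relaxed_gate` and its arguments are assumptions for illustration. This is not existing Ceph behaviour and not a tested patch; whether such a relaxation is safe is exactly the question for the developers.

```python
import errno


def relaxed_gate(unknown, inactive):
    """Hypothetical gate: block only on unknown or inactive PGs.

    Degraded and misplaced ratios are deliberately not checked here.
    Not part of the Ceph balancer module.
    """
    if unknown > 0.0:
        return -errno.EAGAIN, 'Some PGs (%f) are unknown; try again later' % unknown
    if inactive > 0.0:
        return -errno.EAGAIN, 'Some PGs (%f) are inactive; try again later' % inactive
    return 0, ''


# With no unknown or inactive PGs the run is allowed, regardless of how
# many objects are degraded or misplaced.
print(relaxed_gate(0.0, 0.0))
```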

 

I hope a developer can clarify this. These lines cause a lot of problems, at
least in Nautilus 14.2.2.

 

Case example:

*       Pool with size 1, upgraded to size 2. The cluster goes into a warning
state with misplaced and degraded objects. Some objects don't recover from
the degraded state because of "backfill_toofull": some OSDs became full
instead of the data being evenly distributed and balanced, because the
balancer code refuses to run in this state.
*       Solution: a manual reweight, but that makes no sense.

 

 

Regards

Manuel

_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
