2018-02-16 10:16 GMT+01:00 Dan van der Ster <d...@vanderster.com>:

> Hi Caspar,
>
> I've been trying the mgr balancer for a couple of weeks now and can
> share some experience.
>
> Currently there are two modes implemented: upmap and crush-compat.
>
> Upmap requires all clients to be running luminous -- it uses the new
> pg-upmap mechanism to precisely move PGs one by one to a more balanced
> layout. The upmap mode balances only on the number of PGs, AFAICT, and
> on at least one of our clusters it happens to be moving PGs in a pool
> with no data -- useless. Checking the implementation, it should be
> upmapping PGs from a random pool each iteration -- I have a tracker
> open for this: http://tracker.ceph.com/issues/22431
>
> Upmap is the future, but for now I'm trying to exercise the
> crush-compat mode on some larger clusters. It's still early days, but
> in general it seems to be working in the right direction.
> crush-compat does two things: first, it creates a new "compat" crush
> weight-set to give underutilized OSDs more crush weight; second, it
> phases the osd reweights back out to 1.0. So, if you have a cluster
> that was previously balanced with ceph osd reweight-by-*, then
> crush-compat will gently bring you to the new balancing strategy.
>
> A few issues have been spotted in 12.2.2: some of the balancer
> config-key settings aren't cast properly to int/float, so they can
> break the balancer; and, more importantly, the mgr doesn't refresh
> config-keys when they change. So if you do change the configuration,
> you need to run "ceph mgr fail <theactiveone>" to force the next mgr
> to reload the config.
>
> My current config is:
>
> ceph config-key dump
> {
>     "mgr/balancer/active": "1",
>     "mgr/balancer/begin_time": "0830",
>     "mgr/balancer/end_time": "1600",
>     "mgr/balancer/max_misplaced": "0.01",
>     "mgr/balancer/mode": "crush-compat"
> }
>
> Note that begin_time/end_time seem to be in UTC, not the local time
> zone. max_misplaced defaults to 0.05 and limits the percentage of
> PGs/objects that may be rebalanced in each iteration.
>
> I have the balancer enabled (ceph balancer on), which means it tries
> to balance every 60s. It will skip an iteration if the misplaced ratio
> is greater than max_misplaced, or if any objects are degraded.
>
> When you're first trying the balancer, do the following to test a
> one-off balancing (rather than the always-on mode that I use):
>
> - set debug_mgr=4/5  # then you can "tail -f ceph-mgr.*.log | grep
>   balancer" to see what it's doing
> - ceph balancer mode crush-compat
> - ceph balancer eval  # check the current score
> - ceph balancer optimize myplan  # create, but do not execute, a new plan
> - ceph balancer eval myplan  # check what the score would be after
>   myplan. Is it getting closer to the optimal value 0?
> - ceph balancer show myplan  # study what it's trying to do
> - ceph balancer execute myplan  # execute the plan. Data movement
>   starts here!
> - ceph balancer reset  # we do this because "balancer rm" is broken,
>   and myplan isn't removed automatically after execution
>
> v12.2.3 has quite a few balancer fixes, and also adds pool-specific
> balancing (which should hopefully fix my upmap issue).
>
> Hope that helps!
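
For anyone who wants to try the upmap mode Dan describes, a minimal
sketch of the prerequisite checks (standard luminous-era commands, but
verify them against your own release):

  ceph features                                    # all clients should report luminous
  ceph osd set-require-min-compat-client luminous  # refuse pre-luminous clients
  ceph balancer mode upmap

The second command is what actually guards upmap: the balancer cannot
safely use pg-upmap entries while older clients may still connect.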
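A config like Dan's can be reproduced key by key; because of the
config-key refresh bug he mentions, a mgr failover is needed before new
values take effect. A sketch, where the mgr name "a" is just a
placeholder for your active mgr:

  ceph config-key set mgr/balancer/mode crush-compat
  ceph config-key set mgr/balancer/max_misplaced 0.01
  ceph config-key set mgr/balancer/begin_time 0830   # interpreted as UTC, per Dan's note
  ceph config-key set mgr/balancer/end_time 1600     # interpreted as UTC
  ceph mgr fail a   # force a failover so the next mgr reloads the keys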
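And to run it always-on as Dan does, a minimal sketch (the status
subcommand's output varies by release, so treat it as illustrative):

  ceph mgr module enable balancer   # if the module isn't already loaded
  ceph balancer on                  # balance automatically, every 60s by default
  ceph balancer eval                # current cluster score
  ceph balancer status              # active flag, mode, and pending plans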
It sure does, Dan! Thank you very much for your detailed answer.
I will start testing the balancer module with our demo cluster.

Caspar

> Dan
>
>
> On Fri, Feb 16, 2018 at 9:22 AM, Caspar Smit <caspars...@supernas.eu> wrote:
> > Hi,
> >
> > In Sage's talk at LinuxConfAU about making distributed storage easy,
> > he mentioned the Balancer Manager module. After enabling this module,
> > PGs should get balanced automagically around the cluster.
> >
> > The module was added in Ceph Luminous v12.2.2.
> >
> > Since I couldn't find much documentation about this module, I was
> > wondering whether it is considered stable (production ready) or still
> > experimental/WIP.
> >
> > Here's the original mailing list post describing the module:
> >
> > https://www.spinics.net/lists/ceph-devel/msg37730.html
> >
> > A few questions:
> >
> > What are the differences between the different optimization modes?
> > Is the balancer run at certain intervals? If yes, what is the interval?
> > Will this trigger continuous backfilling/recovery of PGs when a
> > cluster is mostly under write load?
> >
> > Kind regards,
> > Caspar
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com