See responses below.

> On Aug 28, 2019, at 11:13 PM, Konstantin Shalygin <k0...@k0ste.ru> wrote:
>
>> Just a follow up 24h later, and the mgrs seem to be far more stable, and
>> have had no issues or weirdness after disabling the balancer module.
>>
>> Which isn't great, because the balancer plays an important role, but after
>> fighting distribution for a few weeks and getting it 'good enough', I'm
>> taking the stability.
>>
>> Just wanted to follow up with another 2¢.
>
> What are your balancer settings (`ceph config-key ls`)? Is your mgr running
> in a virtual environment or on bare metal?

Bare metal.

$ ceph config-key ls | grep balance
    "config/mgr/mgr/balancer/active",
    "config/mgr/mgr/balancer/max_misplaced",
    "config/mgr/mgr/balancer/mode",
    "config/mgr/mgr/balancer/pool_ids",
    "mgr/balancer/active",
    "mgr/balancer/max_misplaced",
    "mgr/balancer/mode",
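For context on what "disabling" means here: those config-key entries are just
what the balancer CLI writes, so inspecting and toggling it is roughly the
stock balancer commands:

$ ceph balancer status        # reports the active flag, mode, and any queued plans
$ ceph balancer off           # stop automatic balancing (flips the active key shown above)
$ ceph balancer mode upmap    # or crush-compat; stored as mgr/balancer/mode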
> How many pools do you have? Please also paste `ceph osd tree` & `ceph osd
> df tree`.

$ ceph osd pool ls detail
pool 16 replicated crush_rule 1 object_hash rjenkins pg_num 4 autoscale_mode warn last_change 157895 lfor 0/157895/157893 flags hashpspool,nodelete stripe_width 0 application cephfs
pool 17 replicated crush_rule 0 object_hash rjenkins pg_num 1024 autoscale_mode warn last_change 174817 flags hashpspool,nodelete stripe_width 0 compression_algorithm snappy compression_mode aggressive application cephfs
pool 20 replicated crush_rule 2 object_hash rjenkins pg_num 4096 autoscale_mode warn last_change 174817 flags hashpspool,nodelete stripe_width 0 application freeform
pool 24 replicated crush_rule 0 object_hash rjenkins pg_num 16 autoscale_mode warn last_change 174817 lfor 0/157704/157702 flags hashpspool stripe_width 0 compression_algorithm snappy compression_mode none application freeform
pool 29 replicated crush_rule 2 object_hash rjenkins pg_num 128 autoscale_mode warn last_change 174817 lfor 0/0/142604 flags hashpspool,selfmanaged_snaps stripe_width 0 application rbd
pool 30 replicated crush_rule 0 object_hash rjenkins pg_num 1 autoscale_mode warn last_change 174817 flags hashpspool stripe_width 0 pg_num_min 1 application mgr_devicehealth
pool 31 replicated crush_rule 2 object_hash rjenkins pg_num 16 autoscale_mode warn last_change 174926 flags hashpspool,selfmanaged_snaps stripe_width 0 application rbd

`ceph osd tree` and `ceph osd df tree`: https://pastebin.com/bXPs28h1

> Measure time of balancer plan creation: `time ceph balancer optimize new`.

I hadn't seen this optimize command yet; I was always doing balancer eval
$plan and balancer execute $plan (roughly the sequence sketched after the
timing below).

$ time ceph balancer optimize newplan1
Error EALREADY: Unable to find further optimization, or pool(s)' pg_num is
decreasing, or distribution is already perfect

real    3m10.627s
user    0m0.352s
sys     0m0.055s
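For anyone else who has only been driving it by hand, the manual plan workflow
is roughly the following (stock balancer commands; `myplan` is just an
arbitrary plan name):

$ ceph balancer eval              # score the current distribution (lower is better)
$ ceph balancer optimize myplan   # compute a plan, as timed above
$ ceph balancer eval myplan       # score what the cluster would look like after the plan
$ ceph balancer show myplan       # print the changes the plan would make
$ ceph balancer execute myplan    # apply the plan
$ ceph balancer rm myplan         # discard it afterwards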
Reed