See responses below.

> On Aug 28, 2019, at 11:13 PM, Konstantin Shalygin <k0...@k0ste.ru> wrote:
>> Just a follow-up 24h later: the mgrs seem to be far more stable, and have
>> had no issues or weirdness after disabling the balancer module.
>> 
>> Which isn't great, because the balancer plays an important role, but after
>> fighting distribution for a few weeks and getting it 'good enough', I'm
>> taking the stability.
>> 
>> Just wanted to follow up with another 2¢.
> What are your balancer settings (`ceph config-key ls`)? Is your mgr running
> in a virtual environment or on bare metal?

Bare metal.

$ ceph config-key ls | grep balance
    "config/mgr/mgr/balancer/active",
    "config/mgr/mgr/balancer/max_misplaced",
    "config/mgr/mgr/balancer/mode",
    "config/mgr/mgr/balancer/pool_ids",
    "mgr/balancer/active",
    "mgr/balancer/max_misplaced",
    "mgr/balancer/mode",


> How many pools do you have? Please also paste `ceph osd tree` & `ceph osd df
> tree`.

$ ceph osd pool ls detail
pool 16 replicated crush_rule 1 object_hash rjenkins pg_num 4    autoscale_mode warn last_change 157895 lfor 0/157895/157893 flags hashpspool,nodelete stripe_width 0 application cephfs
pool 17 replicated crush_rule 0 object_hash rjenkins pg_num 1024 autoscale_mode warn last_change 174817 flags hashpspool,nodelete stripe_width 0 compression_algorithm snappy compression_mode aggressive application cephfs
pool 20 replicated crush_rule 2 object_hash rjenkins pg_num 4096 autoscale_mode warn last_change 174817 flags hashpspool,nodelete stripe_width 0 application freeform
pool 24 replicated crush_rule 0 object_hash rjenkins pg_num 16   autoscale_mode warn last_change 174817 lfor 0/157704/157702 flags hashpspool stripe_width 0 compression_algorithm snappy compression_mode none application freeform
pool 29 replicated crush_rule 2 object_hash rjenkins pg_num 128  autoscale_mode warn last_change 174817 lfor 0/0/142604 flags hashpspool,selfmanaged_snaps stripe_width 0 application rbd
pool 30 replicated crush_rule 0 object_hash rjenkins pg_num 1    autoscale_mode warn last_change 174817 flags hashpspool stripe_width 0 pg_num_min 1 application mgr_devicehealth
pool 31 replicated crush_rule 2 object_hash rjenkins pg_num 16   autoscale_mode warn last_change 174926 flags hashpspool,selfmanaged_snaps stripe_width 0 application rbd

https://pastebin.com/bXPs28h1

> Measure time of balancer plan creation: `time ceph balancer optimize new`.

I hadn't seen this optimize command yet; I was always doing `ceph balancer
eval $plan` and `ceph balancer execute $plan`.
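
For anyone else who missed it, the full plan workflow seems to be roughly as
follows (a sketch based on the balancer's built-in help; the plan name is
arbitrary):

$ ceph balancer optimize myplan    # compute a plan of PG remappings
$ ceph balancer show myplan        # inspect the proposed changes
$ ceph balancer eval myplan        # score the distribution the plan would produce
$ ceph balancer execute myplan     # apply it (or: ceph balancer rm myplan)
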
$ time ceph balancer optimize newplan1
Error EALREADY: Unable to find further optimization, or pool(s)' pg_num is
decreasing, or distribution is already perfect

real    3m10.627s
user    0m0.352s
sys     0m0.055s
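
For reference, "disabling the balancer module" can be done either by just
turning automatic balancing off or by unloading the plugin entirely (commands
as of Nautilus):

$ ceph balancer off                  # stop automatic optimization
$ ceph balancer status               # should now report "active": false
$ ceph mgr module disable balancer   # optionally unload the module itself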

Reed
