Yeah, this can happen during deep scrub and also during rebalancing. I forgot 
to mention that.
Generally, it is a good idea to throttle those. For deep scrub, you can try 
the following (I got these from an old post; I have never used them myself):

osd_scrub_chunk_min = 1

osd_scrub_chunk_max = 1

osd_scrub_sleep = 0.1
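
To get a feel for what these values trade away, here is a rough back-of-the-envelope sketch in Python. It assumes the OSD sleeps for osd_scrub_sleep seconds before each scrub chunk and that a chunk holds at most osd_scrub_chunk_max objects. The 10,000-object placement group is a made-up figure and the function name is only for illustration. It estimates the sleep time added to a scrub, not the total scrub time.

```python
import math


def added_scrub_sleep_seconds(num_objects, chunk_max, scrub_sleep):
    """Lower bound on the sleep time added to one scrub pass.

    Assumes one sleep of `scrub_sleep` seconds per chunk and that every
    chunk is filled to `chunk_max` objects (the fewest possible chunks).
    """
    chunks = math.ceil(num_objects / chunk_max)
    return chunks * scrub_sleep


# Hypothetical PG with 10,000 objects (not a figure from this thread).
objects_in_pg = 10_000

# osd_scrub_chunk_max = 1, osd_scrub_sleep = 0.1 (the values suggested here)
print(added_scrub_sleep_seconds(objects_in_pg, chunk_max=1, scrub_sleep=0.1))

# osd_scrub_chunk_max = 5, osd_scrub_sleep = 0.1 (the values in the quoted config)
print(added_scrub_sleep_seconds(objects_in_pg, chunk_max=5, scrub_sleep=0.1))
```

Under those assumptions, going from a chunk max of 5 to 1 means roughly five times as many pauses per scrub, so scrubs should be gentler on the disks but take longer to finish.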

For rebalancing, I think you are already using proper values.

I don't think this will eliminate the problem altogether, but it should 
alleviate it a bit.

Also, why are you using so many shards? How many OSDs are you running in a 
box? 25 shards should be fine if you are running a single OSD. If you have a 
lot of OSDs in a box, try reducing it to ~5 or so.
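
The reasoning behind that suggestion can be sketched with a little arithmetic. The snippet below assumes each OSD runs osd_op_num_shards x osd_op_num_threads_per_shard op worker threads. The OSD counts are hypothetical, since the number of OSDs per box has not been stated in this thread, and the function name is only for illustration.

```python
def op_worker_threads_per_host(osds_per_host, num_shards, threads_per_shard):
    """Total sharded op worker threads across all OSDs on one host."""
    return osds_per_host * num_shards * threads_per_shard


# osd_op_num_shards = 25, osd_op_num_threads_per_shard = 1 (from the quoted config)
for osds in (1, 12):  # hypothetical OSD counts per box
    print(osds, "OSD(s):", op_worker_threads_per_host(osds, 25, 1), "threads")

# Same hypothetical 12-OSD box with the shard count reduced to 5
print(12, "OSD(s):", op_worker_threads_per_host(12, 5, 1), "threads")
```

The point is only that the per-OSD shard count multiplies with the number of OSDs in the box, so a value that is reasonable for one OSD can oversubscribe the CPUs on a dense node.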

Thanks & Regards


From: Tuomas Juntunen
Sent: Wednesday, July 01, 2015 8:18 PM
To: Somnath Roy; 'ceph-users'
Subject: RE: [ceph-users] One of our nodes has logs saying: wrongly marked me down

I've checked the network, we use IPoIB and all nodes are connected to the same 
switch, there are no breaks in connectivity while this happens. My constant 
ping says 0.03 - 0.1ms. I would say this is ok.

This happens almost every time deep scrubbing is running. The load on 
this particular server goes to 300+ and the OSDs are marked down.

Any suggestions on settings? I currently have the following settings that 
might affect this:

    osd_op_threads = 6
    osd_op_num_threads_per_shard = 1
    osd_op_num_shards = 25
    #osd_op_num_sharded_pool_threads = 25
    filestore_op_threads = 6
    ms_nocrc = true
    filestore_fd_cache_size = 64
    filestore_fd_cache_shards = 32
    ms_dispatch_throttle_bytes = 0
    throttler_perf_counter = false

    osd scrub load threshold = 0.1
    osd max backfills = 1
    osd recovery max active = 1
    osd scrub sleep = .1
    osd disk thread ioprio class = idle
    osd disk thread ioprio priority = 7
    osd scrub chunk max = 5
    osd deep scrub stride = 1048576
    filestore queue max ops = 10000
    filestore max sync interval = 30
    filestore min sync interval = 29
    osd_client_message_size_cap = 0
    osd_client_message_cap = 0
    osd_enable_op_tracker = false

Br, T

From: Somnath Roy
Sent: 2 July 2015 0:30
To: Tuomas Juntunen; 'ceph-users'
Subject: RE: [ceph-users] One of our nodes has logs saying: wrongly marked me down

This can happen if your OSDs are flapping. I hope your network is stable.

Thanks & Regards

From: ceph-users On Behalf Of Tuomas
Sent: Wednesday, July 01, 2015 2:24 PM
To: 'ceph-users'
Subject: [ceph-users] One of our nodes has logs saying: wrongly marked me down


One of our nodes has OSD logs that say "wrongly marked me down" for every OSD 
at some point. What could be the reason for this? Anyone have any similar 

Other nodes work totally fine and they are all identical.


