On Sat, Jul 19, 2014 at 11:08 AM, Wang Haomai <haomaiw...@gmail.com> wrote:
> Oh, it's our fault.
>
> Public_addr and cluster_addr use the same NIC (eth1). But we found that
> during recovery, heartbeats may time out because of busy traffic. I
> *misunderstood* the meaning of heartbeat and used another NIC (eth0)
> address for heartbeats to avoid the timeouts.

Hmm, the only times we've seen heartbeats time out from sharing the NIC
are when there are other issues on the server (e.g., NIC interrupt
handling is going to a single core and saturating it). If you've seen
this under normal recovery conditions, we'd like to gather more
information and figure out what happened!
-Greg
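
One way to check for the single-core interrupt case mentioned above is to look at how the NIC's interrupts are spread across CPUs. The following is only a minimal sketch, assuming a Linux host whose /proc/interrupts output puts the interface name (here eth1) in each IRQ label. The function names and the sample data are hypothetical and are not part of Ceph.

```python
def irq_counts_per_cpu(interrupts_text, device):
    """Sum per-CPU interrupt counts over every IRQ line whose label mentions `device`.

    Note: this is a plain substring match, so "eth1" would also match "eth10".
    """
    lines = interrupts_text.strip().splitlines()
    cpus = lines[0].split()                      # header row: CPU0 CPU1 ...
    totals = [0] * len(cpus)
    for line in lines[1:]:
        fields = line.split()
        if not fields or not fields[0].endswith(":"):
            continue
        counts = fields[1:1 + len(cpus)]
        label = " ".join(fields[1 + len(cpus):])
        if device not in label:
            continue
        if len(counts) < len(cpus) or not all(c.isdigit() for c in counts):
            continue
        for i, c in enumerate(counts):
            totals[i] += int(c)
    return dict(zip(cpus, totals))


def busiest_cpu_share(per_cpu):
    """Return (cpu_name, fraction of the device's interrupts handled by that CPU)."""
    total = sum(per_cpu.values())
    if total == 0:
        return None, 0.0
    cpu = max(per_cpu, key=per_cpu.get)
    return cpu, per_cpu[cpu] / total


# Made-up sample in the general shape of /proc/interrupts (not real measurements).
SAMPLE = """\
           CPU0       CPU1       CPU2       CPU3
  0:         36          0          0          0   IO-APIC-edge      timer
 64:     900000          0          0          0   PCI-MSI-edge      eth1-TxRx-0
 65:      50000      25000      25000          0   PCI-MSI-edge      eth1-TxRx-1
 66:         10         10         10         10   PCI-MSI-edge      eth0
ERR:          0
"""

per_cpu = irq_counts_per_cpu(SAMPLE, "eth1")
cpu, share = busiest_cpu_share(per_cpu)
```

On a real node, the text would come from open("/proc/interrupts").read() instead of SAMPLE. A share close to 1.0 for one CPU suggests that interrupt handling is concentrated on a single core, though it does not by itself prove that the core is saturated.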

>
> Your explanation makes it easy to understand, and I see that the code
> comments (src/ceph-osd.cc) describe this usage.
>
> Best Wishes!
>
>> On Jul 20, 2014, at 1:14, Gregory Farnum <g...@inktank.com> wrote:
>>
>> The heartbeat code is very careful to use the same physical interfaces as
>> 1) the cluster network
>> 2) the public network
>>
>> If the first breaks, the OSD can't talk with its peers. If the second
>> breaks, it can't talk with the monitors or clients. Either way, the
>> OSD can't do its job so it gets marked down.
>> -Greg
>> Software Engineer #42 @ http://inktank.com | http://ceph.com
>>
>>
>>> On Sat, Jul 19, 2014 at 3:08 AM, Haomai Wang <haomaiw...@gmail.com> wrote:
>>> Hi all,
>>>
>>> Each of our production Ceph nodes has two NICs: one used for
>>> heartbeats and the other used for the cluster_network.
>>>
>>> The heartbeat NIC accidentally broke while the cluster_network NIC
>>> stayed healthy. The OSDs reported the node with the broken NIC as
>>> unavailable, so the monitors decided to kick the node out.
>>>
>>> I'm not sure whether what I describe matches the code logic. If it
>>> does, wouldn't it be more reasonable for the ceph-osd process to
>>> detect that the cluster_network is healthy, so that the node with the
>>> broken NIC is not kicked out?
>>>
>>> --
>>> Best Regards,
>>>
>>> Wheat
>>> _______________________________________________
>>> ceph-users mailing list
>>> ceph-users@lists.ceph.com
>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
