> -----Original Message-----
> From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of 
> Trygve Vea
> Sent: 29 November 2016 14:07
> To: ceph-users <ceph-us...@ceph.com>
> Subject: [ceph-users] Regarding loss of heartbeats
> 
> Since Jewel, we've seen quite a bit of funky behaviour in Ceph.  I've written 
> about it a few times to the mailing list.
> 
> Higher CPU utilization after the upgrade, and loss of heartbeats.  We've 
> looked at our network setup, and we've optimized some potential bottlenecks 
> in a few places.
> 
> An interesting thing regarding the loss of heartbeats: we have observed OSDs 
> running on the same host losing heartbeats against each other.  I'm not sure 
> why they are connected at all (we have had some remapped/degraded placement 
> groups over the weekend, maybe that's why) - but I have a hard time pointing 
> the finger at our network when the heartbeat is lost between two OSDs on the 
> same server.
> 
> 
> I've been staring myself blind at this problem for a while, and just now 
> noticed a pretty new bug report that I want to believe is related to what I 
> am experiencing: http://tracker.ceph.com/issues/18042
> 
> We had one OSD hit the suicide timeout and kill itself off last night, and 
> one can see that several of these lost heartbeats are between OSDs on the 
> same node.  (zgrep '10.22.9.21.*10.22.9.21' ceph-osd.2.gz)
> 
> http://employee.tv.situla.bitbit.net/ceph-osd.2.gz
> 
> 
> Does anyone have any thoughts about this?  Are we stumbling on a known or 
> unknown bug in Ceph?

Hi Trygve,

I was seeing similar behaviour to you after upgrading to 10.2.3, definitely 
problems where OSDs on the same nodes were marking each other out even though 
the cluster was fairly idle. I found that it seemed to be caused by kernel 
4.7; nodes in the same cluster that were on 4.4 were unaffected. After 
downgrading all nodes to 4.4, everything has been really stable for me.
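
If it helps with narrowing it down, a rough way to correlate the two is to 
check the kernel on each OSD node and grep the OSD logs for the lost 
heartbeats (the hostnames and log paths below are only placeholders, and the 
exact log message may vary slightly between Ceph versions):

    # kernel version on each OSD node
    for h in node01 node02 node03; do ssh "$h" uname -r; done

    # lost heartbeats reported by osd.2, including peers on the same host
    zgrep 'heartbeat_check: no reply' /var/log/ceph/ceph-osd.2.log.*.gz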

Nick

> 
> 
> Regards
> --
> Trygve Vea

