I'm seeing one OSD spamming its log with:
2014-04-02 16:49:21.547339 7f5cc6c5d700 1 heartbeat_map is_healthy 'OSD::op_tp thread 0x7f5cc3456700' had timed out after 15

It starts about 30 seconds after the OSD daemon is started. It continues until
2014-04-02 16:48:57.526925 7f0e5a683700 1 heartbeat_map is_healthy 'OSD::op_tp thread 0x7f0e3c857700' had suicide timed out after 150
2014-04-02 16:48:57.528008 7f0e5a683700 -1 common/HeartbeatMap.cc: In function 'bool ceph::HeartbeatMap::_check(ceph::heartbeat_handle_d*, const char*, time_t)' thread 7f0e5a683700 time 2014-04-02 16:48:57.526948
common/HeartbeatMap.cc: 79: FAILED assert(0 == "hit suicide timeout")

I tried bumping up logging, and I don't see anything interesting. I tried strace, and all I can really see is that the OSD spends a lot of time in FUTEX_WAIT.
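
For anyone who wants to tally these messages rather than eyeball them, here is a minimal sketch that counts heartbeat_map timeouts per thread. It assumes the log lines look like the samples quoted above; the regex and the function name summarize_heartbeat_timeouts are illustrative only and are not part of Ceph, and other Ceph versions may format these lines differently.

```python
import re
from collections import defaultdict

# Pattern inferred from the sample log lines in this post (an assumption,
# not an official Ceph log format specification).
HEARTBEAT_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+)\s+\S+\s+-?\d+\s+"
    r"heartbeat_map is_healthy '(?P<thread>[^']+)' "
    r"had (?P<kind>suicide timed out|timed out) after (?P<secs>\d+)"
)


def summarize_heartbeat_timeouts(lines):
    """Count timeout messages per thread and keep first/last timestamps."""
    summary = defaultdict(
        lambda: {"timed out": 0, "suicide timed out": 0, "first": None, "last": None}
    )
    for line in lines:
        match = HEARTBEAT_RE.match(line.strip())
        if match is None:
            continue  # not a heartbeat_map timeout line
        entry = summary[match.group("thread")]
        entry[match.group("kind")] += 1
        if entry["first"] is None:
            entry["first"] = match.group("ts")
        entry["last"] = match.group("ts")
    return dict(summary)


if __name__ == "__main__":
    sample = [
        "2014-04-02 16:49:21.547339 7f5cc6c5d700 1 heartbeat_map is_healthy "
        "'OSD::op_tp thread 0x7f5cc3456700' had timed out after 15",
        "2014-04-02 16:48:57.526925 7f0e5a683700 1 heartbeat_map is_healthy "
        "'OSD::op_tp thread 0x7f0e3c857700' had suicide timed out after 150",
    ]
    for thread, stats in summarize_heartbeat_timeouts(sample).items():
        print(thread, stats)
```

To run it against a real log, pass an open file object (for example, `open(path)` on the OSD's log file) instead of the sample list.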

This OSD has been flapping for several days now. None of the other OSDs are having this issue. I thought it might be similar to Quenten Grasso's post about 'OSD Restarts cause excessively high load average and "requests are blocked > 32 sec"'. At first it looks similar, but Quenten said his OSDs eventually settle down. Mine never does.



Can I increase that 15-second timeout, to see if the OSD just needs additional time? I don't see anything in the Ceph docs about this.

Otherwise, I'm pretty close to removing the disk, zapping it, and adding it back to the cluster. Any other suggestions?

--

*Craig Lewis*
Senior Systems Engineer
Office +1.714.602.1309
Email cle...@centraldesktop.com <mailto:cle...@centraldesktop.com>


