[ceph-users] OSD fails to start after power failure

David Young Sat, 14 Jul 2018 14:56:16 -0700

Hey folks,

I have a Luminous 12.2.6 cluster which suffered a power failure recently. On recovery, one of my OSDs is continually crashing and restarting, with the error below:


----
9ae00 con 0
    -3> 2018-07-15 09:50:58.313242 7f131c5a9700 10 monclient: tick
    -2> 2018-07-15 09:50:58.313277 7f131c5a9700 10 monclient: _check_auth_rotating have uptodate secrets (they expire after 2018-07-15 09:50:28.313274)
    -1> 2018-07-15 09:50:58.313320 7f131c5a9700 10 log_client log_queue is 8 last_log 10 sent 0 num 8 unsent 10 sending 10
     0> 2018-07-15 09:50:58.320255 7f131c5a9700 -1 /build/ceph-12.2.6/src/common/LogClient.cc: In function 'Message* LogClient::_get_mon_log_message()' thread 7f131c5a9700 time 2018-07-15 09:50:58.313336
/build/ceph-12.2.6/src/common/LogClient.cc: 294: FAILED assert(num_unsent <= log_queue.size())

----
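
As a quick sanity check, the numbers in the "-1>" line already appear to violate the asserted condition. The sketch below is only an illustration: it assumes that "log_queue is 8" reports log_queue.size() and that "unsent 10" reports num_unsent, which is my reading of the log line rather than something confirmed against the source.

```python
import re

# The "-1>" log_client line from the dump above, copied verbatim.
line = "log_client log_queue is 8 last_log 10 sent 0 num 8 unsent 10 sending 10"

# Assumption: "log_queue is N" is log_queue.size() and "unsent N" is num_unsent.
queue_size = int(re.search(r"log_queue is (\d+)", line).group(1))
num_unsent = int(re.search(r"unsent (\d+)", line).group(1))

# Mirrors the condition in assert(num_unsent <= log_queue.size()).
invariant_holds = num_unsent <= queue_size
print(queue_size, num_unsent, invariant_holds)
```

If that reading is right, the client believes it has 10 unsent entries while only 8 are queued, which would explain why the assert fires on the next tick.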

I've found a few recent references to this "FAILED assert" message (assuming that's the cause of the problem), such as https://bugzilla.redhat.com/show_bug.cgi?id=1599718 and http://tracker.ceph.com/issues/18209, with the most recent occurrence being 3 days ago (http://tracker.ceph.com/issues/18209#note-12).


Is there any resolution to this issue, or anything I can attempt to recover?

Thanks!
D

_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
