On 06/04/2019 07:01 PM, Jianyu Li wrote:
> Hello,
> 
> I have a Ceph cluster that has been running for over 2 years, and the
> monitors began crashing yesterday. I have had some OSDs flapping up and
> down occasionally, and sometimes I need to rebuild an OSD. I found 3 OSDs
> down yesterday; they may or may not be the cause of this issue.
> 
> Ceph version: 12.2.12 (upgrading from 12.2.8 did not fix the issue).
> I have 5 mon nodes. When I start the mon service on the first 2 nodes,
> they are fine. Once I start the service on the third node, all 3 nodes
> begin flapping up and down due to an abort in
> OSDMonitor::build_incremental. I also tried to recover the monitors from
> 1 node (removing the other 4 nodes) by injecting a monmap, but that node
> keeps crashing as well.

Please increase debug levels to 'debug_mon = 10', 'debug_paxos = 10',
and send us the log once you have your next crash.
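
Since the monitors are crashing shortly after start, one way to make
sure the levels are already in effect at startup is to set them in
ceph.conf on each mon node and then restart the mon. A minimal sketch,
assuming the default /etc/ceph/ceph.conf and that you have no
conflicting per-daemon overrides:

```ini
# /etc/ceph/ceph.conf (assumed default location)
[mon]
    # Verbose monitor and Paxos logging, as requested above
    debug mon = 10
    debug paxos = 10
```

With the default log settings, the output should end up in the mon's
log file under /var/log/ceph/ on that node.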

This may be a few things, but I'm guessing your other monitors have a
corrupted store somehow. Were there any hardware failures recently
before the crashes started happening?

  -Joao

