[ceph-users] Re: mon stucks on probing and out of quorum, after down and restart

2021-09-10 Thread Konstantin Shalygin
Yes, try Wido's script (remove the quorum logic or execute the commands by hand): https://gist.github.com/wido/561c69dc2ec3a49d1dba10a59b53dfe5 k > On 10 Sep 2021, at 14:57, mk wrote: > > I have just seen that on failed mon store

[ceph-users] Re: mon stucks on probing and out of quorum, after down and restart

2021-09-10 Thread Eugen Block
Redeploying would probably be the fastest way if you don't want your cluster in a degraded state for too long. You can check the logs afterwards to see what went wrong. Quoting mk: I have just seen that on the failed mon the store.db size is 50K, but on both healthy mons it is 151M. What is th
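A manual redeploy of the failed mon could look roughly like the sketch below, which follows the manual add/remove monitor steps from the Ceph documentation. Only the mon ID `amon3` comes from the thread; the data path, the `/tmp` file names, the `.old` suffix and the `run` dry-run helper are assumptions for illustration. Clusters managed by a deployment tool should use that tool instead. By default the script only prints the commands.

```shell
#!/bin/sh
# Sketch: redeploy one monitor by hand. Prints commands unless RUN=1 is set.
# Assumed defaults: package-style data dir and temporary file names.
MON_ID="${MON_ID:-amon3}"
MON_DIR="${MON_DIR:-/var/lib/ceph/mon/ceph-$MON_ID}"

run() {
    if [ "${RUN:-0}" = "1" ]; then "$@"; else echo "+ $*"; fi
}

redeploy_mon() {
    run systemctl stop "ceph-mon@$MON_ID"
    # Drop the broken mon from the monmap; the two healthy mons keep quorum.
    run ceph mon remove "$MON_ID"
    # Keep the old data directory around for later inspection.
    run mv "$MON_DIR" "$MON_DIR.old"
    run mkdir -p "$MON_DIR"
    # Fetch the mon keyring and the current monmap from the healthy quorum.
    run ceph auth get mon. -o /tmp/mon.keyring
    run ceph mon getmap -o /tmp/monmap
    # Build a fresh store, then hand ownership to the ceph user.
    run ceph-mon -i "$MON_ID" --mkfs --monmap /tmp/monmap --keyring /tmp/mon.keyring
    run chown -R ceph:ceph "$MON_DIR"
    run systemctl start "ceph-mon@$MON_ID"
}
```

Calling `redeploy_mon` without `RUN=1` lists the nine steps so they can be reviewed or executed by hand; once the new mon starts it should sync its store from the other two mons.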

[ceph-users] Re: mon stucks on probing and out of quorum, after down and restart

2021-09-10 Thread mk
I have just seen that on the failed mon the store.db size is 50K, but on both healthy mons it is 151M. What is the best practice? Redeploy the failed mon? > On 10. Sep 2021, at 13:08, Eugen Block wrote: > > Yes, give it a try. If the cluster is otherwise healthy, it shouldn't be a > problem. > > > Quoting
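The size comparison mentioned here can be scripted. The following is a minimal sketch; the function names and the 10x threshold are made up for illustration and are not a Ceph rule, and the store path in the usage note assumes the default package layout.

```shell
#!/bin/sh
# Print the size of a directory in KiB (POSIX du -k).
store_kb() {
    du -sk "$1" | cut -f1
}

# Succeed when the local store is less than a tenth of a reference size.
# The factor 10 is an arbitrary illustration, not a Ceph threshold.
store_looks_small() {
    local_kb=$(store_kb "$1")
    ref_kb=$2
    [ $((local_kb * 10)) -lt "$ref_kb" ]
}
```

For example, `store_looks_small /var/lib/ceph/mon/ceph-amon3/store.db 154624` compares the local store against roughly 151M expressed in KiB, the size reported for the healthy mons.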

[ceph-users] Re: mon stucks on probing and out of quorum, after down and restart

2021-09-10 Thread Eugen Block
Is there anything wrong with the directory permissions? What does the mon log tell you? Quoting mk: No, the mon daemon doesn't start: Sep 10 13:35:55 amon3 systemd[1]: ceph-mon@amon3.service: Start request repeated too quickly. Sep 10 13:35:55 amon3 systemd[1]: ceph-mon@amon3.service: Fa
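One way to check the permissions question is to list everything under the mon data directory that is not owned by the expected user. This is a small sketch; the helper name is hypothetical, and the `ceph` user and the path in the usage note assume a default package install.

```shell
#!/bin/sh
# List everything under a directory that is NOT owned by the given user.
# Empty output means ownership looks consistent.
not_owned_by() {
    dir=$1
    owner=$2
    find "$dir" ! -user "$owner" -print
}
```

Running `not_owned_by /var/lib/ceph/mon/ceph-amon3 ceph` should print nothing on a correctly owned mon directory; any path it does print is a candidate for `chown`. The mon log itself is usually found at `/var/log/ceph/ceph-mon.amon3.log` on such installs.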

[ceph-users] Re: mon stucks on probing and out of quorum, after down and restart

2021-09-10 Thread mk
No, the mon daemon doesn't start: Sep 10 13:35:55 amon3 systemd[1]: ceph-mon@amon3.service: Start request repeated too quickly. Sep 10 13:35:55 amon3 systemd[1]: ceph-mon@amon3.service: Failed with result 'exit-code'. Sep 10 13:35:55 amon3 systemd[1]: Failed to start Ceph cluster monitor daemon.
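"Start request repeated too quickly" only says that systemd hit its start rate limit; the actual ceph-mon error appears earlier in the journal. A sketch for clearing the limit and reading the unit's messages follows. The unit name is taken from the log lines above, while the `run` dry-run helper and the 50-line window are assumptions for illustration. By default it only prints the commands.

```shell
#!/bin/sh
# Sketch: retry a rate-limited unit and show its recent journal entries.
# Prints commands unless RUN=1 is set.
UNIT="${UNIT:-ceph-mon@amon3.service}"

run() {
    if [ "${RUN:-0}" = "1" ]; then "$@"; else echo "+ $*"; fi
}

retry_mon_start() {
    # Clear the failed state and the start rate limit counter.
    run systemctl reset-failed "$UNIT"
    run systemctl start "$UNIT"
    # Show the latest unit messages, which include the daemon's own error.
    run journalctl -u "$UNIT" -n 50 --no-pager
}
```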

[ceph-users] Re: mon stucks on probing and out of quorum, after down and restart

2021-09-10 Thread Eugen Block
Yes, give it a try. If the cluster is otherwise healthy, it shouldn't be a problem. Quoting mk: Thx Eugen, so just stop the mon, remove/rename only store.db, and start the mon? BR Max On 10. Sep 2021, at 12:50, Eugen Block wrote: I don't have an explanation, but removing the mon store from th

[ceph-users] Re: mon stucks on probing and out of quorum, after down and restart

2021-09-10 Thread mk
Thx Eugen, so just stop the mon, remove/rename only store.db, and start the mon? BR Max > On 10. Sep 2021, at 12:50, Eugen Block wrote: > > I don't have an explanation, but removing the mon store from the failed mon > has resolved similar issues in the past. Could you give that a try? > > > Quoting
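The stop, rename, start sequence asked about here can be sketched as below. The mon ID comes from the thread; the data path, the `.bak` suffix and the `run` dry-run helper are assumptions for illustration. Note that a later message in this thread reports the daemon failing to start, so a store rebuild may still be needed afterwards. By default the script only prints the commands.

```shell
#!/bin/sh
# Sketch: move the mon store aside and restart the daemon.
# Prints commands unless RUN=1 is set. Paths assume the default layout.
MON_ID="${MON_ID:-amon3}"
MON_DIR="${MON_DIR:-/var/lib/ceph/mon/ceph-$MON_ID}"

run() {
    if [ "${RUN:-0}" = "1" ]; then "$@"; else echo "+ $*"; fi
}

reset_mon_store() {
    run systemctl stop "ceph-mon@$MON_ID"
    # Rename rather than delete, so the old store can be inspected or restored.
    run mv "$MON_DIR/store.db" "$MON_DIR/store.db.bak"
    run systemctl start "ceph-mon@$MON_ID"
}
```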

[ceph-users] Re: mon stucks on probing and out of quorum, after down and restart

2021-09-10 Thread Eugen Block
I don't have an explanation, but removing the mon store from the failed mon has resolved similar issues in the past. Could you give that a try? Quoting mk: Hi CephFolks, I have a cluster 14.2.21-22/Ubuntu 18.04 with 3 mons. After 1 mon (amon3) went down and was restarted, it is stuck on probing