Re: [ceph-users] Mons deadlocked after they all died

2014-04-30 Thread Marc
On 30/04/2014 00:42, Gregory Farnum wrote:
> On Tue, Apr 29, 2014 at 3:28 PM, Marc wrote:
>> Thank you for the help so far! I went for option 1 and that did solve
>> that problem. However quorum has not been restored. Here's the
>> information I can get:
>>
>> mon a+b are in state Electing and hav

Re: [ceph-users] Mons deadlocked after they all died

2014-04-29 Thread Gregory Farnum
On Tue, Apr 29, 2014 at 3:28 PM, Marc wrote:
> Thank you for the help so far! I went for option 1 and that did solve
> that problem. However quorum has not been restored. Here's the
> information I can get:
>
> mon a+b are in state Electing and have been for more than 2 hours now.
> mon c does rep

Re: [ceph-users] Mons deadlocked after they all died

2014-04-29 Thread Marc
Thank you for the help so far! I went for option 1 and that did solve that problem. However quorum has not been restored. Here's the information I can get: mon a+b are in state Electing and have been for more than 2 hours now. mon c does reply to "help" by using the socket, but it does not respond
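
A minimal sketch of how the per-monitor state described above could be checked from the admin socket. It assumes the JSON was obtained with something like `ceph --admin-daemon /var/run/ceph/ceph-mon.a.asok mon_status` (the socket path is the usual default and may differ on a given host), and that the output carries `name`, `rank`, `state` and `quorum` fields. The function name and the sample JSON are illustrative only, not output from this cluster.

```python
import json


def summarize_mon_status(raw):
    """Reduce mon_status JSON text to name, state and quorum membership."""
    status = json.loads(raw)
    rank = status.get("rank")
    quorum = status.get("quorum", [])  # list of ranks currently in quorum
    return {
        "name": status.get("name"),
        "state": status.get("state"),
        "in_quorum": rank in quorum,
    }


# Hypothetical sample resembling a monitor that is stuck electing.
sample = '{"name": "a", "rank": 0, "state": "electing", "quorum": []}'
print(summarize_mon_status(sample))
```

Running this against each monitor's output in turn gives a quick side-by-side view of which monitors are electing and which, if any, consider themselves part of a quorum.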

Re: [ceph-users] Mons deadlocked after they all died

2014-04-29 Thread Gregory Farnum
On Tue, Apr 29, 2014 at 9:48 AM, Marc wrote:
> 'ls' on the respective stores in /var/lib/ceph/mon/ceph.X/store.db
> returns a list of files (i.e. still present), fsck seems fine. I did
> notice that one of the nodes has different contents in the
> /var/lib/ceph/mon/ceph-b/keyring i.e. its key is d

Re: [ceph-users] Mons deadlocked after they all died

2014-04-29 Thread Marc
'ls' on the respective stores in /var/lib/ceph/mon/ceph.X/store.db returns a list of files (i.e. still present), fsck seems fine. I did notice that one of the nodes has different contents in the /var/lib/ceph/mon/ceph-b/keyring i.e. its key is different from the other 2 nodes'. That shouldn't be th
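
The keyring comparison mentioned above can be sketched as follows. This assumes the usual INI-like keyring layout (a `[mon.]` section with a `key = ...` line); the helper names and the placeholder key strings are hypothetical and are not real keys from this cluster. In practice the text for each monitor would be read from its own `/var/lib/ceph/mon/.../keyring` file.

```python
from collections import Counter


def parse_keyring(text):
    """Return {entity: key} from keyring text in the INI-like format."""
    keys = {}
    entity = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("[") and line.endswith("]"):
            entity = line[1:-1]
        elif entity is not None and "=" in line:
            # Split on the first '=' only; base64 keys may end in '='.
            name, _, value = line.partition("=")
            if name.strip() == "key":
                keys[entity] = value.strip()
    return keys


def mismatched_mons(keyrings):
    """Given {mon_id: keyring text}, list mons whose mon. key differs from the majority."""
    found = {mon: parse_keyring(text).get("mon.") for mon, text in keyrings.items()}
    majority, _ = Counter(found.values()).most_common(1)[0]
    return sorted(mon for mon, key in found.items() if key != majority)


# Placeholder keyring contents for three monitors.
keyrings = {
    "a": '[mon.]\n\tkey = AAAA==\n\tcaps mon = "allow *"\n',
    "b": '[mon.]\n\tkey = BBBB==\n\tcaps mon = "allow *"\n',
    "c": '[mon.]\n\tkey = AAAA==\n\tcaps mon = "allow *"\n',
}
print(mismatched_mons(keyrings))
```

The majority vote is only a convenience for spotting the odd one out; it does not say which key is the correct one.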

Re: [ceph-users] Mons deadlocked after they all died

2014-04-29 Thread Gregory Farnum
Monitor keys don't change; I think something else must be going on. Did you remove any of their stores? Are the local filesystems actually correct (fsck)? The ceph-create-keys is a red herring and will stop as soon as the monitors do get into a quorum.
-Greg

On Tuesday, April 29, 2014, Marc wro

[ceph-users] Mons deadlocked after they all died

2014-04-29 Thread Marc
Hi, still working on a troubled ceph cluster running .61.2-1raring consisting of (currently) 4 monitors a,b,c,g with g being a newly added monitor that failed/fails to sync up, so consider that one down. Now mon a and b died because for some (currently unknown) reason linux created a core dump on