Hi Kamila,

Thank you for your response.

I think we solved it yesterday.
I simply removed the mon again, and this time I also removed all references to 
it in ceph.conf (there were some remnants there).
After that I ran ceph-deploy, and it hasn’t crashed again so far.

So in this case it was most likely some leftovers from the old mon in the 
config that fscked things up. I don’t understand why, but it works now that I 
removed all traces of it first and then recreated it. Before that I had 
removed and recreated it a bunch of times as well, but with some leftovers in 
ceph.conf, and that was when it didn’t work.
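
A quick way to check for such leftovers before recreating a mon is to search the config for the old name. This is only a minimal sketch: the function name find_mon_leftovers is made up for illustration, and it assumes the old mon name appears literally in the file.

```shell
# Hypothetical helper: print every line (with its line number) of a
# ceph.conf-style file that still mentions the given mon name.
# -F treats the name as a fixed string, -w matches it as a whole word,
# so searching for "ceph03mon" does not hit the new mon "ceph03".
find_mon_leftovers() {
    conf="$1"
    mon="$2"
    grep -n -w -F -- "$mon" "$conf"
}
```

For example, `find_mon_leftovers /etc/ceph/ceph.conf ceph03mon` should print nothing once all traces of the old mon are gone.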

//Anders

From: Kamila Součková [mailto:kam...@ksp.sk]
Sent: 8 November 2017 13:43
To: Anders Olausson <and...@spacedump.se>
Cc: ceph-users@lists.ceph.com
Subject: Re: [ceph-users] Issue with "renamed" mon, crashing

Hi,

I am not sure if this is the same issue as we had recently, but it looks a bit 
like it -- we also had a Luminous mon crashing right after syncing was done.

Turns out that the current release has a bug which causes the mon to crash if 
it cannot find a mgr daemon. This should be fixed in the upcoming release.

In our case we "solved" it by moving the active mgr to the mon's host. (I am 
not sure how to activate a specific mgr, but it appears that the mgrs get 
activated in FIFO order -- so just keep killing and re-starting the active one 
until a mgr on the mon's host is active).
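
To see which mgr is currently active between restarts, something like the following may help. It is a rough sketch, assuming that `ceph mgr dump` prints JSON containing an "active_name" field; the function name active_mgr_name is made up for illustration.

```shell
# Hypothetical helper: read the JSON printed by "ceph mgr dump" on stdin
# and print the value of its "active_name" field (assumed to be present).
active_mgr_name() {
    sed -n 's/.*"active_name": *"\([^"]*\)".*/\1/p'
}
```

Usage would be `ceph mgr dump | active_mgr_name`. If your release has `ceph mgr fail <name>`, that may be a gentler way to force a failover than killing the daemon.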

Hope this helps!

Kamila

On Mon, Nov 6, 2017 at 12:44 PM Anders Olausson <and...@spacedump.se> wrote:
Hi,

I recently (yesterday) upgraded to Luminous (12.2.1) running on Ubuntu 14.04.5 
LTS.
Upgrade went fine, no issues at all.
However, when I was about to use ceph-deploy to configure some new disks, it 
failed.
After some investigation I figured out that it didn’t like that my mons were 
named differently from their hosts, for example ceph03mon on the host ceph03: 
ceph-deploy gatherkeys ceph03 failed.
So I decided to rename my mons. I started by removing one of them:

# stop ceph-mon id=ceph03mon
# ceph mon remove ceph03mon
# cd /var/lib/ceph/mon/
# mv ceph-ceph03mon disabled-ceph-ceph03mon

Created the new one:

# mkdir tmp
# mkdir ceph-ceph03
# ceph auth get mon. -o tmp/keyring
# ceph mon getmap -o tmp/monmap
# ceph-mon -i ceph03 --mkfs --monmap tmp/monmap --keyring tmp/keyring
# chown -R ceph:ceph ceph-ceph03
# ceph-mon -i ceph03 --public-addr 10.10.1.23:6789
# start ceph-mon id=ceph03

It starts OK and quorum is established, but when it gets a command such as 
“ceph osd pool stat” or “ceph auth list”, it crashes.

Complete log can be found at: 
http://files.spacedump.se/ceph03-monerror-20171106-01.txt
I used the settings below for logging in ceph.conf at the time:

[mon]
       debug mon = 20
       debug paxos = 20
       debug auth = 20

I have now rolled back to the old monitor, and it works as it should, on the 
same box etc. But that is the one upgraded from Hammer -> Jewel -> Luminous.

Any idea what the issue could be?
Thanks.

Best regards
  Anders Olausson
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com