Hello Sir/Madam,

We are facing a serious problem with our Proxmox cluster running Ceph. I have 
already submitted a ticket to Proxmox, but they said the only option left is 
trying to recover the MON DB. We would like to ask whether anyone has 
suggestions for our situation.

So far the only option I see would be to try to recover the MON DB from an 
OSD. But this action is usually a last resort. Since I don't know the outcome, 
the cluster could very well end up dead and all data lost.
https://docs.ceph.com/docs/luminous/rados/troubleshooting/troubleshooting-mon/#recovery-using-osds
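
For reference, the procedure behind that link collects the cluster maps from 
the OSDs with ceph-objectstore-tool and rebuilds the monitor store with 
ceph-monstore-tool. Below is only a condensed sketch for a single host; the 
keyring path and monitor name are placeholders, the OSDs and MONs must be 
stopped, and on a multi-node cluster the mon-store directory has to be carried 
from host to host (e.g. with rsync) so every OSD contributes its maps.

ms=/root/mon-store
mkdir -p $ms

# gather the maps from every (stopped) OSD on this host
for osd in /var/lib/ceph/osd/ceph-*; do
    ceph-objectstore-tool --data-path $osd --op update-mon-db --mon-store-path $ms
done

# with cephx enabled, the keyring passed to rebuild needs mon 'allow *' caps
# for mon. and full caps for client.admin (see the linked guide)
ceph-monstore-tool $ms rebuild -- --keyring /path/to/admin.keyring

# back up the old store and move the rebuilt one into place (repeat per MON)
mv /var/lib/ceph/mon/ceph-cccs01/store.db /var/lib/ceph/mon/ceph-cccs01/store.db.bak
cp -r $ms/store.db /var/lib/ceph/mon/ceph-cccs01/store.db
chown -R ceph:ceph /var/lib/ceph/mon/ceph-cccs01/store.db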

---

> > I would like to set the nodown flag on the cluster to see if the OSDs are 
> > kept in the cluster.
> > The OSDs are joining the cluster but are set as down shortly after.
> >
> OK. Please go ahead.
Sadly this didn't have any effect either.
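
For anyone following along, these flags are set and cleared with the standard 
commands; shown here only as a sketch of what was tried:

ceph osd set nodown          # keep the MONs from marking OSDs down
ceph osd set noout           # keep the MONs from marking OSDs out
ceph osd dump | grep flags   # confirm which flags are currently set
ceph osd unset nodown        # revert after testing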

But I think I found a clue to what might be going on.
# ceph-osd.0.log
2020-03-24 21:22:06.462100 7fb33aab0e00 10 osd.0 0 read_superblock 
sb(e8e81549-91e5-4370-b091-9500f406a2b2 osd.0 
0bb2b9bb-9a70-4d6f-8d4e-3fc5049d63d6 e14334 [13578,14334] lci=[0,14334])
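
To double-check which osdmap epoch an OSD really holds, the map can be 
extracted from a stopped OSD and inspected; a sketch, with /tmp/osdmap as an 
arbitrary output path:

ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-0 --op get-osdmap --file /tmp/osdmap
osdmaptool --print /tmp/osdmap | head   # the epoch appears near the top of the output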

# ceph-mon.cccs01.log
2020-03-24 21:26:48.038345 7f7ef791a700 10 mon.cccs01@0(leader).osd e14299 
e14299: 48 total, 13 up, 35 in
2020-03-24 21:26:48.038351 7f7ef791a700  5 mon.cccs01@0(leader).osd e14299 
can_mark_out current in_ratio 0.729167 < min 0.75, will not mark osds out
2020-03-24 21:26:48.038360 7f7ef791a700 10 mon.cccs01@0(leader).osd e14299 
tick NOOUT flag set, not checking down osds
2020-03-24 21:26:48.038364 7f7ef791a700 10 mon.cccs01@0(leader).osd e14299 
min_last_epoch_clean 0
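
For reference, the current osdmap epoch and the cluster flags can also be read 
from a running monitor with the standard commands (a sketch):

ceph osd stat                       # OSD counts plus flags such as noout,nodown
ceph osd dump | head -n 5           # first lines include the osdmap epoch and flags
ceph daemon mon.cccs01 mon_status   # rank, quorum state and election epoch of this MON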

# ceph-mon.cccs06.log
2020-03-22 22:26:57.056939 7f3c1993a700  1 mon.cccs06@5(peon).osd e14333 
e14333: 48 total, 48 up, 48 in
2020-03-22 22:27:04.113054 7f3c1993a700  0 mon.cccs06@5(peon) e31 
handle_command mon_command({"prefix":"df","format":"json"} v 0) v1
2020-03-22 22:27:04.113086 7f3c1993a700  0 log_channel(audit) log [DBG] : 
from='client.? 10.1.14.8:0/4265796352' entity='client.admin' 
cmd=[{"prefix":"df","format":"json"}]: dispatch
2020-03-22 22:27:09.752027 7f3c1993a700  1 mon.cccs06@5(peon).osd e14334 
e14334: 48 total, 48 up, 48 in
...
2020-03-23 10:42:51.891722 7ff1d9079700  0 mon.cccs06@2(synchronizing).osd 
e14269 crush map has features 288514051259236352, adjusting msgr requires
2020-03-23 10:42:51.891729 7ff1d9079700  0 mon.cccs06@2(synchronizing).osd 
e14269 crush map has features 288514051259236352, adjusting msgr requires
2020-03-23 10:42:51.891730 7ff1d9079700  0 mon.cccs06@2(synchronizing).osd 
e14269 crush map has features 1009089991638532096, adjusting msgr requires
2020-03-23 10:42:51.891732 7ff1d9079700  0 mon.cccs06@2(synchronizing).osd 
e14269 crush map has features 288514051259236352, adjusting msgr requires
It seems that the OSDs have an epoch of e14334, but the MONs only seem to 
hold e14269 for the osdmap. I could only find e14334 in the ceph-mon.cccs06 
log. cccs06 was the last MON standing (that node also reset), but when the 
cluster came back, the MONs with the older epoch came up first and joined.
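
If it helps to confirm this, the osdmap range held in each monitor's store can 
be inspected directly with ceph-monstore-tool while that MON is stopped. A 
sketch; the exact key names printed by dump-keys may differ between releases:

ceph-monstore-tool /var/lib/ceph/mon/ceph-cccs06 dump-keys | grep osdmap | tail
ceph-monstore-tool /var/lib/ceph/mon/ceph-cccs01 dump-keys | grep osdmap | tail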

The syslog shows that these were the last log entries written, so the nodes 
reset shortly after. This would fit with cccs06 being the last MON alive.
# cccs01
Mar 22 22:16:09 cccs01 pmxcfs[2502]: [dcdb] notice: leader is 1/2502
Mar 22 22:16:09 cccs01 pmxcfs[2502]: [dcdb] notice: synced members: 1/2502, 
5/2219

# cccs02
Mar 22 22:15:57 cccs02 pmxcfs[2514]: [dcdb] notice: we (3/2514) left the 
process group
Mar 22 22:15:57 cccs02 pmxcfs[2514]: [dcdb] crit: leaving CPG group

# cccs06
Mar 22 22:31:16 cccs06 pmxcfs[2662]: [status] no
Mar 22 22:34:01 cccs06 systemd-modules-load[773]: Inserted module 'iscsi_tcp'
Mar 22 22:34:01 cccs06 systemd-modules-load[773]: Inserted module 'ib_iser'

There must have been some issue prior to the reset, as I found these error 
messages. They could explain why no newer epoch was written anymore.
# cccs01
2020-03-22 22:22:15.661060 7fc09c85c100 -1 rocksdb: IO error: 
/var/lib/ceph/mon/ceph-cccs01/store.db/LOCK: Permission denied
2020-03-22 22:22:15.661067 7fc09c85c100 -1 error opening mon data directory at 
'/var/lib/ceph/mon/ceph-cccs01': (22) Invalid argument

# cccs02
2020-03-22 22:31:11.209524 7fd034786100 -1 rocksdb: IO error: 
/var/lib/ceph/mon/ceph-cccs02/store.db/LOCK: Permission denied
2020-03-22 22:31:11.209541 7fd034786100 -1 error opening mon data directory at 
'/var/lib/ceph/mon/ceph-cccs02': (22) Invalid argument

# cccs06
no such entries.
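
Those "Permission denied" errors on store.db/LOCK look like the mon data 
directories ended up with the wrong ownership. A hedged sketch of what I would 
check first (only chown if the directories really turn out to be owned by root 
rather than ceph):

ls -ld /var/lib/ceph/mon/ceph-cccs01 /var/lib/ceph/mon/ceph-cccs01/store.db
ls -l /var/lib/ceph/mon/ceph-cccs01/store.db/LOCK
# only if owned by root instead of ceph:ceph
chown -R ceph:ceph /var/lib/ceph/mon/ceph-cccs01
systemctl restart ceph-mon@cccs01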

From my point of view, the remaining question is: how do we get the newer 
epoch from the OSDs into the MON DB?
I have no answer to this yet.
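
One possibility, since cccs06 apparently still holds the newest epoch (e14334), 
might be to rebuild the quorum around that monitor via the monmap surgery 
described in the same troubleshooting guide: extract the monmap from cccs06, 
drop the stale monitors from it, inject it back and let cccs06 form a quorum 
of one before re-adding the others. Only a sketch, and every store.db should 
be backed up first:

systemctl stop ceph-mon@cccs06
ceph-mon -i cccs06 --extract-monmap /tmp/monmap
monmaptool --print /tmp/monmap                  # check the current members
monmaptool /tmp/monmap --rm cccs01 --rm cccs02  # remove the stale MONs (repeat as needed)
ceph-mon -i cccs06 --inject-monmap /tmp/monmap
systemctl start ceph-mon@cccs06                 # cccs06 should now come up on its own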


Best Regards,
Parker Lau
ReadySpace Ltd - Cloud and Managed Hosting Professionals
Direct: +852 3726 1120
Hotline: +852 3568 3372
Fax: +852 3568 3376
Website: www.readyspace.com.hk
Helpdesk: helpdesk.readyspace.com
Get the latest update and promotion here:
twitter.com/readyspace | facebook.com/readyspace

_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
