Hello community,
we have deployed a cluster on the latest Mimic release. We are on quite old
hardware, but using CentOS 7. Monitor, manager and the rest are all on the
same host.
The cluster has been running for some weeks without any actual workload.
There might have been some sort of power failure (not proven), but at some
point the monitor node died and won't start anymore. Below is a log from
/var/log/messages. What can be done here? Can this be recovered somehow,
or did we lose everything? All the OSDs seem to be running fine; it is just
that the cluster is not working.
The log is not complete, but I think these lines are the critical ones.
Jun 27 17:14:06 mds1 ceph-mon: -311> 2019-06-27 17:14:06.169
7f086aa22700 -1 rocksdb: submit_common error: Corruption: block
checksum mismatch: expected 3317957558, got 2609532897 in
/var/lib/ceph/mon/ceph-mds1/store.db/022334.sst offset 12775887 size
21652 code = 2 Rocksdb transaction:
Jun 27 17:14:06 mds1 ceph-mon: Put( Prefix = p key =
'xos'0x006c6173't_committed' Value size = 8)
Jun 27 17:14:06 mds1 ceph-mon: Put( Prefix = m key =
'nitor_store'0x006c6173't_metadata' Value size = 612)
Jun 27 17:14:06 mds1 ceph-mon: Put( Prefix = l key =
'gm'0x0066756c'l_155850' Value size = 31307)
Jun 27 17:14:06 mds1 ceph-mon: Put( Prefix = l key =
'gm'0x0066756c'l_latest' Value size = 8)
Jun 27 17:14:06 mds1 ceph-mon: Put( Prefix = l key = 'gm'0x00313535'851'
Value size = 672)
Jun 27 17:14:06 mds1 ceph-mon: Put( Prefix = l key =
'gm'0x006c6173't_committed' Value size = 8)
Jun 27 17:14:06 mds1 ceph-mon: -311> 2019-06-27 17:14:06.172
7f086aa22700 -1
/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/13.2.6/rpm/el7/BUILD/ceph-13.2.6/src/mon/MonitorDBStore.h:
In function
'int MonitorDBStore::apply_transaction(MonitorDBStore::TransactionRef)'
thread 7f086aa22700 time 2019-06-27 17:14:06.171474
Jun 27 17:14:06 mds1 ceph-mon:
/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/13.2.6/rpm/el7/BUILD/ceph-13.2.6/src/mon/MonitorDBStore.h:
311: FAILED assert(0 == "failed to write to db")
Jun 27 17:14:06 mds1 ceph-mon: ceph version 13.2.6
(7b695f835b03642f85998b2ae7b6dd093d9fbce4) mimic (stable)