Dear all,
I am hitting an unrecoverable crash on one specific OSD every time I try to
restart it. It first happened on firefly 0.80.8; I updated to 0.80.10,
but the crash persists.
Because of this failure, several PGs are stuck down+peering and do not
recover even after marking the OSD out.
Could someone help me? Is it possible to edit or rebuild the leveldb-based
log that seems to be causing the problem?
Here is what the log file shows:
[(12:54:45) root@spcsnp2 ~]# service ceph start osd.31
=== osd.31 ===
create-or-move updated item name 'osd.31' weight 2.73 at location
{host=spcsnp2,root=default} to crush map
Starting Ceph osd.31 on spcsnp2...
starting osd.31 at :/0 osd_data /var/lib/ceph/osd/ceph-31
/var/lib/ceph/osd/ceph-31/journal
2015-08-07 12:55:12.916880 7fd614c8f780 0 ceph version 0.80.10
(ea6c958c38df1216bf95c927f143d8b13c4a9e70), process ceph-osd, pid 23260
[(12:55:12) root@spcsnp2 ~]# 2015-08-07 12:55:12.928614 7fd614c8f780 0
filestore(/var/lib/ceph/osd/ceph-31) mount detected xfs (libxfs)
2015-08-07 12:55:12.928622 7fd614c8f780 1
filestore(/var/lib/ceph/osd/ceph-31) disabling 'filestore replica
fadvise' due to known issues with fadvise(DONTNEED) on xfs
2015-08-07 12:55:12.931410 7fd614c8f780 0
genericfilestorebackend(/var/lib/ceph/osd/ceph-31) detect_features:
FIEMAP ioctl is supported and appears to work
2015-08-07 12:55:12.931419 7fd614c8f780 0
genericfilestorebackend(/var/lib/ceph/osd/ceph-31) detect_features:
FIEMAP ioctl is disabled via 'filestore fiemap' config option
2015-08-07 12:55:12.939290 7fd614c8f780 0
genericfilestorebackend(/var/lib/ceph/osd/ceph-31) detect_features:
syscall(SYS_syncfs, fd) fully supported
2015-08-07 12:55:12.939326 7fd614c8f780 0
xfsfilestorebackend(/var/lib/ceph/osd/ceph-31) detect_feature: extsize
is disabled by conf
2015-08-07 12:55:45.587019 7fd614c8f780 -1 *** Caught signal (Aborted) **
in thread 7fd614c8f780
ceph version 0.80.10 (ea6c958c38df1216bf95c927f143d8b13c4a9e70)
1: /usr/bin/ceph-osd() [0xab7562]
2: (()+0xf030) [0x7fd6141ce030]
3: (gsignal()+0x35) [0x7fd612d41475]
4: (abort()+0x180) [0x7fd612d446f0]
5: (__gnu_cxx::__verbose_terminate_handler()+0x11d) [0x7fd61359689d]
6: (()+0x63996) [0x7fd613594996]
7: (()+0x639c3) [0x7fd6135949c3]
8: (()+0x63bee) [0x7fd613594bee]
9: (tc_new()+0x48e) [0x7fd614414aee]
10: (std::string::_Rep::_S_create(unsigned long, unsigned long,
std::allocator<char> const&)+0x59) [0x7fd6135f0999]
11: (std::string::_Rep::_M_clone(std::allocator<char> const&, unsigned
long)+0x28) [0x7fd6135f1708]
12: (std::string::reserve(unsigned long)+0x30) [0x7fd6135f17f0]
13: (std::string::append(char const*, unsigned long)+0xb5)
[0x7fd6135f1ab5]
14: (leveldb::log::Reader::ReadRecord(leveldb::Slice*,
std::string*)+0x2a2) [0x7fd614670fa2]
15: (leveldb::DBImpl::RecoverLogFile(unsigned long,
leveldb::VersionEdit*, unsigned long*)+0x180) [0x7fd614669360]
16: (leveldb::DBImpl::Recover(leveldb::VersionEdit*)+0x5c2)
[0x7fd61466bdf2]
17: (leveldb::DB::Open(leveldb::Options const&, std::string const&,
leveldb::DB**)+0xff) [0x7fd61466c11f]
18: (LevelDBStore::do_open(std::ostream&, bool)+0xd8) [0xa123a8]
19: (FileStore::mount()+0x18e0) [0x9b7080]
20: (OSD::do_convertfs(ObjectStore*)+0x1a) [0x78f52a]
21: (main()+0x2234) [0x7331c4]
22: (__libc_start_main()+0xfd) [0x7fd612d2dead]
23: /usr/bin/ceph-osd() [0x736e99]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is
needed to interpret this.
--- begin dump of recent events ---
-56> 2015-08-07 12:55:12.915675 7fd614c8f780 5 asok(0x1a20230)
register_command perfcounters_dump hook 0x1a10010
-55> 2015-08-07 12:55:12.915697 7fd614c8f780 5 asok(0x1a20230)
register_command 1 hook 0x1a10010
-54> 2015-08-07 12:55:12.915700 7fd614c8f780 5 asok(0x1a20230)
register_command perf dump hook 0x1a10010
-53> 2015-08-07 12:55:12.915704 7fd614c8f780 5 asok(0x1a20230)
register_command perfcounters_schema hook 0x1a10010
-52> 2015-08-07 12:55:12.915706 7fd614c8f780 5 asok(0x1a20230)
register_command 2 hook 0x1a10010
-51> 2015-08-07 12:55:12.915709 7fd614c8f780 5 asok(0x1a20230)
register_command perf schema hook 0x1a10010
-50> 2015-08-07 12:55:12.915711 7fd614c8f780 5 asok(0x1a20230)
register_command config show hook 0x1a10010
-49> 2015-08-07 12:55:12.915714 7fd614c8f780 5 asok(0x1a20230)
register_command config set hook 0x1a10010
-48> 2015-08-07 12:55:12.915716 7fd614c8f780 5 asok(0x1a20230)
register_command config get hook 0x1a10010
-47> 2015-08-07 12:55:12.915718 7fd614c8f780 5 asok(0x1a20230)
register_command log flush hook 0x1a10010
-46> 2015-08-07 12:55:12.915721 7fd614c8f780 5 asok(0x1a20230)
register_command log dump hook 0x1a10010
-45> 2015-08-07 12:55:12.915723 7fd614c8f780 5 asok(0x1a20230)
register_command log reopen hook 0x1a10010
-44> 2015-08-07 12:55:12.916880 7fd614c8f780 0 ceph version 0.80.10
(ea6c958c38df1216bf95c927f143d8b13c4a9e70), process ceph-osd, pid 23260
-43> 2015-08-07 12:55:12.918156 7fd614c8f780 1 -- 10.17.0.6:0/0
learned my addr 10.17.0.6:0/0
-42> 2015-08-07 12:55:12.918164 7fd614c8f780 1
accepter.accepter.bind my_inst.addr is 10.17.0.6:6812/23260 need_addr=0
-41> 2015-08-07 12:55:12.918178 7fd614c8f780 1 -- 10.18.0.6:0/0
learned my addr 10.18.0.6:0/0
-40> 2015-08-07 12:55:12.918180 7fd614c8f780 1
accepter.accepter.bind my_inst.addr is 10.18.0.6:6810/23260 need_addr=0
-39> 2015-08-07 12:55:12.918191 7fd614c8f780 1 -- 10.18.0.6:0/0
learned my addr 10.18.0.6:0/0
-38> 2015-08-07 12:55:12.918192 7fd614c8f780 1
accepter.accepter.bind my_inst.addr is 10.18.0.6:6811/23260 need_addr=0
-37> 2015-08-07 12:55:12.918202 7fd614c8f780 1 -- 10.17.0.6:0/0
learned my addr 10.17.0.6:0/0
-36> 2015-08-07 12:55:12.918204 7fd614c8f780 1
accepter.accepter.bind my_inst.addr is 10.17.0.6:6815/23260 need_addr=0
-35> 2015-08-07 12:55:12.918214 7fd614c8f780 1 -- 10.17.0.6:0/0
learned my addr 10.17.0.6:0/0
-34> 2015-08-07 12:55:12.918216 7fd614c8f780 1
accepter.accepter.bind my_inst.addr is 10.17.0.6:6816/23260 need_addr=0
-33> 2015-08-07 12:55:12.925154 7fd614c8f780 1 finished
global_init_daemonize
-32> 2015-08-07 12:55:12.927746 7fd614c8f780 5 asok(0x1a20230) init
/var/run/ceph/ceph-osd.31.asok
-31> 2015-08-07 12:55:12.927760 7fd614c8f780 5 asok(0x1a20230)
bind_and_listen /var/run/ceph/ceph-osd.31.asok
-30> 2015-08-07 12:55:12.927828 7fd614c8f780 5 asok(0x1a20230)
register_command 0 hook 0x1a0e0b0
-29> 2015-08-07 12:55:12.927837 7fd614c8f780 5 asok(0x1a20230)
register_command version hook 0x1a0e0b0
-28> 2015-08-07 12:55:12.927840 7fd614c8f780 5 asok(0x1a20230)
register_command git_version hook 0x1a0e0b0
-27> 2015-08-07 12:55:12.927843 7fd614c8f780 5 asok(0x1a20230)
register_command help hook 0x1a100b0
-26> 2015-08-07 12:55:12.927845 7fd614c8f780 5 asok(0x1a20230)
register_command get_command_descriptions hook 0x1a10150
-25> 2015-08-07 12:55:12.927861 7fd61094c700 5 asok(0x1a20230)
entry start
-24> 2015-08-07 12:55:12.928614 7fd614c8f780 0
filestore(/var/lib/ceph/osd/ceph-31) mount detected xfs (libxfs)
-23> 2015-08-07 12:55:12.928622 7fd614c8f780 1
filestore(/var/lib/ceph/osd/ceph-31) disabling 'filestore replica
fadvise' due to known issues with fadvise(DONTNEED) on xfs
-22> 2015-08-07 12:55:12.931410 7fd614c8f780 0
genericfilestorebackend(/var/lib/ceph/osd/ceph-31) detect_features:
FIEMAP ioctl is supported and appears to work
-21> 2015-08-07 12:55:12.931419 7fd614c8f780 0
genericfilestorebackend(/var/lib/ceph/osd/ceph-31) detect_features:
FIEMAP ioctl is disabled via 'filestore fiemap' config option
-20> 2015-08-07 12:55:12.939290 7fd614c8f780 0
genericfilestorebackend(/var/lib/ceph/osd/ceph-31) detect_features:
syscall(SYS_syncfs, fd) fully supported
-19> 2015-08-07 12:55:12.939326 7fd614c8f780 0
xfsfilestorebackend(/var/lib/ceph/osd/ceph-31) detect_feature: extsize
is disabled by conf
-18> 2015-08-07 12:55:16.785686 7fd61094c700 5 asok(0x1a20230)
AdminSocket: request 'get_command_descriptions' '' to 0x1a10150 returned
1164 bytes
-17> 2015-08-07 12:55:16.788515 7fd61094c700 1 do_command 'config
get' 'format:json var:fsid
-16> 2015-08-07 12:55:16.788546 7fd61094c700 1 do_command 'config
get' 'format:json var:fsid result is 47 bytes
-15> 2015-08-07 12:55:16.788549 7fd61094c700 5 asok(0x1a20230)
AdminSocket: request 'config get' '' to 0x1a10010 returned 47 bytes
-14> 2015-08-07 12:55:16.788748 7fd61094c700 5 asok(0x1a20230)
AdminSocket: request 'get_command_descriptions' '' to 0x1a10150 returned
1164 bytes
-13> 2015-08-07 12:55:16.790540 7fd61094c700 5 asok(0x1a20230)
AdminSocket: request 'version' '' to 0x1a0e0b0 returned 21 bytes
-12> 2015-08-07 12:55:26.022803 7fd61094c700 5 asok(0x1a20230)
AdminSocket: request 'get_command_descriptions' '' to 0x1a10150 returned
1164 bytes
-11> 2015-08-07 12:55:26.025710 7fd61094c700 1 do_command 'config
get' 'format:json var:fsid
-10> 2015-08-07 12:55:26.025725 7fd61094c700 1 do_command 'config
get' 'format:json var:fsid result is 47 bytes
-9> 2015-08-07 12:55:26.025727 7fd61094c700 5 asok(0x1a20230)
AdminSocket: request 'config get' '' to 0x1a10010 returned 47 bytes
-8> 2015-08-07 12:55:26.025883 7fd61094c700 5 asok(0x1a20230)
AdminSocket: request 'get_command_descriptions' '' to 0x1a10150 returned
1164 bytes
-7> 2015-08-07 12:55:26.027690 7fd61094c700 5 asok(0x1a20230)
AdminSocket: request 'version' '' to 0x1a0e0b0 returned 21 bytes
-6> 2015-08-07 12:55:36.291878 7fd61094c700 5 asok(0x1a20230)
AdminSocket: request 'get_command_descriptions' '' to 0x1a10150 returned
1164 bytes
-5> 2015-08-07 12:55:36.294711 7fd61094c700 1 do_command 'config
get' 'format:json var:fsid
-4> 2015-08-07 12:55:36.294729 7fd61094c700 1 do_command 'config
get' 'format:json var:fsid result is 47 bytes
-3> 2015-08-07 12:55:36.294732 7fd61094c700 5 asok(0x1a20230)
AdminSocket: request 'config get' '' to 0x1a10010 returned 47 bytes
-2> 2015-08-07 12:55:36.294936 7fd61094c700 5 asok(0x1a20230)
AdminSocket: request 'get_command_descriptions' '' to 0x1a10150 returned
1164 bytes
-1> 2015-08-07 12:55:36.296827 7fd61094c700 5 asok(0x1a20230)
AdminSocket: request 'version' '' to 0x1a0e0b0 returned 21 bytes
0> 2015-08-07 12:55:45.587019 7fd614c8f780 -1 *** Caught signal
(Aborted) **
in thread 7fd614c8f780
ceph version 0.80.10 (ea6c958c38df1216bf95c927f143d8b13c4a9e70)
1: /usr/bin/ceph-osd() [0xab7562]
2: (()+0xf030) [0x7fd6141ce030]
3: (gsignal()+0x35) [0x7fd612d41475]
4: (abort()+0x180) [0x7fd612d446f0]
5: (__gnu_cxx::__verbose_terminate_handler()+0x11d) [0x7fd61359689d]
6: (()+0x63996) [0x7fd613594996]
7: (()+0x639c3) [0x7fd6135949c3]
8: (()+0x63bee) [0x7fd613594bee]
9: (tc_new()+0x48e) [0x7fd614414aee]
10: (std::string::_Rep::_S_create(unsigned long, unsigned long,
std::allocator<char> const&)+0x59) [0x7fd6135f0999]
11: (std::string::_Rep::_M_clone(std::allocator<char> const&, unsigned
long)+0x28) [0x7fd6135f1708]
12: (std::string::reserve(unsigned long)+0x30) [0x7fd6135f17f0]
13: (std::string::append(char const*, unsigned long)+0xb5)
[0x7fd6135f1ab5]
14: (leveldb::log::Reader::ReadRecord(leveldb::Slice*,
std::string*)+0x2a2) [0x7fd614670fa2]
15: (leveldb::DBImpl::RecoverLogFile(unsigned long,
leveldb::VersionEdit*, unsigned long*)+0x180) [0x7fd614669360]
16: (leveldb::DBImpl::Recover(leveldb::VersionEdit*)+0x5c2)
[0x7fd61466bdf2]
17: (leveldb::DB::Open(leveldb::Options const&, std::string const&,
leveldb::DB**)+0xff) [0x7fd61466c11f]
18: (LevelDBStore::do_open(std::ostream&, bool)+0xd8) [0xa123a8]
19: (FileStore::mount()+0x18e0) [0x9b7080]
20: (OSD::do_convertfs(ObjectStore*)+0x1a) [0x78f52a]
21: (main()+0x2234) [0x7331c4]
22: (__libc_start_main()+0xfd) [0x7fd612d2dead]
23: /usr/bin/ceph-osd() [0x736e99]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is
needed to interpret this.
--- logging levels ---
0/ 5 none
0/ 1 lockdep
0/ 1 context
1/ 1 crush
1/ 5 mds
1/ 5 mds_balancer
1/ 5 mds_locker
1/ 5 mds_log
1/ 5 mds_log_expire
1/ 5 mds_migrator
0/ 1 buffer
0/ 1 timer
0/ 1 filer
0/ 1 striper
0/ 1 objecter
0/ 5 rados
0/ 5 rbd
0/ 5 journaler
0/ 5 objectcacher
0/ 5 client
0/ 5 osd
0/ 5 optracker
0/ 5 objclass
1/ 3 filestore
1/ 3 keyvaluestore
1/ 3 journal
0/ 5 ms
1/ 5 mon
0/10 monc
1/ 5 paxos
0/ 5 tp
1/ 5 auth
1/ 5 crypto
1/ 1 finisher
1/ 5 heartbeatmap
1/ 5 perfcounter
1/ 5 rgw
1/10 civetweb
1/ 5 javaclient
1/ 5 asok
1/ 1 throttle
-2/-2 (syslog threshold)
-1/-1 (stderr threshold)
max_recent 10000
max_new 1000
log_file /var/log/ceph/ceph-osd.31.log
--- end dump of recent events ---
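
The backtrace above aborts inside leveldb::log::Reader::ReadRecord while
appending to a std::string, which suggests (but does not prove) that one
logical record in the LevelDB write-ahead log is being reassembled to a size
the allocator cannot satisfy. As a rough way to check this without starting
the OSD, the sketch below walks a LevelDB *.log file using the documented log
layout (32 KiB blocks, 7-byte record header of CRC, length and type) and
reports how large each reassembled record would be. The function names are
my own, CRC verification is skipped, and this only reads the file; it is a
diagnostic sketch, not a repair tool.

```python
import struct

BLOCK_SIZE = 32768          # LevelDB log block size
HEADER_SIZE = 7             # 4-byte CRC, 2-byte length, 1-byte type
FULL, FIRST, MIDDLE, LAST = 1, 2, 3, 4


def scan_log(data):
    """Yield (offset, type, length) for each physical record in a log image.

    Hypothetical helper: the CRC is read but NOT verified.
    """
    pos = 0
    while pos < len(data):
        left_in_block = BLOCK_SIZE - (pos % BLOCK_SIZE)
        if left_in_block < HEADER_SIZE:
            pos += left_in_block        # block trailer padding, skip it
            continue
        if pos + HEADER_SIZE > len(data):
            break                       # truncated header at end of file
        _crc, length, rtype = struct.unpack_from("<IHB", data, pos)
        if HEADER_SIZE + length > left_in_block:
            # The length field points past the block: report and skip the block.
            yield (pos, "bad-length", length)
            pos += left_in_block
            continue
        yield (pos, rtype, length)
        pos += HEADER_SIZE + length


def logical_records(data):
    """Return (start_offset, total_payload_bytes, complete) per logical record."""
    out = []
    start, total = None, 0
    for off, rtype, length in scan_log(data):
        if rtype == FULL:
            out.append((off, length, True))
            start, total = None, 0
        elif rtype == FIRST:
            start, total = off, length
        elif rtype == MIDDLE and start is not None:
            total += length
        elif rtype == LAST and start is not None:
            out.append((start, total + length, True))
            start, total = None, 0
    if start is not None:
        # A FIRST (and maybe MIDDLE) fragment chain that never reached LAST.
        out.append((start, total, False))
    return out
```

To use it, read a copy of the log into memory (for example
open(path, "rb").read(), where path is a copied NNNNNN.log from the OSD's
LevelDB directory; the exact location under /var/lib/ceph/osd/ceph-31 is an
assumption I have not confirmed here) and look for entries in
logical_records() that are very large or marked incomplete.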
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com