Hi,  several osds were down/out with similar logs as below, could you help?

  -38> 2014-06-05 10:27:54.700832 7f2ceead6700  1 -- 192.168.40.11:6800/19542 
<== osd.11 192.168.40.11:6822/20298 2 ==== pg_notify(0.aa4(2) epoch 7) v5 ==== 
812+0+0 (3873498789 0 0) 0x57a0540 con 0x49d14a0
   -37> 2014-06-05 10:27:54.700874 7f2ceead6700  5 -- op tracker -- , seq: 
2028, time: 2014-06-05 10:27:54.396463, event: header_read, request: 
pg_notify(0.aa4(2) epoch 7) v5
   -36> 2014-06-05 10:27:54.701091 7f2ceead6700  5 -- op tracker -- , seq: 
2028, time: 2014-06-05 10:27:54.396465, event: throttled, request: 
pg_notify(0.aa4(2) epoch 7) v5
   -35> 2014-06-05 10:27:54.701126 7f2ceead6700  5 -- op tracker -- , seq: 
2028, time: 2014-06-05 10:27:54.396543, event: all_read, request: 
pg_notify(0.aa4(2) epoch 7) v5
   -34> 2014-06-05 10:27:54.701187 7f2ceead6700  5 -- op tracker -- , seq: 
2028, time: 2014-06-05 10:27:54.700870, event: dispatched, request: 
pg_notify(0.aa4(2) epoch 7) v5
   -33> 2014-06-05 10:27:54.701235 7f2ceead6700  5 -- op tracker -- , seq: 
2028, time: 2014-06-05 10:27:54.701234, event: waiting_for_osdmap, request: 
pg_notify(0.aa4(2) epoch 7) v5
   -32> 2014-06-05 10:27:54.701267 7f2ceead6700  5 -- op tracker -- , seq: 
2028, time: 2014-06-05 10:27:54.701267, event: started, request: 
pg_notify(0.aa4(2) epoch 7) v5
   -31> 2014-06-05 10:27:54.700940 7f2ce92cb700  5 osd.0 pg_epoch: 7 pg[0.1dc1( 
empty local-les=4 n=0 ec=1 les/c 4/4 7/7/7) [69,56,21] r=-1 lpr=7 pi=3-6/1 
crt=0'0 inactive NOTIFY] exit Start 0.003321 0 0.000000
   -30> 2014-06-05 10:27:54.702163 7f2ceead6700  5 -- op tracker -- , seq: 
2028, time: 2014-06-05 10:27:54.702163, event: done, request: 
pg_notify(0.aa4(2) epoch 7) v5
   -29> 2014-06-05 10:27:54.701690 7f2ce92cb700  5 osd.0 pg_epoch: 7 pg[0.1dc1( 
empty local-les=4 n=0 ec=1 les/c 4/4 7/7/7) [69,56,21] r=-1 lpr=7 pi=3-6/1 
crt=0'0 inactive NOTIFY] enter Started/Stray
   -28> 2014-06-05 10:27:54.697805 7f2ce8aca700  5 osd.0 pg_epoch: 7 pg[1.98( 
empty local-les=4 n=0 ec=1 les/c 4/4 3/3/3) [0,100,64] r=0 lpr=3 crt=0'0 mlcod 
0'0 active] exit Started/Primary/Active 7.527967 0 0.000000
   -27> 2014-06-05 10:27:54.702437 7f2ce8aca700  5 osd.0 pg_epoch: 7 pg[1.98( 
empty local-les=4 n=0 ec=1 les/c 4/4 3/3/3) [0,100,64] r=0 lpr=3 crt=0'0 mlcod 
0'0 active] exit Started/Primary 8.986091 0 0.000000
   -26> 2014-06-05 10:27:54.702467 7f2ce8aca700  5 osd.0 pg_epoch: 7 pg[1.98( 
empty local-les=4 n=0 ec=1 les/c 4/4 3/3/3) [0,100,64] r=0 lpr=3 crt=0'0 mlcod 
0'0 active] exit Started 8.986256 0 0.000000
   -25> 2014-06-05 10:27:54.702486 7f2ce8aca700  5 osd.0 pg_epoch: 7 pg[1.98( 
empty local-les=4 n=0 ec=1 les/c 4/4 3/3/3) [0,100,64] r=0 lpr=3 crt=0'0 mlcod 
0'0 active] enter Reset
   -24> 2014-06-05 10:27:54.702822 7f2ce8aca700  5 osd.0 pg_epoch: 7 pg[1.98( 
empty local-les=4 n=0 ec=1 les/c 4/4 7/7/7) [54,95,68] r=-1 lpr=7 pi=3-6/1 
crt=0'0 inactive NOTIFY] exit Reset 0.000336 1 0.005574
   -23> 2014-06-05 10:27:54.702900 7f2ce8aca700  5 osd.0 pg_epoch: 7 pg[1.98( 
empty local-les=4 n=0 ec=1 les/c 4/4 7/7/7) [54,95,68] r=-1 lpr=7 pi=3-6/1 
crt=0'0 inactive NOTIFY] enter Started
   -22> 2014-06-05 10:27:54.703025 7f2ce8aca700  5 osd.0 pg_epoch: 7 pg[1.98( 
empty local-les=4 n=0 ec=1 les/c 4/4 7/7/7) [54,95,68] r=-1 lpr=7 pi=3-6/1 
crt=0'0 inactive NOTIFY] enter Start
   -21> 2014-06-05 10:27:54.703235 7f2ce8aca700  1 osd.0 pg_epoch: 7 pg[1.98( 
empty local-les=4 n=0 ec=1 les/c 4/4 7/7/7) [54,95,68] r=-1 lpr=7 pi=3-6/1 
crt=0'0 inactive NOTIFY] state<Start>: transitioning to Stray
   -20> 2014-06-05 10:27:54.704919 7f2ce8aca700  5 osd.0 pg_epoch: 7 pg[1.98( 
empty local-les=4 n=0 ec=1 les/c 4/4 7/7/7) [54,95,68] r=-1 lpr=7 pi=3-6/1 
crt=0'0 inactive NOTIFY] exit Start 0.001894 0 0.000000
   -19> 2014-06-05 10:27:54.704835 7f2ceead6700  1 -- 192.168.40.11:6800/19542 
<== osd.15 192.168.40.11:6830/20581 1 ==== pg_notify(2.60(2) epoch 7) v5 ==== 
812+0+0 (3557830937 0 0) 0x4a6f540 con 0x3e3bb20
   -18> 2014-06-05 10:27:54.705050 7f2ceead6700  5 -- op tracker -- , seq: 
2029, time: 2014-06-05 10:27:54.598458, event: header_read, request: 
pg_notify(2.60(2) epoch 7) v5
   -17> 2014-06-05 10:27:54.705079 7f2ceead6700  5 -- op tracker -- , seq: 
2029, time: 2014-06-05 10:27:54.598461, event: throttled, request: 
pg_notify(2.60(2) epoch 7) v5
   -16> 2014-06-05 10:27:54.705124 7f2ce8aca700  5 osd.0 pg_epoch: 7 pg[1.98( 
empty local-les=4 n=0 ec=1 les/c 4/4 7/7/7) [54,95,68] r=-1 lpr=7 pi=3-6/1 
crt=0'0 inactive NOTIFY] enter Started/Stray
   -15> 2014-06-05 10:27:54.705353 7f2ceead6700  5 -- op tracker -- , seq: 
2029, time: 2014-06-05 10:27:54.598689, event: all_read, request: 
pg_notify(2.60(2) epoch 7) v5
   -14> 2014-06-05 10:27:54.706925 7f2ceead6700  5 -- op tracker -- , seq: 
2029, time: 2014-06-05 10:27:54.705046, event: dispatched, request: 
pg_notify(2.60(2) epoch 7) v5
   -13> 2014-06-05 10:27:54.707038 7f2ceead6700  5 -- op tracker -- , seq: 
2029, time: 2014-06-05 10:27:54.707037, event: waiting_for_osdmap, request: 
pg_notify(2.60(2) epoch 7) v5
   -12> 2014-06-05 10:27:54.707075 7f2ceead6700  5 -- op tracker -- , seq: 
2029, time: 2014-06-05 10:27:54.707075, event: started, request: 
pg_notify(2.60(2) epoch 7) v5
   -11> 2014-06-05 10:27:54.708042 7f2ceead6700  5 osd.0 pg_epoch: 7 
pg[2.60(unlocked)] enter Initial
   -10> 2014-06-05 10:27:54.709885 7f2ceead6700  5 osd.0 pg_epoch: 7 pg[2.60( 
empty local-les=0 n=0 ec=1 les/c 4/5 7/7/7) [0,117,49] r=0 lpr=0 pi=1-6/2 
crt=0'0 mlcod 0'0 inactive] exit Initial 0.001842 0 0.000000
    -9> 2014-06-05 10:27:54.711556 7f2ceead6700  5 osd.0 pg_epoch: 7 pg[2.60( 
empty local-les=0 n=0 ec=1 les/c 4/5 7/7/7) [0,117,49] r=0 lpr=0 pi=1-6/2 
crt=0'0 mlcod 0'0 inactive] enter Reset
    -8> 2014-06-05 10:27:54.711606 7f2ceead6700  5 osd.0 pg_epoch: 7 pg[2.60( 
empty local-les=0 n=0 ec=1 les/c 4/5 7/7/7) [0,117,49] r=0 lpr=7 pi=1-6/2 
crt=0'0 mlcod 0'0 inactive] exit Reset 0.000051 1 0.001718
    -7> 2014-06-05 10:27:54.711648 7f2ceead6700  5 osd.0 pg_epoch: 7 pg[2.60( 
empty local-les=0 n=0 ec=1 les/c 4/5 7/7/7) [0,117,49] r=0 lpr=7 pi=1-6/2 
crt=0'0 mlcod 0'0 inactive] enter Started
    -6> 2014-06-05 10:27:54.711714 7f2ceead6700  5 osd.0 pg_epoch: 7 pg[2.60( 
empty local-les=0 n=0 ec=1 les/c 4/5 7/7/7) [0,117,49] r=0 lpr=7 pi=1-6/2 
crt=0'0 mlcod 0'0 inactive] enter Start
    -5> 2014-06-05 10:27:54.711781 7f2ceead6700  1 osd.0 pg_epoch: 7 pg[2.60( 
empty local-les=0 n=0 ec=1 les/c 4/5 7/7/7) [0,117,49] r=0 lpr=7 pi=1-6/2 
crt=0'0 mlcod 0'0 inactive] state<Start>: transitioning to Primary
    -4> 2014-06-05 10:27:54.713010 7f2ceead6700  5 osd.0 pg_epoch: 7 pg[2.60( 
empty local-les=0 n=0 ec=1 les/c 4/5 7/7/7) [0,117,49] r=0 lpr=7 pi=1-6/2 
crt=0'0 mlcod 0'0 inactive] exit Start 0.001295 0 0.000000
    -3> 2014-06-05 10:27:54.713164 7f2ceead6700  5 osd.0 pg_epoch: 7 pg[2.60( 
empty local-les=0 n=0 ec=1 les/c 4/5 7/7/7) [0,117,49] r=0 lpr=7 pi=1-6/2 
crt=0'0 mlcod 0'0 inactive] enter Started/Primary
    -2> 2014-06-05 10:27:54.713234 7f2ceead6700  5 osd.0 pg_epoch: 7 pg[2.60( 
empty local-les=0 n=0 ec=1 les/c 4/5 7/7/7) [0,117,49] r=0 lpr=7 pi=1-6/2 
crt=0'0 mlcod 0'0 inactive] enter Started/Primary/Peering
    -1> 2014-06-05 10:27:54.713319 7f2ceead6700  5 osd.0 pg_epoch: 7 pg[2.60( 
empty local-les=0 n=0 ec=1 les/c 4/5 7/7/7) [0,117,49] r=0 lpr=7 pi=1-6/2 
crt=0'0 mlcod 0'0 peering] enter Started/Primary/Peering/GetInfo
     0> 2014-06-05 10:27:54.711223 7f2ce92cb700 -1 common/Thread.cc: In 
function 'void Thread::create(size_t)' thread 7f2ce92cb700 time 2014-06-05 
10:27:54.703693
common/Thread.cc: 110: FAILED assert(ret == 0)

ceph version 0.80 (b78644e7dee100e48dfeca32c9270a6b210d3003)
1: (Thread::create(unsigned long)+0x8a) [0xa82dea]
2: (SimpleMessenger::connect_rank(entity_addr_t const&, int, Connection*, 
Message*)+0x17b) [0xa2b01b]
3: (SimpleMessenger::get_connection(entity_inst_t const&)+0x180) [0xa2f6c0]
4: (OSDService::get_con_osd_cluster(int, unsigned int)+0x1d9) [0x6061c9]
5: (OSD::compat_must_dispatch_immediately(PG*)+0x296) [0x6064a6]
6: (OSD::process_peering_events(std::list<PG*, std::allocator<PG*> > const&, 
ThreadPool::TPHandle&)+0x4a6) [0x6475d6]
7: (OSD::PeeringWQ::_process(std::list<PG*, std::allocator<PG*> > const&, 
ThreadPool::TPHandle&)+0x16) [0x69dbf6]
8: (ThreadPool::worker(ThreadPool::WorkThread*)+0x551) [0xab4581]
9: (ThreadPool::WorkThread::entry()+0x10) [0xab75c0]
10: /lib64/libpthread.so.0() [0x3eebc079d1]
11: (clone()+0x6d) [0x3eeb4e8b6d]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to 
interpret this.

--- logging levels ---
   0/ 5 none
   0/ 1 lockdep
   0/ 1 context
   1/ 1 crush
   1/ 5 mds
   1/ 5 mds_balancer
   1/ 5 mds_locker
   1/ 5 mds_log
   1/ 5 mds_log_expire
   1/ 5 mds_migrator
   0/ 1 buffer
   0/ 1 timer
   0/ 1 filer
   0/ 1 striper
   0/ 1 objecter
   0/ 5 rados
   0/ 5 rbd
   0/ 5 journaler
   0/ 5 objectcacher
   0/ 5 client
   0/ 5 osd
   0/ 5 optracker
   0/ 5 objclass
   1/ 3 filestore
   1/ 3 keyvaluestore
   1/ 3 journal
   0/ 5 ms
   1/ 5 mon
   0/10 monc
   1/ 5 paxos
   0/ 5 tp
   1/ 5 auth
   1/ 5 crypto
   1/ 1 finisher
   1/ 5 heartbeatmap
   1/ 5 perfcounter
   1/ 5 rgw
   1/ 5 javaclient
   1/ 5 asok
   1/ 1 throttle
  -2/-2 (syslog threshold)
  -1/-1 (stderr threshold)
  max_recent     10000
  max_new         1000
  log_file /var/log/ceph/ceph-osd.0.log
--- end dump of recent events ---
2014-06-05 10:27:55.096296 7f2ca8628700  0 -- 192.168.40.11:6800/19542 >> 
192.168.40.11:6808/19811 pipe(0x4adc100 sd=629 :6800 s=2 pgs=111 cs=1 l=0 
c=0x4a0c780).fault with nothing to send, going to standby
2014-06-05 10:27:55.129281 7f2ce92cb700 -1 *** Caught signal (Aborted) **
in thread 7f2ce92cb700

ceph version 0.80 (b78644e7dee100e48dfeca32c9270a6b210d3003)
1: /usr/bin/ceph-osd() [0x9aa211]
2: /lib64/libpthread.so.0() [0x3eebc0f710]
3: (gsignal()+0x35) [0x3eeb432925]
4: (abort()+0x175) [0x3eeb434105]
5: (__gnu_cxx::__verbose_terminate_handler()+0x12d) [0x3eed8bea5d]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to 
interpret this.

--- begin dump of recent events ---
    -6> 2014-06-05 10:27:54.716422 7f2cec2d1700  1 -- 192.168.40.11:6801/19542 
<== osd.119 192.168.40.14:0/29977 3 ==== osd_ping(ping e7 stamp 2014-06-05 
10:27:54.713281) v2 ==== 47+0+0 (243568413 0 0) 0x4a68380 con 0x3e3c200
    -5> 2014-06-05 10:27:54.716480 7f2cec2d1700  1 -- 192.168.40.11:6801/19542 
--> 192.168.40.14:0/29977 -- osd_ping(ping_reply e7 stamp 2014-06-05 
10:27:54.713281) v2 -- ?+0 0x5436740 con 0x3e3c200
    -4> 2014-06-05 10:27:54.717351 7f2ced2d3700  1 -- 192.168.50.11:6801/19542 
<== osd.119 192.168.40.14:0/29977 3 ==== osd_ping(ping e7 stamp 2014-06-05 
10:27:54.713281) v2 ==== 47+0+0 (243568413 0 0) 0x4a68fc0 con 0x3e3c8e0
    -3> 2014-06-05 10:27:55.096211 7f2ca8628700  2 -- 192.168.40.11:6800/19542 
>> 192.168.40.11:6808/19811 pipe(0x4adc100 sd=629 :6800 s=2 pgs=111 cs=1 l=0 
c=0x4a0c780).reader couldn't read tag, (0) Success
    -2> 2014-06-05 10:27:55.096258 7f2ca8628700  2 -- 192.168.40.11:6800/19542 
>> 192.168.40.11:6808/19811 pipe(0x4adc100 sd=629 :6800 s=2 pgs=111 cs=1 l=0 
c=0x4a0c780).fault (0) Success
    -1> 2014-06-05 10:27:55.096296 7f2ca8628700  0 -- 192.168.40.11:6800/19542 
>> 192.168.40.11:6808/19811 pipe(0x4adc100 sd=629 :6800 s=2 pgs=111 cs=1 l=0 
c=0x4a0c780).fault with nothing to send, going to standby
     0> 2014-06-05 10:27:55.129281 7f2ce92cb700 -1 *** Caught signal (Aborted) 
**
in thread 7f2ce92cb700

ceph version 0.80 (b78644e7dee100e48dfeca32c9270a6b210d3003)
1: /usr/bin/ceph-osd() [0x9aa211]
2: /lib64/libpthread.so.0() [0x3eebc0f710]
3: (gsignal()+0x35) [0x3eeb432925]
4: (abort()+0x175) [0x3eeb434105]
5: (__gnu_cxx::__verbose_terminate_handler()+0x12d) [0x3eed8bea5d]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to 
interpret this.

--- logging levels ---
   0/ 5 none
   0/ 1 lockdep
   0/ 1 context
   1/ 1 crush
   1/ 5 mds
   1/ 5 mds_balancer
   1/ 5 mds_locker
   1/ 5 mds_log
   1/ 5 mds_log_expire
   1/ 5 mds_migrator
   0/ 1 buffer
   0/ 1 timer
   0/ 1 filer
   0/ 1 striper
   0/ 1 objecter
   0/ 5 rados
   0/ 5 rbd
   0/ 5 journaler
   0/ 5 objectcacher
   0/ 5 client
   0/ 5 osd
   0/ 5 optracker
   0/ 5 objclass
   1/ 3 filestore
   1/ 3 keyvaluestore
   1/ 3 journal
   0/ 5 ms
   1/ 5 mon
   0/10 monc
   1/ 5 paxos
   0/ 5 tp
   1/ 5 auth
   1/ 5 crypto
   1/ 1 finisher
   1/ 5 heartbeatmap
   1/ 5 perfcounter
   1/ 5 rgw
   1/ 5 javaclient
   1/ 5 asok
   1/ 1 throttle
  -2/-2 (syslog threshold)
  -1/-1 (stderr threshold)
  max_recent     10000
  max_new         1000
  log_file /var/log/ceph/ceph-osd.0.log
--- end dump of recent events ---


Wei Cao (Buddy)

_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

Reply via email to