Hello,
I am testing Ceph v0.72.

The cluster has 3 machines:
each machine has about 10 OSDs,
each machine has 1 mon, and
two of the machines run an active/standby MDS pair.

After rebooting two of the machines, the cluster status returned to
HEALTH_OK, but I found that osd.11 had been marked out. osd.11.log shows
the following:

2014-04-04 10:40:38.461566 7fb8856f1700  0 -- 192.160.8.22:0/2924 >>
192.160.8.24:6824/4055 pipe(0x5916280 sd=97 :0 s=1 pgs=0 cs=0 l=1
c=0x5c53e40).fault
2014-04-04 10:40:48.746926 7fb88e273700 -1 osd.11 501 *** Got signal
Terminated ***
2014-04-04 10:40:48.972408 7fb89f6b8700  0 monclient: hunting for new mon
2014-04-04 10:43:04.743035 7fd8b7f31780  0 ceph version 0.72-136-g340231d
(340231d07b2fe09c64f43c83d99668d6ccdafa49), process ceph-osd, pid 2924
2014-04-04 10:43:04.778431 7fd8b7f31780  1 filestore(/data/osd.11) mount
detected xfs
2014-04-04 10:43:04.778444 7fd8b7f31780  1 filestore(/data/osd.11)
 disabling 'filestore replica fadvise' due to known issues with
fadvise(DONTNEED) on xfs
2014-04-04 10:43:04.844471 7fd8b7f31780  0
genericfilestorebackend(/data/osd.11) detect_features: FIEMAP ioctl is
supported and appears to work
2014-04-04 10:43:04.844483 7fd8b7f31780  0
genericfilestorebackend(/data/osd.11) detect_features: FIEMAP ioctl is
disabled via 'filestore fiemap' config option
2014-04-04 10:43:04.885842 7fd8b7f31780  0
genericfilestorebackend(/data/osd.11) detect_features: syscall(SYS_syncfs,
fd) fully supported
2014-04-04 10:43:05.106793 7fd8b7f31780  0 filestore(/data/osd.11) mount:
enabling WRITEAHEAD journal mode: checkpoint is not enabled
2014-04-04 10:43:05.808399 7fd8b7f31780  1 journal _open
/dev/mapper/vg0-lv1 fd 20: 24746393600 bytes, block size 4096 bytes,
directio = 0, aio = 0
2014-04-04 10:43:06.126067 7fd8b7f31780  1 journal _open
/dev/mapper/vg0-lv1 fd 20: 24746393600 bytes, block size 4096 bytes,
directio = 0, aio = 0
2014-04-04 10:43:06.310994 7fd8b7f31780  1 journal close /dev/mapper/vg0-lv1
2014-04-04 10:43:06.313928 7fd8b7f31780  1 filestore(/data/osd.11) mount
detected xfs
2014-04-04 10:43:06.369708 7fd8b7f31780  0
genericfilestorebackend(/data/osd.11) detect_features: FIEMAP ioctl is
supported and appears to work
2014-04-04 10:43:06.369721 7fd8b7f31780  0
genericfilestorebackend(/data/osd.11) detect_features: FIEMAP ioctl is
disabled via 'filestore fiemap' config option
2014-04-04 10:43:06.419372 7fd8b7f31780  0
genericfilestorebackend(/data/osd.11) detect_features: syscall(SYS_syncfs,
fd) fully supported
2014-04-04 10:43:06.503796 7fd8b7f31780  0 filestore(/data/osd.11) mount:
enabling WRITEAHEAD journal mode: checkpoint is not enabled
2014-04-04 10:43:06.511013 7fd8b7f31780  1 journal _open
/dev/mapper/vg0-lv1 fd 21: 24746393600 bytes, block size 4096 bytes,
directio = 0, aio = 0
2014-04-04 10:43:06.516424 7fd8b7f31780  1 journal _open
/dev/mapper/vg0-lv1 fd 21: 24746393600 bytes, block size 4096 bytes,
directio = 0, aio = 0
2014-04-04 10:43:06.661511 7fd8b7f31780  0 <cls>
cls/hello/cls_hello.cc:271: loading cls_hello
2014-04-04 10:43:10.656604 7fd894026700 -1 msg/Pipe.cc: In function 'int
Pipe::connect()' thread 7fd894026700 time 2014-04-04 10:43:10.652182
msg/Pipe.cc: 1043: FAILED assert(m)

The log for thread 7fd894026700 shows this:
2014-04-04 10:43:10.656604 7fd894026700 -1 msg/Pipe.cc: In function 'int
Pipe::connect()' thread 7fd894026700 time 2014-04-04 10:43:10.652182
  -154> 2014-04-04 10:43:10.652092 7fd894026700  2 -- 192.160.8.22:6802/2924>>
192.160.4.23:6821/3355 pipe(0x58bd680 sd=74 :52809 s=1 pgs=0 cs=0 l=0
c=0x58e1a20). got newly_acked_seq 135 vs out_seq 0
  -151> 2014-04-04 10:43:10.652155 7fd894026700  2 -- 192.160.8.22:6802/2924>>
192.160.4.23:6821/3355 pipe(0x58bd680 sd=74 :52809 s=1 pgs=0 cs=0 l=0
c=0x58e1a20). discarding previously sent 0 pg_query(1.d epoch 501) v2
     0> 2014-04-04 10:43:10.656604 7fd894026700 -1 msg/Pipe.cc: In function
'int Pipe::connect()' thread 7fd894026700 time 2014-04-04 10:43:10.652182

I found that osd.7 has the same entity_addr_t (192.160.8.22:6802/2924)
when connecting to 192.160.4.23:6821/3355 both before and after the reboot.

This problem looks similar to this issue (http://tracker.ceph.com/issues/6992),
but I don't know whether they are the same issue.

Can someone help?
Best regards,

houbin
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
