A quick update just to close out this thread:
After investigating with netstat I found that one ceph-osd process had three
TCP connections in the ESTABLISHED state but with no corresponding connection
state on the peer system (the client node that had previously been using the
RBD image). The qemu process on the client …
I would double-check your file descriptor limits on both sides -- OSDs
and the client. 131 sockets shouldn't make a difference. Is the port open
through any firewalls you have running?
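A quick way to eyeball the limits and current usage on both sides might be
something like the following (the OSD PID 22934 is taken from the netstat
output below; substitute the qemu PID on the client):

ceph3:~$ grep 'open files' /proc/22934/limits
ceph3:~$ sudo ls /proc/22934/fd | wc -l
ceph7:~$ grep 'open files' /proc/<qemu pid>/limits

If the fd count were anywhere near the "Max open files" soft limit, dropped
connections would be expected; 131 sockets is nowhere near the usual defaults.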
On Mon, Apr 24, 2017 at 8:14 PM, Phil Lacroute wrote:
> Yes it is the correct IP and port:
>
> ceph3:~$ netstat -anp | fgrep 192.168.206.13:6804
Yes it is the correct IP and port:
ceph3:~$ netstat -anp | fgrep 192.168.206.13:6804
tcp        0      0 192.168.206.13:6804     0.0.0.0:*               LISTEN      22934/ceph-osd
I turned up the logging on the OSD and I don’t think it received the request.
However I also noticed a large number …
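For reference, the logging bump can be done at runtime, and the admin socket
can show whether the op ever arrived. Assuming the daemon on port 6804 is
osd.11 (see the next message) and runs on ceph3, something along these lines:

ceph3:~$ sudo ceph tell osd.11 injectargs '--debug_ms 1 --debug_osd 20'
ceph3:~$ sudo ceph daemon osd.11 dump_ops_in_flight

An empty ops_in_flight list while the client log shows the request going out
would support the theory that the OSD never received it.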
Just to cover all the bases, is 192.168.206.13:6804 really associated
with a running daemon for OSD 11?
On Mon, Apr 24, 2017 at 4:23 PM, Phil Lacroute wrote:
> Jason,
>
> Thanks for the suggestion. That seems to show it is not the OSD that got
> stuck:
>
> ceph7:~$ sudo rbd -c debug/ceph.conf info app/image1
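One way to check that mapping is to ask the cluster which address osd.11 is
registered at and compare it against the netstat output, e.g. from any node
with an admin keyring:

ceph3:~$ ceph osd find 11
ceph3:~$ ceph osd dump | grep osd.11

If the registered address is not 192.168.206.13:6804, the client may be
talking to a stale entry from an old OSD map.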
Jason,
Thanks for the suggestion. That seems to show it is not the OSD that got stuck:
ceph7:~$ sudo rbd -c debug/ceph.conf info app/image1
…
2017-04-24 13:13:49.761076 7f739aefc700 1 -- 192.168.206.17:0/1250293899 -->
192.168.206.13:6804/22934 -- osd_op(client.4384.0:3 1.af6f1e38
rbd_header. …
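The contents of debug/ceph.conf aren't shown in the thread; a minimal
client-side config that produces messenger log lines like the ones above
would look roughly like this (the option names are standard, the exact
levels are a guess):

[client]
    debug ms = 1
    debug rbd = 20
    log to stderr = true

With debug ms = 1 the client logs every message it sends ('-->') and receives
('<=='), which is what makes a missing osd_op_reply visible.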
On Mon, Apr 24, 2017 at 2:53 PM, Phil Lacroute wrote:
> 2017-04-24 11:30:57.058233 7f5512ffd700 1 -- 192.168.206.17:0/3282647735
> --> 192.168.206.13:6804/22934 -- osd_op(client.4375.0:3 1.af6f1e38
> rbd_header.1058238e1f29 [call rbd.get_size,call rbd.get_object_prefix] snapc
> 0=[] ack+read+know…