Hi,

we are currently testing CephFS with the kernel module (4.17 and 4.18) instead 
of FUSE (which worked fine),

and we get hangs; iowait jumps like crazy for around 20 minutes.

The client is a QEMU 2.12 VM with a virtio-net interface.


In the client logs, we are seeing this kind of message:

[jeu. nov.  8 12:20:18 2018] libceph: osd14 x.x.x.x:6801 socket closed (con 
state OPEN)
[jeu. nov.  8 12:42:03 2018] libceph: osd9 x.x.x.x:6821 socket closed (con 
state OPEN)


and in the OSD logs:

osd14:
2018-11-08 12:20:25.247 7f31ffac8700  0 -- x.x.x.x:6801/1745 >> 
x.x.x.x:0/3678871522 conn(0x558c430ec300 :6801 
s=STATE_ACCEPTING_WAIT_CONNECT_MSG_AUTH pgs=0 cs=0 l=1).handle_connect_msg 
accept replacing existing (lossy) channel (new one lossy=1)

osd9:
2018-11-08 12:42:09.820 7f7ca970e700  0 -- x.x.x.x:6821/1739 >> 
x.x.x.x:0/3678871522 conn(0x564fcbec5100 :6821 
s=STATE_ACCEPTING_WAIT_CONNECT_MSG_AUTH pgs=0 cs=0 l=1).handle_connect_msg 
accept replacing existing (lossy) channel (new one lossy=1)


The cluster is Ceph 13.2.1.

Note that there is a physical firewall between client and server; I'm not sure 
yet whether it could be dropping the sessions. (I haven't found any related 
logs on the firewall.)

Any ideas? I would like to know whether this is a network bug or a Ceph bug 
(I'm not sure how to interpret the OSD logs).
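In case it helps others debug the same symptom: while the hang is in progress, 
it may be worth dumping the kernel client's in-flight requests from debugfs to 
see whether it is stuck waiting on a specific OSD or the MDS. A minimal sketch, 
assuming debugfs is mounted and using the `osdc`/`mdsc` files the upstream 
kernel ceph client exposes under /sys/kernel/debug/ceph:

```shell
#!/bin/sh
# Sketch: dump the kernel CephFS client's outstanding requests.
# Assumes debugfs is mounted at /sys/kernel/debug and a kernel
# ceph mount is active (otherwise it prints a fallback message).
ceph_debug_dump() {
    dbg=/sys/kernel/debug/ceph
    if [ -d "$dbg" ]; then
        for d in "$dbg"/*/; do
            echo "== $d =="
            cat "$d/osdc" 2>/dev/null   # requests outstanding against OSDs
            cat "$d/mdsc" 2>/dev/null   # requests outstanding against the MDS
        done
    else
        echo "no $dbg: debugfs not mounted or no kernel ceph client active"
    fi
}

ceph_debug_dump
```

If `osdc` shows requests pinned to the same OSDs that log "socket closed", 
that would point at the network path (e.g. the firewall timing out idle 
connections) rather than at the OSDs themselves.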

Regards,

Alexandre



Client ceph.conf
----------------
[client]
fuse_disable_pagecache = true
client_reconnect_stale = true


_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com