Hi,

I am facing issues with some of my RBD volumes since yesterday. Some of
them hang completely at some point before eventually resuming IO, anywhere
from a few minutes to several hours later.

First and foremost, my setup: I already detailed it on the mailing list
[0][1]. Some changes have been made since: the 3 monitors are now VMs, and
we are trying kernel 4.4.5 on the clients (the cluster is still on the
CentOS 7 3.10 kernel).

Using EC pools, I already had some trouble with RBD features not supported
by EC [2], and about two weeks ago I set min_recency_* to 0 to avoid the
hassle. Everything had been working pretty smoothly since.
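
For reference, the change was along these lines (I'm assuming
min_recency_* refers to the two recency promote options on the cache tier;
the pool name is a placeholder):

  # on the cache tier pool (name is an example)
  ceph osd pool set rbd-cache min_read_recency_for_promote 0
  ceph osd pool set rbd-cache min_write_recency_for_promote 0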

All my volumes (currently 5) are on an EC pool with a writeback cache. Two
of them are perfectly fine. On the other 3, it's a different story: doing
IO is impossible. If I start a simple copy, I get a new file of a few dozen
MB (or sometimes 0 bytes), then it hangs. Doing dd with the direct and
sync flags shows the same behaviour.
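
For example, something along these lines writes a few dozen MB at most and
then hangs (paths are placeholders):

  # bypass the page cache and force synchronous writes
  dd if=/dev/zero of=/mnt/rbd-vol/testfile bs=1M count=100 oflag=direct,sync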

I tried switching back to 3.10, with no change. On the client I rebooted,
I currently cannot mount the filesystem: mount hangs (the volume seems
correctly mapped, however).
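
For what it's worth, this is how I checked the mapping, and where I assume
pending requests from the kernel client would show up (the debugfs
directory name depends on the fsid and client id):

  # confirm the volume is mapped
  rbd showmapped
  # in-flight OSD requests from the kernel client (needs debugfs mounted)
  cat /sys/kernel/debug/ceph/*/osdc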

strace on the cp command freezes in the middle of a read:

11:17:56 write(4,
"\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"...,
65536) = 65536
11:17:56 read(3,
"\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"...,
65536) = 65536
11:17:56 write(4,
"\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"...,
65536) = 65536
11:17:56 read(3,
"\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"...,
65536) = 65536
11:17:56 write(4,
"\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"...,
65536) = 65536
11:17:56 read(3,
"\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"...,
65536) = 65536
11:17:56 write(4,
"\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"...,
65536) = 65536
11:17:56 read(3,


I tried bumping up the logging, but I don't really know what to look for
exactly and didn't see anything obvious.
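
Roughly what I tried (assuming injectargs and the ops-in-flight dump are
the right tools here):

  # any blocked/slow requests reported cluster-wide?
  ceph health detail
  # raise OSD logging at runtime (example values)
  ceph tell osd.* injectargs '--debug_osd 20 --debug_ms 1'
  # on an OSD host, look for stuck ops on a given daemon
  ceph daemon osd.0 dump_ops_in_flight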

Any input or lead on how to debug this would be highly appreciated :)

Adrien

[0] http://www.spinics.net/lists/ceph-users/msg23990.html
[1] http://lists.ceph.com/pipermail/ceph-users-ceph.com/2016-January/007004.html
[2] http://lists.ceph.com/pipermail/ceph-users-ceph.com/2016-February/007746.html