To be clear, we are also using format 2 RBDs, so we didn't really expect it
to work until recently, as format 2 was listed as unsupported by the kernel
client.  Our understanding is that as of kernel 3.19, RBD format 2 should be
supported.  Are we incorrect in that understanding?
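
For reference, confirming the format of an image looks roughly like this
(pool and image names are just placeholders, output is only illustrative):

$ rbd info rbd/myimage | grep -E 'format|features'
        format: 2
        features: layering

and format 2 images can be created explicitly with
"rbd create --image-format 2 --size 10240 rbd/myimage".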

On Tue, Dec 8, 2015 at 3:44 AM, Tom Christensen <pav...@gmail.com> wrote:

> We haven't submitted a ticket as we've just avoided using the kernel
> client.  We've periodically tried with various kernels and various versions
> of Ceph over the last two years, but have just given up each time and
> reverted to using rbd-fuse, which, although not super stable, at least
> doesn't hang the client box.  We find ourselves in the position now where
> for additional functionality we *need* an actual block device, so we have
> to find a kernel client that works.  I will certainly keep you posted and
> can produce the output you've requested.
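>
> (To make the rbd-fuse vs. kernel client distinction concrete, roughly:
> rbd-fuse exposes each image as a plain file under a mountpoint, whereas the
> kernel client gives a real block device.  Pool/image names and the device
> number below are just placeholders:
>
> $ sudo rbd-fuse -p rbd /mnt/rbd-fuse    # images show up as files here
> $ sudo rbd map rbd/myimage              # kernel client; creates /dev/rbd0
> $ sudo mkfs.ext4 /dev/rbd0 && sudo mount /dev/rbd0 /mnt/myimage
>
> Only the latter gives us the actual block device we need.)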
>
> I'd also be willing to run an early 4.5 version in our test environment.
>
> On Tue, Dec 8, 2015 at 3:35 AM, Ilya Dryomov <idryo...@gmail.com> wrote:
>
>> On Tue, Dec 8, 2015 at 10:57 AM, Tom Christensen <pav...@gmail.com> wrote:
>> > We aren't running NFS, but regularly use the kernel driver to map RBDs
>> > and mount filesystems on them.  We see very similar behavior across
>> > nearly all kernel versions we've tried.  In my experience only a very
>> > few versions of the kernel driver survive any sort of CRUSH map
>> > change/update while something is mapped.  In fact, in the last two
>> > years I think I've only seen this work on one kernel version;
>> > unfortunately it's badly out of date and we can't run it in our
>> > environment anymore.  I think it was a 3.0 kernel running on Ubuntu
>> > 12.04.  We have just recently started trying to find a kernel that
>> > will survive OSD outages or changes to the cluster.  We're on Ubuntu
>> > 14.04, and have tried 3.16, 3.19.0-25, 4.3, and 4.2 without success in
>> > the last week.  We only map 1-3 RBDs per client machine at a time, but
>> > we regularly get processes stuck in D state while accessing the
>> > filesystem inside the RBD and have to hard reboot the RBD client
>> > machine.  This is always associated with a cluster change of some
>> > kind: reweighting OSDs, rebooting an OSD host, restarting an
>> > individual OSD, adding OSDs, and removing OSDs all cause the kernel
>> > client to hang.  If no change is made to the cluster, the kernel
>> > client will be happy for weeks.
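>> >
>> > When it hangs, spotting the stuck processes looks roughly like this
>> > (illustrative commands, not exact output); the sysrq dump of blocked
>> > tasks is how we typically confirm they're stuck in the RBD-backed
>> > filesystem:
>> >
>> > $ ps axo pid,stat,wchan,comm | awk '$2 ~ /^D/'
>> > # echo w > /proc/sysrq-trigger    # dump blocked (D state) tasks to dmesg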
>>
>> There are a couple of known bugs in the remap/resubmit area, but those
>> are supposedly corner cases (like *all* the OSDs going down and then
>> back up, etc).  I had no idea it was that severe and goes that far back.
>> Apparently triggering it requires a heavier load, as we've never seen
>> anything like that in our tests.
>>
>> For unrelated reasons, remap/resubmit code is getting entirely
>> rewritten for kernel 4.5, so, if you've been dealing with this issue
>> for the last two years (I don't remember seeing any tickets listing
>> that many kernel versions and not mentioning NFS), I'm afraid the best
>> course of action for you would be to wait for 4.5 to come out and try
>> it.  If you'd be willing to test out an early version on one or more of
>> your client boxes, I can ping you when it's ready.
>>
>> I'll take a look at 3.0 vs 3.16 with an eye on remap code.  Did you
>> happen to try 3.10?
>>
>> It sounds like you can reproduce this pretty easily.  Can you get it to
>> lock up and do:
>>
>> # cat /sys/kernel/debug/ceph/*/osdmap
>> # cat /sys/kernel/debug/ceph/*/osdc
>> $ ceph status
>>
>> a bunch of times?  I have a hunch that the kernel client simply fails to
>> request enough new osdmaps after the cluster topology changes under
>> load.
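>>
>> Concretely, something along these lines (illustrative, not exact output);
>> if the kernel client's epoch lags well behind the cluster's while requests
>> sit in osdc, that would point at the client falling behind on osdmaps:
>>
>> # grep -m1 epoch /sys/kernel/debug/ceph/*/osdmap    # kernel client's epoch
>> $ ceph osd dump | head -1                           # cluster's current epoch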
>>
>> Thanks,
>>
>>                 Ilya
>>
>
>
