On Mon, Mar 27, 2017 at 11:17 PM, Peter Maloney <peter.malo...@brockmann-consult.de> wrote:
> I can't guarantee it's the same as my issue, but from that it sounds the
> same.
>
> Jewel 10.2.4, 10.2.5 tested
> hypervisors are proxmox qemu-kvm, using librbd
> 3 ceph nodes with mon+osd on each
>
> -faster journals, more disks, bcache, rbd_cache, fewer VMs on ceph, iops
> and bw limits on client side, jumbo frames, etc. all improve/smooth out
> performance and mitigate the hangs, but don't prevent them.
> -hangs are usually associated with blocked requests (I set the complaint
> time to 5s to see them)
> -hangs are very easily caused by rbd snapshot + rbd export-diff to do
> incremental backup (one snap persistent, plus one more during backup)
> -when qemu VM io hangs, I have to kill -9 the qemu process for it to
> stop. Some broken VMs don't appear to be hung until I try to live
> migrate them (live migrating all VMs helped test solutions)
>
> Finally I have a workaround... disable exclusive-lock, object-map, and
> fast-diff rbd features (and restart clients via live migrate).
> (object-map and fast-diff appear to have no effect on diff or export-diff
> ... so I don't miss them). I'll file a bug at some point (after I move
> all VMs back and see if it is still stable). And one other user on IRC
> said this solved the same problem (also using rbd snapshots).
>
> And strangely, they don't seem to hang if I put back those features,
> until a few days later (making testing much less easy...but now I'm very
> sure removing them prevents the issue)
>
> I hope this works for you (and maybe gets some attention from devs too),
> so you don't waste months like me.
>
> On 03/27/17 19:31, Hall, Eric wrote:
> > In an OpenStack (mitaka) cloud, backed by a ceph cluster (10.2.6 jewel),
> > using libvirt/qemu (1.3.1/2.5) hypervisors on Ubuntu 14.04.5 compute and
> > ceph hosts, we occasionally see hung processes (usually during boot, but
> > otherwise as well), with errors reported in the instance logs as shown
> > below. Configuration is vanilla, based on openstack/ceph docs.
> >
> > Neither the compute hosts nor the ceph hosts appear to be overloaded in
> > terms of memory or network bandwidth, none of the 67 osds are over 80%
> > full, nor do any of them appear to be overwhelmed in terms of IO. Compute
> > hosts and ceph cluster are connected via a relatively quiet 1Gb network,
> > with an IBoE net between the ceph nodes. Neither network appears
> > overloaded.
> >
> > I don’t see any related (to my eye) errors in client or server logs,
> > even with 20/20 logging from various components (rbd, rados, client,
> > objectcacher, etc.) I’ve increased the qemu file descriptor limit
> > (currently 64k... overkill for sure.)
> >
> > It “feels” like a performance problem, but I can’t find any capacity
> > issues or constraining bottlenecks.
> >
> > Any suggestions or insights into this situation are appreciated. Thank
> > you for your time,
> > --
> > Eric
> >
> >
> > [Fri Mar 24 20:30:40 2017] INFO: task jbd2/vda1-8:226 blocked for more than 120 seconds.
> > [Fri Mar 24 20:30:40 2017] Not tainted 3.13.0-52-generic #85-Ubuntu
> > [Fri Mar 24 20:30:40 2017] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> > [Fri Mar 24 20:30:40 2017] jbd2/vda1-8 D ffff88043fd13180 0 226 2 0x00000000
> > [Fri Mar 24 20:30:40 2017] ffff88003728bbd8 0000000000000046 ffff880426900000 ffff88003728bfd8
> > [Fri Mar 24 20:30:40 2017] 0000000000013180 0000000000013180 ffff880426900000 ffff88043fd13a18
> > [Fri Mar 24 20:30:40 2017] ffff88043ffb9478 0000000000000002 ffffffff811ef7c0 ffff88003728bc50
> > [Fri Mar 24 20:30:40 2017] Call Trace:
> > [Fri Mar 24 20:30:40 2017] [<ffffffff811ef7c0>] ? generic_block_bmap+0x50/0x50
> > [Fri Mar 24 20:30:40 2017] [<ffffffff81726d2d>] io_schedule+0x9d/0x140
> > [Fri Mar 24 20:30:40 2017] [<ffffffff811ef7ce>] sleep_on_buffer+0xe/0x20
> > [Fri Mar 24 20:30:40 2017] [<ffffffff817271b2>] __wait_on_bit+0x62/0x90
> > [Fri Mar 24 20:30:40 2017] [<ffffffff811ef7c0>] ? generic_block_bmap+0x50/0x50
> > [Fri Mar 24 20:30:40 2017] [<ffffffff81727257>] out_of_line_wait_on_bit+0x77/0x90
> > [Fri Mar 24 20:30:40 2017] [<ffffffff810ab180>] ? autoremove_wake_function+0x40/0x40
> > [Fri Mar 24 20:30:40 2017] [<ffffffff811f0afa>] __wait_on_buffer+0x2a/0x30
> > [Fri Mar 24 20:30:40 2017] [<ffffffff8128bb4d>] jbd2_journal_commit_transaction+0x185d/0x1ab0
> > [Fri Mar 24 20:30:40 2017] [<ffffffff810755df>] ? try_to_del_timer_sync+0x4f/0x70
> > [Fri Mar 24 20:30:40 2017] [<ffffffff8128fe7d>] kjournald2+0xbd/0x250
> > [Fri Mar 24 20:30:40 2017] [<ffffffff810ab140>] ? prepare_to_wait_event+0x100/0x100
> > [Fri Mar 24 20:30:40 2017] [<ffffffff8128fdc0>] ? commit_timeout+0x10/0x10
> > [Fri Mar 24 20:30:40 2017] [<ffffffff8108b5d2>] kthread+0xd2/0xf0
> > [Fri Mar 24 20:30:40 2017] [<ffffffff8108b500>] ? kthread_create_on_node+0x1c0/0x1c0
> > [Fri Mar 24 20:30:40 2017] [<ffffffff8173304c>] ret_from_fork+0x7c/0xb0
> > [Fri Mar 24 20:30:40 2017] [<ffffffff8108b500>] ? kthread_create_on_node+0x1c0/0x1c0
> >
> > _______________________________________________
> > ceph-users mailing list
> > ceph-users@lists.ceph.com
> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

Hi,

We are using these settings on hypervisors in openstack:

vm.dirty_ratio = 40
vm.dirty_background_ratio = 5

And these on vms:

vm.dirty_ratio = 10
vm.dirty_background_ratio = 5

In our case this prevents the vms from crashing.

--
Marius Vaitiekūnas
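
For anyone who wants to try the dirty-page settings above persistently, here is a minimal sketch of a sysctl.d fragment. The file name is hypothetical and not something from this thread; only the four values come from the message. vm.dirty_background_ratio is the percentage of memory at which the kernel starts background writeback, and vm.dirty_ratio is the percentage at which a writing process is made to flush its own dirty pages, so a lower guest value keeps less unwritten data queued inside each VM. Whether this helps with the hangs reported earlier in the thread is untested here.

```
# Hypothetical file: /etc/sysctl.d/60-dirty-pages.conf
# Hypervisor hosts (values as quoted in the message above):
vm.dirty_ratio = 40
vm.dirty_background_ratio = 5

# Inside the VMs, use a separate copy of this file with these
# values instead (left commented out here):
# vm.dirty_ratio = 10
# vm.dirty_background_ratio = 5
```

A file like this can be loaded without a reboot with "sysctl -p /etc/sysctl.d/60-dirty-pages.conf" run as root, and the active values can be read back from /proc/sys/vm/dirty_ratio and /proc/sys/vm/dirty_background_ratio.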