Re: [ceph-users] disk timeouts in libvirt/qemu VMs...

Marius Vaitiekunas Tue, 28 Mar 2017 01:06:07 -0700

On Mon, Mar 27, 2017 at 11:17 PM, Peter Maloney <
peter.malo...@brockmann-consult.de> wrote:


> I can't guarantee it's the same as my issue, but from that it sounds the
> same.
>
> Jewel 10.2.4, 10.2.5 tested
> hypervisors are proxmox qemu-kvm, using librbd
> 3 ceph nodes with mon+osd on each
>
> -faster journals, more disks, bcache, rbd_cache, fewer VMs on ceph, iops
> and bw limits on client side, jumbo frames, etc. all improve/smooth out
> performance and mitigate the hangs, but don't prevent it.
> -hangs are usually associated with blocked requests (I set the complaint
> time to 5s to see them)
> -hangs are very easily caused by rbd snapshot + rbd export-diff to do
> incremental backup (one snap persistent, plus one more during backup)
> -when qemu VM io hangs, I have to kill -9 the qemu process for it to
> stop. Some broken VMs don't appear to be hung until I try to live
> migrate them (live migrating all VMs helped test solutions)
>
> Finally I have a workaround... disable exclusive-lock, object-map, and
> fast-diff rbd features (and restart clients via live migrate).
> (object-map and fast-diff appear to have no effect on diff or export-diff
> ... so I don't miss them). I'll file a bug at some point (after I move
> all VMs back and see if it is still stable). And one other user on IRC
> said this solved the same problem (also using rbd snapshots).
>
> And strangely, they don't seem to hang if I put back those features,
> until a few days later (making testing much less easy...but now I'm very
> sure removing them prevents the issue)
>
> I hope this works for you (and maybe gets some attention from devs too),
> so you don't waste months like me.
>
> On 03/27/17 19:31, Hall, Eric wrote:
> > In an OpenStack (mitaka) cloud, backed by a ceph cluster (10.2.6 jewel),
> > using libvirt/qemu (1.3.1/2.5) hypervisors on Ubuntu 14.04.5 compute and
> > ceph hosts, we occasionally see hung processes (usually during boot, but
> > otherwise as well), with errors reported in the instance logs as shown
> > below.  Configuration is vanilla, based on openstack/ceph docs.
> >
> > Neither the compute hosts nor the ceph hosts appear to be overloaded in
> > terms of memory or network bandwidth, none of the 67 osds are over 80%
> > full, nor do any of them appear to be overwhelmed in terms of IO.  Compute
> > hosts and ceph cluster are connected via a relatively quiet 1Gb network,
> > with an IBoE net between the ceph nodes.  Neither network appears
> > overloaded.
> >
> > I don’t see any related (to my eye) errors in client or server logs,
> > even with 20/20 logging from various components (rbd, rados, client,
> > objectcacher, etc.)  I’ve increased the qemu file descriptor limit
> > (currently 64k... overkill for sure.)
> >
> > It “feels” like a performance problem, but I can’t find any capacity
> > issues or constraining bottlenecks.
> >
> > Any suggestions or insights into this situation are appreciated.  Thank
> > you for your time,
> > --
> > Eric
> >
> >
> > [Fri Mar 24 20:30:40 2017] INFO: task jbd2/vda1-8:226 blocked for more than 120 seconds.
> > [Fri Mar 24 20:30:40 2017]       Not tainted 3.13.0-52-generic #85-Ubuntu
> > [Fri Mar 24 20:30:40 2017] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> > [Fri Mar 24 20:30:40 2017] jbd2/vda1-8     D ffff88043fd13180     0   226      2 0x00000000
> > [Fri Mar 24 20:30:40 2017]  ffff88003728bbd8 0000000000000046 ffff880426900000 ffff88003728bfd8
> > [Fri Mar 24 20:30:40 2017]  0000000000013180 0000000000013180 ffff880426900000 ffff88043fd13a18
> > [Fri Mar 24 20:30:40 2017]  ffff88043ffb9478 0000000000000002 ffffffff811ef7c0 ffff88003728bc50
> > [Fri Mar 24 20:30:40 2017] Call Trace:
> > [Fri Mar 24 20:30:40 2017]  [<ffffffff811ef7c0>] ? generic_block_bmap+0x50/0x50
> > [Fri Mar 24 20:30:40 2017]  [<ffffffff81726d2d>] io_schedule+0x9d/0x140
> > [Fri Mar 24 20:30:40 2017]  [<ffffffff811ef7ce>] sleep_on_buffer+0xe/0x20
> > [Fri Mar 24 20:30:40 2017]  [<ffffffff817271b2>] __wait_on_bit+0x62/0x90
> > [Fri Mar 24 20:30:40 2017]  [<ffffffff811ef7c0>] ? generic_block_bmap+0x50/0x50
> > [Fri Mar 24 20:30:40 2017]  [<ffffffff81727257>] out_of_line_wait_on_bit+0x77/0x90
> > [Fri Mar 24 20:30:40 2017]  [<ffffffff810ab180>] ? autoremove_wake_function+0x40/0x40
> > [Fri Mar 24 20:30:40 2017]  [<ffffffff811f0afa>] __wait_on_buffer+0x2a/0x30
> > [Fri Mar 24 20:30:40 2017]  [<ffffffff8128bb4d>] jbd2_journal_commit_transaction+0x185d/0x1ab0
> > [Fri Mar 24 20:30:40 2017]  [<ffffffff810755df>] ? try_to_del_timer_sync+0x4f/0x70
> > [Fri Mar 24 20:30:40 2017]  [<ffffffff8128fe7d>] kjournald2+0xbd/0x250
> > [Fri Mar 24 20:30:40 2017]  [<ffffffff810ab140>] ? prepare_to_wait_event+0x100/0x100
> > [Fri Mar 24 20:30:40 2017]  [<ffffffff8128fdc0>] ? commit_timeout+0x10/0x10
> > [Fri Mar 24 20:30:40 2017]  [<ffffffff8108b5d2>] kthread+0xd2/0xf0
> > [Fri Mar 24 20:30:40 2017]  [<ffffffff8108b500>] ? kthread_create_on_node+0x1c0/0x1c0
> > [Fri Mar 24 20:30:40 2017]  [<ffffffff8173304c>] ret_from_fork+0x7c/0xb0
> > [Fri Mar 24 20:30:40 2017]  [<ffffffff8108b500>] ? kthread_create_on_node+0x1c0/0x1c0
> >
> >
> >
> > _______________________________________________
> > ceph-users mailing list
> > ceph-users@lists.ceph.com
> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
>
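
For reference, the workaround quoted above (disabling the exclusive-lock,
object-map, and fast-diff image features) might be scripted roughly as
below. This is only a sketch, not something taken from the thread: the
image spec "rbd/vm-disk" and the helper names are made up for
illustration, and it assumes the `rbd feature disable` subcommand of the
rbd CLI. fast-diff depends on object-map, which in turn depends on
exclusive-lock, so the features are disabled in that order. It also
assumes journaling is not enabled on the image.

```python
import subprocess

# Dependent features first: fast-diff needs object-map,
# and object-map needs exclusive-lock.
FEATURES_TO_DISABLE = ["fast-diff", "object-map", "exclusive-lock"]


def feature_disable_commands(image_spec):
    """Build one `rbd feature disable` command per feature.

    image_spec is "pool/image"; the value used below is hypothetical.
    """
    return [
        ["rbd", "feature", "disable", image_spec, feature]
        for feature in FEATURES_TO_DISABLE
    ]


def disable_features(image_spec, dry_run=True):
    """Print the commands (dry run) or execute them one by one."""
    for cmd in feature_disable_commands(image_spec):
        if dry_run:
            print(" ".join(cmd))
        else:
            # check=True stops at the first failing command.
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Hypothetical image name; the dry run only prints the commands.
    disable_features("rbd/vm-disk", dry_run=True)
```

As noted in the quoted message, running clients only pick up the change
after a restart, for example via live migration.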


Hi,

We are using these settings on hypervisors in openstack:
vm.dirty_ratio = 40
vm.dirty_background_ratio = 5

And these on vms:
vm.dirty_ratio = 10
vm.dirty_background_ratio = 5

In our case it prevents vms from crashing.
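
A minimal sketch of how these values could be checked against a running
host before changing anything. The role names ("hypervisor", "vm") and
the helper names are labels invented for this example; only the four
values come from the settings above. It assumes the usual Linux layout
where vm.dirty_ratio is exposed as /proc/sys/vm/dirty_ratio.

```python
from pathlib import Path

# Values from the settings listed above; role names are made up here.
DIRTY_SETTINGS = {
    "hypervisor": {"vm.dirty_ratio": 40, "vm.dirty_background_ratio": 5},
    "vm": {"vm.dirty_ratio": 10, "vm.dirty_background_ratio": 5},
}


def render_sysctl_conf(role):
    """Return sysctl-style 'key = value' lines for the given role."""
    return "".join(
        f"{key} = {value}\n" for key, value in DIRTY_SETTINGS[role].items()
    )


def read_current(key, proc_root="/proc/sys"):
    """Read a live sysctl value, e.g. vm.dirty_ratio -> vm/dirty_ratio."""
    path = Path(proc_root, *key.split("."))
    return int(path.read_text().split()[0])


def pending_changes(role, proc_root="/proc/sys"):
    """Map each key that differs to a (current, wanted) pair."""
    changes = {}
    for key, wanted in DIRTY_SETTINGS[role].items():
        current = read_current(key, proc_root)
        if current != wanted:
            changes[key] = (current, wanted)
    return changes


if __name__ == "__main__":
    print(render_sysctl_conf("hypervisor"), end="")
    print(pending_changes("hypervisor"))
```

The rendered lines can be placed in a file under /etc/sysctl.d/ and
loaded with `sysctl --system` to make them persistent across reboots.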

-- 
Marius Vaitiekūnas

