Re: [ceph-users] red IO hang (was disk timeouts in libvirt/qemu VMs...)

2017-06-23 Thread Hall, Eric
http://tracker.ceph.com/issues/20393 created with supporting logs/info noted. -- Eric On 6/23/17, 7:54 AM, "Jason Dillaman" wrote: On Fri, Jun 23, 2017 at 8:47 AM, Hall, Eric wrote: > I have debug logs. Should I open an RBD tracker ticket at http://tracker.ceph.com

Re: [ceph-users] red IO hang (was disk timeouts in libvirt/qemu VMs...)

2017-06-23 Thread Hall, Eric
s) layering? On Fri, Jun 23, 2017 at 1:46 AM, Hall, Eric wrote: > The problem seems to be reliably reproducible after a fresh reboot of the VM… > > With this knowledge, I can cause the hung IO condition while having noscrub and nodeepscrub set.

Re: [ceph-users] red IO hang (was disk timeouts in libvirt/qemu VMs...)

2017-06-23 Thread Hall, Eric
[1] http://tracker.ceph.com/issues/20041 On Wed, Jun 21, 2017 at 3:33 PM, Hall, Eric wrote: > The VMs are using stock Ubuntu14/16 images so yes, there is the default “/sbin/fstrim --all” in /etc/cron.weekly/fstrim. > > --

Re: [ceph-users] red IO hang (was disk timeouts in libvirt/qemu VMs...)

2017-06-22 Thread Hall, Eric
me or many of your VMs issuing periodic fstrims to discard > unused extents? > > On Wed, Jun 21, 2017 at 2:36 PM, Hall, Eric wrote: > > After following/changing all suggested items (turning off exclusive-lock > > (and associated object-map a

Re: [ceph-users] red IO hang (was disk timeouts in libvirt/qemu VMs...)

2017-06-21 Thread Hall, Eric
n 21, 2017 at 2:36 PM, Hall, Eric wrote: > After following/changing all suggested items (turning off exclusive-lock > (and associated object-map and fast-diff), changing host cache behavior, > etc.) this is still a blocking issue for many uses of our OpenStack/Ceph

Re: [ceph-users] red IO hang (was disk timeouts in libvirt/qemu VMs...)

2017-06-21 Thread Hall, Eric
sure removing them prevents the issue) I hope this works for you (and maybe gets some attention from devs too), so you don't waste months like me. On 03/27/17 19:31, Hall, Eric wrote: > In an OpenStack (mitaka) cloud, backed by a ceph cluster (10.2.6 jewel), > using libvirt/qem

[ceph-users] disk timeouts in libvirt/qemu VMs...

2017-03-27 Thread Hall, Eric
In an OpenStack (mitaka) cloud, backed by a ceph cluster (10.2.6 jewel), using libvirt/qemu (1.3.1/2.5) hypervisors on Ubuntu 14.04.5 compute and ceph hosts, we occasionally see hung processes (usually during boot, but otherwise as well), with errors reported in the instance logs as shown below.