Hi Mike,

you might be the guy StefanHa was referring to on the qemu-devel mailing list.
I just made some more tests, so…

On 02.08.2013 at 23:47, Mike Dawson <mike.daw...@cloudapt.com> wrote:

> Oliver,
>
> We've had a similar situation occur. For about three months, we've run
> several Windows 2008 R2 guests with virtio drivers that record video
> surveillance. We have long suffered an issue where the guest appears to hang
> indefinitely (or until we intervene). For the sake of this conversation, we
> call this state "wedged", because it appears something (rbd, qemu, virtio,
> etc) gets stuck on a deadlock. When a guest gets wedged, we see the following:
>
> - the guest will not respond to pings

When the hung_task message shows up, I can still ping the guest and establish
new ssh sessions; only the session running the while loop no longer accepts
any keyboard input.

> - the qemu-system-x86_64 process drops to 0% cpu
> - graphite graphs show the interface traffic dropping to 0bps
> - the guest will stay wedged forever (or until we intervene)
> - strace of qemu-system-x86_64 shows QEMU is making progress [1][2]

Nothing special here:

5, events=POLLIN}, {fd=7, events=POLLIN}, {fd=6, events=POLLIN}, {fd=19, events=POLLIN}, {fd=15, events=POLLIN}, {fd=4, events=POLLIN}], 11, -1) = 1 ([{fd=12, revents=POLLIN}])
[pid 11793] read(5, 0x7fff16b61f00, 16) = -1 EAGAIN (Resource temporarily unavailable)
[pid 11793] read(12, "\2\0\0\0\0\0\0\0\0\0\0\0\0\361p\0\252\340\374\373\373!gH\10\0E\0\0Yq\374"..., 69632) = 115
[pid 11793] read(12, 0x7f0c1737fcec, 69632) = -1 EAGAIN (Resource temporarily unavailable)
[pid 11793] poll([{fd=27, events=POLLIN|POLLERR|POLLHUP}, {fd=26, events=POLLIN|POLLERR|POLLHUP}, {fd=24, events=POLLIN|POLLERR|POLLHUP}, {fd=12, events=POLLIN|POLLERR|POLLHUP}, {fd=3, events=POLLIN|POLLERR|POLLHUP}, {fd=

and the same for many, many threads. Inside the VM I see 75% iowait, but I can
restart the spew test in a second session. All of this was tested with
rbd_cache=false and cache=none. I am also testing every qemu version with a
2-CPU, 2 GiB mem Windows 7 VM under some high load and have encountered no
problem so far; it runs smooth and fast.

> We can "un-wedge" the guest by opening a NoVNC session or running a 'virsh
> screenshot' command. After that, the guest resumes and runs as expected. At
> that point we can examine the guest. Each time we'll see:
>
> - No Windows error logs whatsoever while the guest is wedged
> - A time sync typically occurs right after the guest gets un-wedged
> - Scheduled tasks do not run while wedged
> - Windows error logs do not show any evidence of suspend, sleep, etc
>
> We had so many issues with guests becoming wedged, we wrote a script to 'virsh
> screenshot' them via cron. Then we installed some updates and had a month or
> so of higher stability (wedging happened maybe 1/10th as often). Until today
> we couldn't figure out why.
>
> Yesterday, I realized qemu was starting the instances without specifying
> cache=writeback. We corrected that, and let them run overnight. With RBD
> writeback re-enabled, wedging came back as often as we had seen in the past.
> I've counted ~40 occurrences in the past 12-hour period. So I feel like
> writeback caching in RBD certainly makes the deadlock more likely to occur.
>
> Joshd asked us to gather RBD client logs:
>
> "joshd> it could very well be the writeback cache not doing a callback at
> some point - if you could gather logs of a vm getting stuck with debug rbd =
> 20, debug ms = 1, and debug objectcacher = 30 that would be great"
>
> We'll do that over the weekend. If you could as well, we'd love the help!
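For reference, those debug settings would go into the [client] section of
ceph.conf on the qemu host, roughly like this (the log file path is only an
example, adjust to taste):

    [client]
        debug rbd = 20
        debug ms = 1
        debug objectcacher = 30
        log file = /var/log/ceph/client.$name.$pid.log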
> [1] http://www.gammacode.com/kvm/wedged-with-timestamps.txt
> [2] http://www.gammacode.com/kvm/not-wedged.txt

As I wrote above, we run without cache so far, so I'm omitting the verbose
debugging for the moment, but I will do it if requested. Thanks for your
report,

Oliver.

> Thanks,
>
> Mike Dawson
> Co-Founder & Director of Cloud Architecture
> Cloudapt LLC
> 6330 East 75th Street, Suite 170
> Indianapolis, IN 46250
>
> On 8/2/2013 6:22 AM, Oliver Francke wrote:
>> Well,
>>
>> I believe I'm the winner of buzzword bingo for today.
>>
>> But seriously speaking... as I don't have this particular problem with
>> qcow2 on kernel 3.2, nor with qemu-1.2.2, nor with newer kernels, I hope
>> I'm not alone here?
>> We have a rising number of tickets from people reinstalling from ISOs
>> with the 3.2 kernel.
>>
>> A fast fallback is to start all VMs with qemu-1.2.2, but then we lose
>> some features a la latency-free RBD cache ;)
>>
>> I just opened a bug for qemu:
>>
>> https://bugs.launchpad.net/qemu/+bug/1207686
>>
>> with all the dirty details.
>>
>> Installing a backport kernel 3.9.x or upgrading the Ubuntu kernel to 3.8.x
>> "fixes" it. So we have a bad combination for all distros with a 3.2 kernel
>> and rbd as storage backend, I assume.
>>
>> Any similar findings?
>> Any idea of tracing/debugging ( Josh? ;) ) very welcome,
>>
>> Oliver.
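P.S.: In case anyone wants to automate the "un-wedge" workaround Mike
describes above, here is a rough, untested sketch of a cron-driven 'virsh
screenshot' script; the output directory is just a placeholder:

#!/usr/bin/env python
# Rough sketch: take a screenshot of every running libvirt domain, which
# appears to be enough to un-wedge a stuck guest. Run it from cron, e.g.
# every few minutes. Untested; the output directory is a placeholder.
import datetime
import subprocess

def running_domains():
    # 'virsh list --name' prints one running domain name per line.
    out = subprocess.check_output(["virsh", "list", "--name"])
    return [name.strip() for name in out.decode().splitlines() if name.strip()]

def screenshot(domain):
    stamp = datetime.datetime.now().strftime("%Y%m%d-%H%M%S")
    target = "/var/tmp/unwedge-%s-%s.ppm" % (domain, stamp)  # placeholder path
    subprocess.call(["virsh", "screenshot", domain, target])

if __name__ == "__main__":
    for dom in running_domains():
        screenshot(dom)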