On Thu, Sep 18, 2014 at 03:21:08PM +0400, Andrey Korolyov wrote:
> On Thu, Sep 18, 2014 at 2:49 PM, Stefan Hajnoczi <stefa...@gmail.com> wrote:
> > On Wed, Sep 17, 2014 at 11:56:57PM +0400, Andrey Korolyov wrote:
> >> I've faced an issue with qemu VMs with very large uptime spans - half
> >> a year or so. They hang in the running state forever and are not
> >> killable in any imaginable fashion. I tried to freeze one via the
> >> freezer cg without any luck. The VM itself went unresponsive with zero
> >> CPU consumption after reaching the 'forever running' point.
> >>
> >> I am going to reset the host in a couple of hours, so any timely ideas
> >> for debugging this state will be much appreciated.
> >
> > A couple of shots at figuring out what the process is doing:
> >
> > cat /proc/$PID/stack
> > cat /proc/$PID/syscall
> > gdb $PID
> > (gdb) thread apply all bt
>
> Thanks Stefan,
>
> of course any attempts to attach to the process or dump core failed at
> the very beginning. I compared the proc contents with a live VM and found
> nothing suspicious. The question is what I should try when facing a
> supposed kernel bug, if there is no way to determine which code the
> emulator is currently executing. Also, if it helps, both affected VMs,
> on different hosts, have a similar process uptime (from the end of May).
> Just to repeat - the process does not react to any signal, has had zero
> CPU consumption since the bug appeared, and therefore cannot be
> stopped/frozen.

What did cat /proc/$PID/stack and cat /proc/$PID/syscall output?

Stefan
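
Since gdb cannot attach, the /proc files may be the only view into the process. QEMU runs several threads, so the per-thread copies under /proc/$PID/task/ are worth capturing as well. The following is only a rough sketch of one way to collect them in a single pass before the host is reset; the helper names (read_proc_file, thread_state, dump_threads) and the proc_root parameter are made up for illustration, and reading the stack file normally requires root.

```python
import os
import sys

# Per-thread files that hint at where a task is blocked in the kernel.
PER_THREAD_FILES = ("wchan", "syscall", "stack")


def read_proc_file(path):
    """Return the file contents, or a note explaining why it could not be read."""
    try:
        with open(path, "r", errors="replace") as fh:
            return fh.read().strip()
    except OSError as exc:
        return "<unreadable: %s>" % exc.strerror


def thread_state(status_text):
    """Extract the 'State:' field from the text of a status file."""
    for line in status_text.splitlines():
        if line.startswith("State:"):
            return line.split(":", 1)[1].strip()
    return "unknown"


def dump_threads(pid, proc_root="/proc"):
    """Collect state, wchan, syscall and stack for every thread of pid."""
    task_dir = os.path.join(proc_root, str(pid), "task")
    try:
        tids = sorted(int(name) for name in os.listdir(task_dir) if name.isdigit())
    except OSError as exc:
        return "cannot list %s: %s" % (task_dir, exc.strerror)
    out = []
    for tid in tids:
        base = os.path.join(task_dir, str(tid))
        status = read_proc_file(os.path.join(base, "status"))
        out.append("== thread %d, state: %s ==" % (tid, thread_state(status)))
        for name in PER_THREAD_FILES:
            out.append("-- %s --" % name)
            out.append(read_proc_file(os.path.join(base, name)))
    return "\n".join(out)


if __name__ == "__main__":
    if len(sys.argv) != 2:
        sys.exit("usage: %s PID" % sys.argv[0])
    print(dump_threads(int(sys.argv[1])))
```

Run it as root with the PID of the hung qemu process and save the output. A thread reported in state D (uninterruptible sleep) would be consistent with a process that ignores signals, and its stack file would then show where in the kernel it is waiting.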