Bug#842370: qemu qmp unix sockets stop working with 'connection refused' randomly

Brad Barnett Mon, 31 Oct 2016 01:51:41 -0700

So, here's the skinny.

Lots of background, but I have a series of scripts.  One starts/stops a
single qemu instance, and another calls that to start/stop all VMs.

The "all start/stop VMs" script is used on boot or shutdown, called by an
init script, and the single script does a variety of things -- like send
ACPI shutdown, waiting, etc, etc.

Anyhow, when scripting these things, I noticed this behaviour in qemu if I
attempt to start a VM that is already running:

$ kvm <lots of command line args plus..> -pidfile pidfile.pid
Could not acquire pid file: Resource temporarily unavailable

So, I skipped the "check if qemu is running" phase, and left the job
of not starting a running VM to qemu.  It should, after all, be best at
this.

(This was my thought process.)
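
For reference, the kind of "is it already running" check that was skipped
could look roughly like the sketch below.  It is Python rather than the
actual scripts, the function names are hypothetical, and it assumes the pid
file holds a bare PID as text.  It only checks that some process with that
PID exists, not that the process is really qemu, so treat it as a
best-effort guard.

```python
import os


def pid_from_file(pidfile):
    """Return the PID stored in `pidfile`, or None if missing or unparsable."""
    try:
        with open(pidfile) as fh:
            return int(fh.read().strip())
    except (OSError, ValueError):
        return None


def is_running(pidfile):
    """Best-effort check: is the PID recorded in `pidfile` a live process?"""
    pid = pid_from_file(pidfile)
    if pid is None or pid <= 0:
        return False
    try:
        # Signal 0 delivers nothing; it only checks existence and permissions.
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        # The process exists but belongs to another user.
        return True
    return True
```

A wrapper would call something like is_running("pidfile.pid") and only
launch kvm when it returns False, so the second instance never gets the
chance to touch the socket.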

However, whenever qemu starts, it opens / overwrites the socket, THEN sees
the PID file is locked, then closes the socket + exits.

And you are then left with the original running qemu, which has no usable
socket any more: the path now points at the socket of the instance that
just exited.
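
To illustrate that sequence outside of qemu, here is a minimal Python
sketch.  It is not qemu's code; it just mimics the order of operations
described above within a single process, and it assumes Linux unix-socket
behaviour.  The function name simulate_clobber and the file name demo.sock
are made up for the example.

```python
import os
import socket
import tempfile


def simulate_clobber(path):
    """Return what a client sees after a second binder replaces `path` and exits."""
    # First "instance": binds the path and keeps listening.
    first = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    first.bind(path)
    first.listen(1)

    # Second "instance": removes the existing socket file, binds a fresh
    # socket at the same path, then gives up and closes it (standing in
    # for the failed pid file check).
    second = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    os.unlink(path)
    second.bind(path)
    second.listen(1)
    second.close()  # closing does not remove the file from the filesystem

    # A client now reaches the dead socket file, not the first listener,
    # even though the first listener is still open.
    client = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        client.connect(path)
        result = "connected"
    except ConnectionRefusedError:
        result = "refused"
    finally:
        client.close()
        first.close()
    return result


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(simulate_clobber(os.path.join(tmp, "demo.sock")))
```

On Linux I would expect this to report a refused connection, which matches
what socat shows: the socket file still exists, but nothing is listening
behind it.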

The bare metal layer servers we're using have both production and
development VM instances, and those development instances do get
stopped and started much more often.

Meaning, it is often convenient to stop individual dev VMs one at a time,
but start them all back up at once.

And then the next day, or the next week, I'd notice the qmp socket was MIA.

So, there's my story.  I hope you enjoyed it.  I'll be at the concierge's
office while the police arrive.

On Fri, 28 Oct 2016 17:53:00 +0300
Michael Tokarev <m...@tls.msk.ru> wrote:

> Control: tag -1 moreinfo unreproducible
> 
> 28.10.2016 17:30, Brad Barnett wrote:
> > Package: qemu-system-x86
> > Version: 1:2.1+dfsg-12+deb8u6
> > 
> > I've been unable to find anything about this via Google searches,
> > mailing list searches, IRC, you name it.  Nor, does the documentation
> > seem to indicate I'm doing anything wrong.
> > 
> > Additional queries to qemu-devel and qemu-discuss fell flat, with zero
> > replies.
> > 
> > I'm starting qemu with various options, including this:
> > 
> > -qmp unix:/path_to_sock_dir/box.uniquename.sock,server,nowait
> > 
> > I'm using socat to connect:
> > 
> > socat UNIX-CONNECT:/path_to_sock_dir/box.uniquename.sock STDIO
> []
> > What happens is that randomly, socat can no longer connect to the
> > socket in question.  This has happened after successful uses of that
> > very socket, and also if I've never used that socket before.
> 
> So, when socat isn't able to connect anymore, what does
> 
>  lsof /path_to_sock_dir/box.uniquename.sock
> 
> say about it?  Does the socket exist at all?
> 
> There's no code in qemu to remove the socket or to mess with it in
> any other way during runtime, and there never has been.
> 
> The same code is used for all other unix sockets too, be it
> monitor (hmp), vnc or other things.
> 
> If the socket is being deleted or moved, it is done outside
> qemu.  For example, you can use the mv(1) or rm(1) tool to do so,
> and socat will return exactly the error you mentioned above.
> 
> If you try to start another qemu with the same socket, that
> socket will be overwritten too.
> 
> If you exit qemu listening on that socket, the socket will
> refuse connections as well.
> 
> I myself have used various sockets in qemu, including qmp, countless
> times, and right now all my VMs run with a qmp unix
> socket enabled.  Libvirt, the main qemu management interface,
> uses it too.
> 
> So I'm really unsure what's going on here. It looks like
> something specific to your environment, please check the
> above variants.
> 
> Thanks,
> 
> /mjt
