Hello, All!

We observe the following behavior when QEMU is configured by libvirt to use the guest agent as usual and the guest has no virtio-serial driver (Windows, or a guest still in the BIOS stage).
In QEMU, on the first connect to the listening character device socket, the listen socket is removed from poll right after the accept(). virtio_serial_guest_ready() returns 0, so the descriptor of the connected Unix socket is removed from poll as well, and it will not be present in poll() until the guest initializes the driver and changes the state of the serial port to "guest connected".

In libvirt, connect() to the guest agent is performed on restart and runs under the VM state lock. connect() is blocking and can wait forever: the accept queue in QEMU is not polled, so it will fill up sooner or later. In this case libvirt cannot perform ANY operation on that VM. Weird!

The problem should be addressed from both sides IMHO, as it is bad to keep a stale connection in QEMU.

IMHO we should tweak io_watch_poll_prepare() to register for G_IO_ERR | G_IO_HUP | G_IO_NVAL even if reading is disabled, and handle connection closing gracefully. Some questions remain, though. What should we do with the data in the socket queue? Are we free to drop it, as the requester has left? Otherwise we would have to fix every QEMU client, including libvirtd.

Alternatively, we could keep polling the listen socket and close any new connections if there is already an active one.

Any opinion?

Den

P.S.
VM config for a quick reproducer, as usual:

    <controller type='virtio-serial' index='0'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/>
    </controller>
    <channel type='unix'>
      <target type='virtio' name='org.qemu.guest_agent.0'/>
      <address type='virtio-serial' controller='0' bus='0' port='1'/>
    </channel>

Quick reproducer:

    strace socat - UNIX-CONNECT:/var/lib/libvirt/qemu/channel/target/domain-20-vm1/org.qemu.guest_agent.0
    ^C (interrupt)
    strace socat - UNIX-CONNECT:/var/lib/libvirt/qemu/channel/target/domain-20-vm1/org.qemu.guest_agent.0
    ^C (interrupt)
    strace socat - UNIX-CONNECT:/var/lib/libvirt/qemu/channel/target/domain-20-vm1/org.qemu.guest_agent.0
    ^C (interrupt)
    strace socat - UNIX-CONNECT:/var/lib/libvirt/qemu/channel/target/domain-20-vm1/org.qemu.guest_agent.0
    ^C (interrupt)

Normal behavior: socat gets past connect() and waits in select().
Wrong behavior: socat hangs in connect().
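For anyone who wants to see the underlying socket behavior without a VM, here is a minimal standalone sketch. It is not QEMU or libvirt code; it only assumes Linux AF_UNIX stream semantics. A listener that never calls accept() stands in for the unpolled listen socket, and the clients use non-blocking connect() so that the point where a blocking connect() would hang shows up as an error instead. The function name, the socket path and the backlog value are made up for the illustration.

```python
import os
import socket
import tempfile


def count_connects_until_stall(backlog=1, limit=64):
    """Count connect() calls that succeed against a listener that never accept()s.

    The listener stands in for a listen socket that is no longer polled.
    Clients are non-blocking, so where a blocking connect() would hang they
    get an error (EAGAIN on Linux) and we stop counting.
    """
    with tempfile.TemporaryDirectory() as tmpdir:
        path = os.path.join(tmpdir, "agent.sock")  # hypothetical socket path
        listener = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        listener.bind(path)
        listener.listen(backlog)  # accept() is deliberately never called
        clients = []
        try:
            for _ in range(limit):
                client = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
                client.setblocking(False)
                try:
                    client.connect(path)
                except OSError:
                    # Accept queue is full: a blocking client would hang here.
                    client.close()
                    break
                clients.append(client)
            return len(clients)
        finally:
            for client in clients:
                client.close()
            listener.close()


if __name__ == "__main__":
    n = count_connects_until_stall()
    print("connect() succeeded", n, "times before the accept queue filled up")
```

The number of successful connects depends on the listen backlog and on the kernel, so the sketch only shows that the queue is finite, not how many attempts the socat reproducer needs.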