Dear list,

I have a Debian bookworm host where I run three VMs using Qemu / KVM. Each VM has a monitor socket that I use to manage the VM while it is running. The main use of the sockets is shutting down the VMs. One VM runs Debian bookworm as well, the other two VMs run Windows Server 2022 Standard. In the Windows VMs, I have installed the VirtIO guest tools for Windows (version 0.1.240).

The problem is that the monitor sockets for all three VMs work fine for a while after the VMs have been started, but later the two sockets that belong to the Windows VMs seem to fail. The failure symptom is that these two sockets no longer react to connections. They neither report an error nor produce any other output. Instead, every connection to them just hangs forever until it is forcibly terminated. In contrast, the socket that belongs to the Linux VM still works (but this may be just by chance).

This situation is very bad because the monitor sockets are used on our servers to cleanly shut down the VMs in case of power outages, before the host itself is shut down and the UPS cuts the power.

After a couple of days of testing, I am now completely out of ideas and would like to ask for possible explanations and solutions.

Host software versions:

- Kernel 6.1.0-23-amd64 #1 SMP PREEMPT_DYNAMIC Debian 6.1.99-1 (2024-07-15) x86_64 GNU/Linux
- Qemu 7.2+dfsg-7+deb12u6

Snippet of the command line that creates the sockets (example for one of the VMs):

    /usr/bin/qemu-system-x86_64 \
    ....
    ....
    -monitor none \
    -chardev socket,path=/vm/sb-vm-modelmgr.sock,id=mon-sb-vm-modelmgr,server=on,wait=off,abstract=off \
    -mon chardev=mon-sb-vm-modelmgr,mode=readline
    ....
    ....

Command line to shut down that VM (can be entered directly in a shell for testing, but is actually part of a bash script):

    printf "%s\n" 'system_powerdown' | socat - unix-connect:/vm/sb-vm-modelmgr.sock

If I enter this command in a shell shortly after I have started the respective VM, it works as intended. But if I start that VM, wait for quite a while and then enter the command, the command hangs indefinitely and there is no output. As mentioned above, this happens with the two Windows VMs, but not with the Linux VM. I have not yet found out exactly how long it takes until the sockets start to fail.

To verify that I am not dealing with a broken socat, I replaced socat with netcat for testing. The outcome was the same.

I am nearly sure that I did not experience this problem in Debian buster (featuring Qemu 3.1+dfsg-8+deb10u12). Is there a known bug in Qemu 7.2 (compared to 3.1) that could cause the problem?

Thank you very much in advance for any advice, and best regards,

Binarus