On 2/20/26 3:30 PM, Fiona Ebner wrote:
On 20.02.26 at 10:36 AM, Dominik Csapak wrote:
On 2/19/26 2:27 PM, Fiona Ebner wrote:
On 19.02.26 at 11:15 AM, Dominik Csapak wrote:
On 2/16/26 10:15 AM, Fiona Ebner wrote:
On 16.02.26 at 9:42 AM, Fabian Grünbichler wrote:
On February 13, 2026 2:16 pm, Fiona Ebner wrote:

I guess the actual need is to have more consistent behavior.


ok so i think we'd need to
* create a cleanup flag for each vm when qmeventd detects a vm shutting
down (in /var/run/qemu-server/VMID.cleanup, possibly with timestamp)
* remove that cleanup flag after cleanup (obviously)
* on start, check for that flag and block for some timeout before
starting (e.g. if the timestamp in the flag is older than some limit,
start it regardless?)
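
The three steps above can be sketched roughly as follows. This is only an illustration in Python, not qemu-server code: the function names, the `run_dir` parameter (standing in for /var/run/qemu-server), the `max_age` limit and the "timestamp as file content" format are all assumptions made for the example.

```python
import os
import tempfile
import time

def flag_path(run_dir, vmid):
    # Mirrors the proposed <run_dir>/VMID.cleanup layout.
    return os.path.join(run_dir, f"{vmid}.cleanup")

def set_cleanup_flag(run_dir, vmid, now=None):
    """Record that cleanup for vmid is pending; content is a Unix timestamp."""
    now = time.time() if now is None else now
    with open(flag_path(run_dir, vmid), "w") as fh:
        fh.write(f"{now}\n")

def clear_cleanup_flag(run_dir, vmid):
    """Remove the flag after cleanup; a missing flag is not an error."""
    try:
        os.unlink(flag_path(run_dir, vmid))
    except FileNotFoundError:
        pass

def cleanup_pending(run_dir, vmid, max_age, now=None):
    """True if a flag exists and is younger than max_age seconds."""
    now = time.time() if now is None else now
    try:
        with open(flag_path(run_dir, vmid)) as fh:
            stamp = float(fh.read().strip())
    except (FileNotFoundError, ValueError):
        return False  # no flag or unreadable content: do not block the start
    return now - stamp < max_age

def wait_for_cleanup(run_dir, vmid, timeout, max_age, poll=0.05):
    """Block until cleanup is no longer pending or the timeout expires.

    Returns True if the flag went away (or is stale), False on timeout.
    """
    deadline = time.monotonic() + timeout
    while cleanup_pending(run_dir, vmid, max_age):
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll)
    return True

# Demo in a throwaway directory instead of /var/run/qemu-server.
demo_dir = tempfile.mkdtemp()
set_cleanup_flag(demo_dir, 102)
blocked = not wait_for_cleanup(demo_dir, 102, timeout=0.1, max_age=30)
clear_cleanup_flag(demo_dir, 102)
free = wait_for_cleanup(demo_dir, 102, timeout=0.1, max_age=30)
```

Whether a start should proceed or fail after the timeout is left to the caller here, since that question is still open in the proposal.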

Sounds good to me.

Unfortunately, something else: turns out that we kinda rely on qmeventd
not doing the cleanup for the optimization with keeping the volumes
active (i.e. $keepActive). And actually, the optimization applies
randomly depending on who wins the race.

Output below with added log line
"doing cleanup for $vmid with keepActive=$keepActive"
in vm_stop_cleanup() to be able to see what happens.

We try to use the optimization but qmeventd interferes:

Feb 19 14:09:43 pve9a1 vzdump[168878]: <root@pam> starting task UPID:pve9a1:000293AF:0017CFF8:69970B97:vzdump:102:root@pam:
Feb 19 14:09:43 pve9a1 vzdump[168879]: INFO: starting new backup job: vzdump 102 --storage pbs --mode stop
Feb 19 14:09:43 pve9a1 vzdump[168879]: INFO: Starting Backup of VM 102 (qemu)
Feb 19 14:09:44 pve9a1 qm[168960]: shutdown VM 102: UPID:pve9a1:00029400:0017D035:69970B98:qmshutdown:102:root@pam:
Feb 19 14:09:44 pve9a1 qm[168959]: <root@pam> starting task UPID:pve9a1:00029400:0017D035:69970B98:qmshutdown:102:root@pam:
Feb 19 14:09:47 pve9a1 qm[168960]: VM 102 qga command failed - VM 102 qga command 'guest-ping' failed - got timeout
Feb 19 14:09:50 pve9a1 qmeventd[166736]: read: Connection reset by peer
Feb 19 14:09:50 pve9a1 pvedaemon[166884]: <root@pam> end task UPID:pve9a1:000290CD:0017B515:69970B52:vncproxy:102:root@pam: OK
Feb 19 14:09:50 pve9a1 systemd[1]: 102.scope: Deactivated successfully.
Feb 19 14:09:50 pve9a1 systemd[1]: 102.scope: Consumed 41.780s CPU time, 1.9G memory peak.
Feb 19 14:09:51 pve9a1 qm[168960]: doing cleanup for 102 with keepActive=1
Feb 19 14:09:51 pve9a1 qm[168959]: <root@pam> end task UPID:pve9a1:00029400:0017D035:69970B98:qmshutdown:102:root@pam: OK
Feb 19 14:09:51 pve9a1 qmeventd[168986]: Starting cleanup for 102
Feb 19 14:09:51 pve9a1 qm[168986]: doing cleanup for 102 with keepActive=0
Feb 19 14:09:51 pve9a1 qmeventd[168986]: Finished cleanup for 102
Feb 19 14:09:51 pve9a1 systemd[1]: Started 102.scope.
Feb 19 14:09:51 pve9a1 vzdump[168879]: VM 102 started with PID 169021.

We manage to get the optimization:

Feb 19 14:16:01 pve9a1 qm[174585]: shutdown VM 102: UPID:pve9a1:0002A9F9:0018636B:69970D11:qmshutdown:102:root@pam:
Feb 19 14:16:04 pve9a1 qm[174585]: VM 102 qga command failed - VM 102 qga command 'guest-ping' failed - got timeout
Feb 19 14:16:07 pve9a1 qmeventd[166736]: read: Connection reset by peer
Feb 19 14:16:07 pve9a1 systemd[1]: 102.scope: Deactivated successfully.
Feb 19 14:16:07 pve9a1 systemd[1]: 102.scope: Consumed 46.363s CPU time, 2G memory peak.
Feb 19 14:16:08 pve9a1 qm[174585]: doing cleanup for 102 with keepActive=1
Feb 19 14:16:08 pve9a1 qm[174582]: <root@pam> end task UPID:pve9a1:0002A9F9:0018636B:69970D11:qmshutdown:102:root@pam: OK
Feb 19 14:16:08 pve9a1 systemd[1]: Started 102.scope.
Feb 19 14:16:08 pve9a1 qmeventd[174685]: Starting cleanup for 102
Feb 19 14:16:08 pve9a1 qmeventd[174685]: trying to acquire lock...
Feb 19 14:16:08 pve9a1 vzdump[174326]: VM 102 started with PID 174718.
Feb 19 14:16:08 pve9a1 qmeventd[174685]:  OK
Feb 19 14:16:08 pve9a1 qmeventd[174685]: vm still running

For regular shutdown, we'll also do the cleanup twice.
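
To make the difference between the two traces above explicit, here is a toy replay in Python. It is a deliberately simplified model, not the real code paths: the event names and `run_trace` are invented for illustration, and it assumes that a cleanup with keepActive=0 deactivates the volumes and that the following start then has to activate them again.

```python
def run_trace(events):
    """Replay an ordered event list and count cleanups and re-activations.

    Events (hypothetical names):
      "qm_cleanup"        cleanup from the shutdown task, keepActive=1
      "qmeventd_cleanup"  cleanup triggered by qmeventd, keepActive=0
      "vm_start"          the VM is started again
    """
    running = False
    volumes_active = True  # still active from the VM that just stopped
    cleanups = 0
    reactivations = 0
    for ev in events:
        if ev == "qm_cleanup":
            cleanups += 1           # keepActive=1: volumes stay active
        elif ev == "qmeventd_cleanup":
            if running:
                continue            # "vm still running": nothing is done
            cleanups += 1
            volumes_active = False  # keepActive=0: volumes are deactivated
        elif ev == "vm_start":
            if not volumes_active:
                reactivations += 1  # optimization lost
                volumes_active = True
            running = True
        else:
            raise ValueError(f"unknown event: {ev}")
    return {"cleanups": cleanups, "reactivations": reactivations}

# First trace: qmeventd gets its cleanup in before the start.
lost = run_trace(["qm_cleanup", "qmeventd_cleanup", "vm_start"])
# Second trace: the start wins, qmeventd only sees a running VM.
kept = run_trace(["qm_cleanup", "vm_start", "qmeventd_cleanup"])
```

In this model the first ordering does two cleanups and one re-activation, the second does one cleanup and none, which matches what the log lines suggest.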

Maybe we also need a way to tell qmeventd that we already did the
cleanup?


ok well then i'd try to do something like this:

in 'vm_stop' we'll create a cleanup flag with timestamp + state (e.g.
'queued')

in vm_stop_cleanup we change/create the flag with
'started' and clear the flag after cleanup

Why is the one in vm_stop needed? Is there any advantage over creating
it directly in vm_stop_cleanup()?


after a bit of experimenting and re-reading the code, i think
I can simplify the logic

* at the beginning of vm_stop, we create the cleanup flag
* in 'qm cleanup', we only do the cleanup if the flag does not exist
* in 'vm_start' we clear the flag

this should work because these parts are under a config lock anyway:
* from vm_stop to vm_stop_cleanup
* most of the qm cleanup code
* vm_start

so we only really have to mark that the cleanup was done from
the vm_stop codepath

(we have to create the flag at the beginning of vm_stop, because
then there is no race between calling its cleanup and qmeventd
picking up the vanishing process)
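
A rough Python sketch of that simplified scheme, only to check the reasoning. Everything in it is a stand-in: `VMState`, the in-memory flag and the `threading.Lock` playing the config lock are assumptions for the example, and the "still running" check is taken from the qmeventd output quoted earlier.

```python
import threading

class VMState:
    """Toy stand-in for one VM: config lock, cleanup flag, run state, log."""
    def __init__(self):
        self.config_lock = threading.Lock()
        self.cleanup_done_flag = False
        self.running = True
        self.log = []

def vm_stop(vm, keep_active):
    with vm.config_lock:
        # Set the flag first, before the process can vanish.
        vm.cleanup_done_flag = True
        vm.running = False
        vm.log.append(f"vm_stop cleanup keepActive={int(keep_active)}")

def qm_cleanup(vm):
    """What qmeventd triggers once it notices the vanished process."""
    with vm.config_lock:
        if vm.running:
            vm.log.append("qm cleanup skipped: vm still running")
        elif vm.cleanup_done_flag:
            vm.log.append("qm cleanup skipped: done by vm_stop")
        else:
            vm.log.append("qm cleanup keepActive=0")

def vm_start(vm):
    with vm.config_lock:
        vm.cleanup_done_flag = False
        vm.running = True
        vm.log.append("vm_start")

# Ordering 1: qmeventd's cleanup runs before the restart.
a = VMState()
vm_stop(a, keep_active=True)
qm_cleanup(a)
vm_start(a)

# Ordering 2: the restart wins, qmeventd's cleanup comes late.
b = VMState()
vm_stop(b, keep_active=True)
vm_start(b)
qm_cleanup(b)

# Guest-initiated shutdown: no vm_stop, so qmeventd must do the cleanup.
c = VMState()
c.running = False
qm_cleanup(c)
```

In both orderings only the vm_stop path performs a cleanup, while a shutdown that never goes through vm_stop is still cleaned up via qmeventd.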

does that make sense to you?

(if the flag is already there in 'started' state and within the time limit, ignore it)

in vm_start we block until the cleanup flag is gone or until some timeout

in 'qm cleanup' we only start it if the flag does not exist

Hmm, it does also call vm_stop_cleanup() so we could just re-use the
check there for that part? I guess doing an early check doesn't hurt
either, as long as we do call the post-stop hook.

I think this should make the behavior consistent?





