Quoting Fiona Ebner (2025-12-10 13:19:11)
> As reported in the community forum [0] and then later by Thomas,
> who provided the relevant system logs, parallel migration with
> '--with-conntrack-state' of multiple VMs may currently lead to a
> crash upon handover:
>
> > kvm: Unknown savevm section or instance 'dbus-vmstate/dbus-vmstate' 0.
> > Make sure that your current VM setup matches your saved VM setup,
> > including any hotplugged devices
> > kvm: load of migration failed: Invalid argument
>
> In particular, the following sequence (on my test node)
>
> pvesh create /nodes/pve9a1/qemu/104/dbus-vmstate --action start
> pvesh create /nodes/pve9a1/qemu/105/dbus-vmstate --action start
> pvesh create /nodes/pve9a1/qemu/105/dbus-vmstate --action stop
>
> results in the wrong service being shut down (note the unexpected ID
> in the last line!):
>
> Dec 10 10:07:40 pve9a1 pvesh[30453]: starting dbus-vmstate helper for VM 104
> Dec 10 10:07:40 pve9a1 systemd[1]: Starting [email protected] - PVE DBus VMState Helper (VM 104)...
> Dec 10 10:07:41 pve9a1 dbus-vmstate[30456]: pve-vmstate-104 listening on :1.55
> Dec 10 10:07:41 pve9a1 systemd[1]: Started [email protected] - PVE DBus VMState Helper (VM 104).
> Dec 10 10:07:44 pve9a1 pvesh[30511]: starting dbus-vmstate helper for VM 105
> Dec 10 10:07:44 pve9a1 systemd[1]: Starting [email protected] - PVE DBus VMState Helper (VM 105)...
> Dec 10 10:07:45 pve9a1 dbus-vmstate[30573]: pve-vmstate-105 listening on :1.58
> Dec 10 10:07:45 pve9a1 systemd[1]: Started [email protected] - PVE DBus VMState Helper (VM 105).
> Dec 10 10:07:48 pve9a1 pvesh[30595]: stopping dbus-vmstate helper for VM 105
> Dec 10 10:07:48 pve9a1 dbus-vmstate[30456]: shutting down gracefully ..
> Dec 10 10:07:48 pve9a1 systemd[1]: [email protected]: Deactivated successfully.
>
> So the dbus-vmstate object is removed from the wrong VM before loading
> the migration state. Note that the crash is still racy, because if the
> dbus-vmstate is removed on the source side for the same wrong VM before
> the migration handover, the QEMU objects for both instances will still
> match.
>
> To fix the issue, introduce a dbus_call_method() helper similar to the
> already existing dbus_get_property() one. Like this, the owner is
> respected even if there are multiple (queued) owners on the DBus.
>
> [0]: https://forum.proxmox.com/threads/176821/post-820775
>
> Reported-by: Thomas Lamprecht <[email protected]>
> Signed-off-by: Fiona Ebner <[email protected]>
Reviewed-by: Fabian Grünbichler <[email protected]>

dbus_get_property can now effectively become a thin wrapper around the
new helper...

> ---
>
> Changes in v2:
> * Introduce a helper for calling dbus methods which respects the owner
>
>  src/PVE/QemuServer/DBusVMState.pm | 27 ++++++++++++++++++++++++++-
>  1 file changed, 26 insertions(+), 1 deletion(-)
>
> diff --git a/src/PVE/QemuServer/DBusVMState.pm b/src/PVE/QemuServer/DBusVMState.pm
> index a72d6dd2..f1766035 100644
> --- a/src/PVE/QemuServer/DBusVMState.pm
> +++ b/src/PVE/QemuServer/DBusVMState.pm
> @@ -39,6 +39,30 @@ my sub dbus_get_property {
>      return $reply[0];
>  }
>
> +# Call a method for an object from a specific interface name.
> +# In contrast to calling the method directly by using $obj->Method(), this
> +# actually respects the owner of the object and thus can be used for interfaces
> +# which might have multiple (queued) owners on the DBus.
> +my sub dbus_call_method {
> +    my ($obj, $interface, $method, $params, $timeout) = @_;
> +
> +    $timeout = 10 if !$timeout;
> +
> +    my $con = $obj->{service}->get_bus()->get_connection();
> +
> +    my $call = $con->make_method_call_message(
> +        $obj->{service}->get_service_name(),
> +        $obj->{object_path},
> +        $interface,
> +        $method,
> +    );
> +
> +    $call->set_destination($obj->get_service()->get_owner_name());
> +    $call->append_args_list($params->@*) if $params;
> +
> +    return $con->send_with_reply_and_block($call, $timeout * 1000)->get_args_list();
> +}
> +
>  # Starts the dbus-vmstate helper D-Bus service daemon and adds the needed
>  # object to the appropriate QEMU instance for the specified VM.
>  sub qemu_add_dbus_vmstate {
> @@ -114,7 +138,8 @@ sub qemu_del_dbus_vmstate {
>      $num_entries = eval {
>          dbus_get_property($object, 'com.proxmox.VMStateHelper', 'NumMigratedEntries');
>      };
> -    eval { $object->Quit() };
> +    # Quit() does QMP object-del which has a timeout of 60 seconds
> +    eval { dbus_call_method($object, 'com.proxmox.VMStateHelper', 'Quit', [], 70); };
>      if (my $err = $@) {
>          syslog('warn', "failed to call quit on dbus-vmstate for VM $vmid: $err\n")
>              if !$params{quiet};
> --
> 2.47.3
>
>
> _______________________________________________
> pve-devel mailing list
> [email protected]
> https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
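
For illustration, the thin-wrapper idea mentioned at the top of this mail
could look roughly like the sketch below. It assumes that dbus_get_property
keeps its ($obj, $interface, $name) signature, as seen at the call site in
qemu_del_dbus_vmstate, and that a property read goes through the standard
Get method of org.freedesktop.DBus.Properties. The dbus_call_method shown
here is only a stand-in that records its arguments so the snippet runs
without a bus; in DBusVMState.pm it would be the helper from the patch.

```perl
use v5.26;
use warnings;

# Stand-in for the dbus_call_method() helper from the quoted patch (an
# assumption made so this sketch runs without a D-Bus connection): it
# records what it was asked to call and returns a canned reply list.
my @calls;
my sub dbus_call_method {
    my ($obj, $interface, $method, $params, $timeout) = @_;
    push @calls, [$interface, $method, $params ? [$params->@*] : []];
    return (42);
}

# dbus_get_property() expressed through the helper: reading a property is a
# call to Get(interface, name) on org.freedesktop.DBus.Properties.
my sub dbus_get_property {
    my ($obj, $interface, $name) = @_;

    my @reply = dbus_call_method(
        $obj, 'org.freedesktop.DBus.Properties', 'Get', [$interface, $name],
    );
    return $reply[0];
}

# Same call shape as in qemu_del_dbus_vmstate; the empty hash is a dummy
# object, since the stand-in never touches it.
my $num = dbus_get_property({}, 'com.proxmox.VMStateHelper', 'NumMigratedEntries');
say "NumMigratedEntries = $num";
```

Since lexical subs are only visible after their declaration, the helper
would have to be placed above dbus_get_property for this to compile. The
wrapper also relies on the helper's default timeout of 10 seconds.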
