https://review.opendev.org/c/openstack/nova/+/860739 was abandoned because stable/train was moved to unmaintained/train.
** Changed in: nova/train
       Status: In Progress => Won't Fix

https://bugs.launchpad.net/bugs/1982284

Title:
  libvirt live migration sometimes fails with "libvirt.libvirtError:
  internal error: migration was active, but no RAM info was set"

Status in Ubuntu Cloud Archive: New
Status in Ubuntu Cloud Archive ussuri series: New
Status in Ubuntu Cloud Archive victoria series: New
Status in Ubuntu Cloud Archive wallaby series: New
Status in Ubuntu Cloud Archive xena series: Fix Released
Status in Ubuntu Cloud Archive yoga series: Fix Released
Status in Ubuntu Cloud Archive zed series: Fix Released
Status in OpenStack Compute (nova): Fix Released
Status in OpenStack Compute (nova) train series: Won't Fix
Status in OpenStack Compute (nova) ussuri series: Won't Fix
Status in OpenStack Compute (nova) victoria series: Won't Fix
Status in OpenStack Compute (nova) wallaby series: Won't Fix
Status in OpenStack Compute (nova) xena series: Fix Released
Status in OpenStack Compute (nova) yoga series: Fix Released
Status in OpenStack Compute (nova) zed series: Fix Released

Bug description:

  We have seen this downstream, where live migration randomly fails with
  the following error [1]:

    libvirt.libvirtError: internal error: migration was active, but no
    RAM info was set

  Discussion on [1] gravitated toward a possible race condition in qemu
  around the query-migrate command [2]. The query-migrate command is used
  (indirectly) by the libvirt driver while it monitors live migrations
  [3][4][5].

  While searching for information about this error, I found an older
  thread on libvir-list [6] where someone else hit the same error, and for
  them it happened when query-migrate was called *after* a live migration
  had completed. Based on this, it seemed possible that our live migration
  monitoring thread sometimes races and calls jobStats() after the
  migration has completed, so the error is raised and the migration is
  considered failed even though it actually succeeded. A patch has since
  been proposed and merged in qemu [7] to address the race.

  Meanwhile, on the nova side, we can mitigate this behavior by catching
  the specific error from libvirt and ignoring it, so that a live
  migration in this situation is treated as completed by the libvirt
  driver (see the sketch after the references below). Doing this would
  improve the experience for users who hit this error and get erroneous
  live migration failures.

  [1] https://bugzilla.redhat.com/show_bug.cgi?id=2074205
  [2] https://qemu.readthedocs.io/en/latest/interop/qemu-qmp-ref.html#qapidoc-1848
  [3] https://github.com/openstack/nova/blob/bcb96f362ab12e297f125daa5189fb66345b4976/nova/virt/libvirt/driver.py#L10123
  [4] https://github.com/openstack/nova/blob/bcb96f362ab12e297f125daa5189fb66345b4976/nova/virt/libvirt/guest.py#L655
  [5] https://libvirt.org/html/libvirt-libvirt-domain.html#virDomainGetJobStats
  [6] https://listman.redhat.com/archives/libvir-list/2021-January/213631.html
  [7] https://github.com/qemu/qemu/commit/552de79bfdd5e9e53847eb3c6d6e4cd898a4370e
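For illustration only, here is a minimal sketch of the mitigation described
above, written against the libvirt-python bindings. It is not the actual
nova change: the helper name get_job_stats_tolerant() is made up, the exact
hook point in the migration monitor is an assumption, and the dict returned
on the fallback path only mimics the 'type' field of jobStats().

    import libvirt

    # Error text libvirt reports when the query-migrate race is hit.
    _NO_RAM_INFO_MSG = 'migration was active, but no RAM info was set'

    def get_job_stats_tolerant(dom):
        """Hypothetical helper: fetch migration job stats, but treat the
        'no RAM info was set' internal error as a completed migration
        instead of propagating it as a failure."""
        try:
            return dom.jobStats()
        except libvirt.libvirtError as ex:
            if (ex.get_error_code() == libvirt.VIR_ERR_INTERNAL_ERROR
                    and _NO_RAM_INFO_MSG in (ex.get_error_message() or '')):
                # Assume the migration already finished and report it as
                # such, rather than failing the live migration.
                return {'type': libvirt.VIR_DOMAIN_JOB_COMPLETED}
            raise

A monitoring loop would call get_job_stats_tolerant(dom) where it would
otherwise call dom.jobStats() and treat a VIR_DOMAIN_JOB_COMPLETED result as
a successful migration; in nova the equivalent handling would sit in the
monitoring/job-info path referenced in [3] and [4].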