Public bug reported:

Discovered in an environment that was configured with
[libvirt] images_type = raw only; the other relevant options were at
their defaults (use_cow_images = True, force_raw_images = False).

Symptom: after cold migration the instances were unresponsive and not
running (e.g. no console log at all). Live migration works fine.

Workaround: setting use_cow_images=False and force_raw_images=True
solved the problem.

Reproduction on a current multinode devstack:

1. Configure computes as described above: set [libvirt]images_type = raw,
   leave the rest per default devstack / nova settings.
2. Create a raw image in Glance.
3. Boot an instance from that raw image.
4. Inspect the image on the file system: the image is in fact raw.
5. Cold-migrate the server.
6. Migration finishes successfully and the instance is reported as up and
   running on the new host, but in fact it has completely failed to
   start (not accessible, no console log, nothing).
7. If you check the image file nova uses on the new compute, it is now
   qcow2, not raw.
8. But the libvirt XML of the instance still defines the disk as raw!

  Oct 09 12:15:35 pshchelo-devstack-jammy nova-compute[427994]: <devices>
  Oct 09 12:15:35 pshchelo-devstack-jammy nova-compute[427994]: <disk type="file" device="disk">
  Oct 09 12:15:35 pshchelo-devstack-jammy nova-compute[427994]: <driver name="qemu" type="raw" cache="none"/>
  Oct 09 12:15:35 pshchelo-devstack-jammy nova-compute[427994]: <source file="/opt/stack/data/nova/instances/22749d77-83a1-4ae9-ade8-7bd9548406cd/disk"/>
  Oct 09 12:15:35 pshchelo-devstack-jammy nova-compute[427994]: <target dev="vda" bus="virtio"/>
  Oct 09 12:15:35 pshchelo-devstack-jammy nova-compute[427994]: </disk>

Stopping the instance and manually converting the disk back to raw
allows the instance to start properly.
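
The mismatch from steps 7 and 8 can be checked mechanically. Below is a minimal sketch, not part of nova: it compares the driver type declared in a domain XML with what the disk file actually looks like. The helper names (on_disk_format, xml_disk_formats, find_mismatches) are made up for illustration. The check is also simplified: any file that does not start with the qcow2 magic bytes is treated as raw.

```python
import xml.etree.ElementTree as ET

QCOW2_MAGIC = b"QFI\xfb"  # the first four bytes of a qcow2 image


def on_disk_format(path):
    """Return 'qcow2' if the file starts with the qcow2 magic, else 'raw'.

    Simplification for this sketch: anything that is not qcow2 counts as raw.
    """
    with open(path, "rb") as f:
        return "qcow2" if f.read(4) == QCOW2_MAGIC else "raw"


def xml_disk_formats(domain_xml):
    """Map each file-backed disk's source path to its declared driver type."""
    root = ET.fromstring(domain_xml)
    formats = {}
    for disk in root.iter("disk"):
        if disk.get("type") != "file":
            continue  # skip block/network disks, they have no source file
        driver = disk.find("driver")
        source = disk.find("source")
        if driver is not None and source is not None:
            formats[source.get("file")] = driver.get("type")
    return formats


def find_mismatches(domain_xml):
    """Return (path, declared, actual) for disks whose XML type is wrong."""
    mismatches = []
    for path, declared in xml_disk_formats(domain_xml).items():
        actual = on_disk_format(path)
        if declared != actual:
            mismatches.append((path, declared, actual))
    return mismatches
```

Feeding it the output of `virsh dumpxml <domain>` on the destination compute should, for the instance above, report the disk path with declared type raw and actual type qcow2. For anything beyond a quick check, `qemu-img info` is the more reliable way to determine the on-disk format.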

I tracked it down to this place in the finish_migration method:
https://opendev.org/openstack/nova/src/branch/stable/2023.2/nova/virt/libvirt/driver.py#L11739

    if (disk_name != 'disk.config' and
            info['type'] == 'raw' and CONF.use_cow_images):
        self._disk_raw_to_qcow2(info['path'])

Effectively, nova changes the disk format but does not update the XML
to reflect the new format, and thus the instance fails to start.

** Affects: nova
     Importance: Undecided
         Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/2038898

Title:
  image format change during migration is not reflected in libvirt XML

Status in OpenStack Compute (nova):
  New