Public bug reported:

Discovered in an environment that was configured with

[libvirt]
images_type = raw

only; the other relevant options were at their defaults (use_cow_images
= True, force_raw_images = False).

Symptom - the instances were unresponsive and not running after cold
migration (e.g. no console log at all). Live migration works fine.
Workaround - setting use_cow_images=False and force_raw_images=True solved the
problem.

Reproduction on a current multinode devstack:

1. Configure computes as described above - set [libvirt]images_type = raw, 
leave the rest per default devstack / nova settings.
2. Create a raw image in Glance.
3. Boot an instance from that raw image.
4. Inspect the image on the file system - the image is in fact raw.
5. Cold-migrate the server.
6. Migration finishes successfully, instance is reported as up and running on 
the new host - but in fact it has completely failed to start (not accessible, 
no console log, nothing).
7. If you check the image file nova uses on the new compute - it is now qcow2, 
not raw.
8. But the libvirt XML of the instance still defines the disk as raw!

Oct 09 12:15:35 pshchelo-devstack-jammy nova-compute[427994]:   <devices>
Oct 09 12:15:35 pshchelo-devstack-jammy nova-compute[427994]:     <disk type="file" device="disk">
Oct 09 12:15:35 pshchelo-devstack-jammy nova-compute[427994]:       <driver name="qemu" type="raw" cache="none"/>
Oct 09 12:15:35 pshchelo-devstack-jammy nova-compute[427994]:       <source file="/opt/stack/data/nova/instances/22749d77-83a1-4ae9-ade8-7bd9548406cd/disk"/>
Oct 09 12:15:35 pshchelo-devstack-jammy nova-compute[427994]:       <target dev="vda" bus="virtio"/>
Oct 09 12:15:35 pshchelo-devstack-jammy nova-compute[427994]:     </disk>
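
The mismatch described in steps 7 and 8 can be checked with a short
script. The following is a minimal sketch, not part of nova: the function
names are made up for illustration, it assumes qemu-img is installed on
the compute host, and it expects the domain XML as a string obtained
separately (e.g. from virsh dumpxml).

```python
import json
import subprocess
import xml.etree.ElementTree as ET


def xml_disk_formats(domain_xml):
    """Map each file-backed disk's source path to the driver type in the XML."""
    root = ET.fromstring(domain_xml)
    formats = {}
    for disk in root.findall("./devices/disk[@type='file']"):
        source = disk.find("source")
        driver = disk.find("driver")
        if source is None or driver is None:
            continue
        formats[source.get("file")] = driver.get("type")
    return formats


def on_disk_format(path):
    """Ask qemu-img which format the file actually has."""
    out = subprocess.check_output(
        ["qemu-img", "info", "--output=json", path])
    return json.loads(out)["format"]


def find_mismatches(domain_xml, probe=on_disk_format):
    """Return (path, declared, actual) for disks whose XML type is wrong.

    `probe` is injectable so the comparison can be exercised without
    qemu-img being present.
    """
    mismatches = []
    for path, declared in xml_disk_formats(domain_xml).items():
        actual = probe(path)
        if actual != declared:
            mismatches.append((path, declared, actual))
    return mismatches
```

For the instance above, a non-empty result for the .../disk path
(declared raw, actually qcow2) would confirm the problem.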

Stopping the instance and manually converting the disk back to raw
allows the instance to start properly.
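
The manual conversion can be sketched as below. This is only an
illustration under assumptions: the helper names and the temporary file
suffix are hypothetical, the instance must already be stopped, and file
ownership and permissions on the resulting disk may need to be restored
by hand afterwards.

```python
import os
import subprocess


def convert_cmd(src, dst):
    """Build the qemu-img command line for a qcow2 -> raw conversion."""
    return ["qemu-img", "convert", "-f", "qcow2", "-O", "raw", src, dst]


def convert_back_to_raw(disk_path, run=subprocess.check_call):
    """Convert the disk to raw via a temporary file, then swap it in place.

    `run` is injectable so the flow can be tested without qemu-img.
    """
    tmp_path = disk_path + ".raw.tmp"  # hypothetical temporary name
    run(convert_cmd(disk_path, tmp_path))
    # Replace the qcow2 file only after the conversion has succeeded.
    os.replace(tmp_path, disk_path)
```

Writing to a temporary file first keeps the original qcow2 disk intact
if the conversion fails part-way.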

I tracked it down to this place in the finish_migration method:

https://opendev.org/openstack/nova/src/branch/stable/2023.2/nova/virt/libvirt/driver.py#L11739

            if (disk_name != 'disk.config' and
                        info['type'] == 'raw' and CONF.use_cow_images):
                self._disk_raw_to_qcow2(info['path'])

Effectively, nova changes the disk format but does not update the XML
to reflect the actual new disk format, and thus the instance fails to
start.
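
To make the decision logic easier to see, here is a small standalone
sketch. The first function mirrors the quoted condition with plain
arguments instead of nova's CONF and info objects. The second is a
purely hypothetical variant that also looks at [libvirt]images_type; it
is an assumption for illustration and not the actual nova fix.

```python
def converts_to_qcow2(disk_name, disk_type, use_cow_images):
    """Mirror of the condition quoted from finish_migration."""
    return (disk_name != 'disk.config' and
            disk_type == 'raw' and use_cow_images)


def converts_to_qcow2_guarded(disk_name, disk_type, use_cow_images,
                              images_type):
    """Hypothetical variant: never convert when images_type is 'raw'."""
    return (converts_to_qcow2(disk_name, disk_type, use_cow_images)
            and images_type != 'raw')


# Configuration from this report: images_type = raw, use_cow_images = True.
print(converts_to_qcow2('disk', 'raw', True))
print(converts_to_qcow2_guarded('disk', 'raw', True, 'raw'))
```

With the reported configuration the quoted condition is true, so the
raw disk is converted even though images_type asks for raw, while the
XML keeps describing the disk as raw.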

** Affects: nova
     Importance: Undecided
         Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/2038898

Title:
  image format change during migration is not reflected in libvirt XML

Status in OpenStack Compute (nova):
  New
