** Description changed:

  [ Impact ]
  
- TBD.
+ Virtualization users who:
+ 
+ - Have a Noble VM on top of an Intel bare metal machine, and
+ - Create a nested VM (guest) inside this Noble VM (host), and
+ - Try to migrate this nested VM (guest) to another Noble VM (host) running on the same bare metal machine or a similar one, and
+ - Use a "migratable XML" (as generated by "virsh dumpxml --migratable") as virsh's "--xml" and "--persistent-xml" arguments
+ 
+ might encounter issues which prevent the migration from starting.  These
+ issues are related to CPU feature checks performed by libvirt, more
+ specifically features related to "vmx*", which unfortunately have been a
+ known source of problems in migration scenarios under libvirt.
+ 
+ This bug also affects users who created the migratable XML file under
+ Noble and are now trying to use it with the libvirt shipped in Oracular.
  
  [ Test Plan ]
  
  Even though this problem happens only when using nested VMs with Intel
  CPUs, it is still recommended to perform the following tests on a bare
  metal machine also with an Intel CPU.  In theory it should be possible
  to reproduce this on a host using an AMD CPU, but you'd have to
  explicitly tell LXD to create VMs with Intel CPUs.
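
Before starting, it can help to confirm that VMX is actually visible, both on the bare metal machine and later inside the Noble VMs. The following is a minimal sketch, assuming a Linux system that lists CPU flags in /proc/cpuinfo; has_vmx is an illustrative helper name and not part of the original test plan.

```shell
# Hypothetical helper: succeed if the "flags" line of a cpuinfo file
# contains the "vmx" flag. Defaults to /proc/cpuinfo; another path can
# be passed as the first argument (useful for testing).
has_vmx() {
    cpuinfo="${1:-/proc/cpuinfo}"
    # Keep only the "flags : ..." lines, then look for vmx as a whole word.
    grep -E '^flags[[:space:]]*:' "$cpuinfo" | grep -qw 'vmx'
}

# Possible usage (not run here):
#   has_vmx && echo "VMX is exposed" || echo "VMX is not exposed"
```

If the check fails inside a Noble VM, nested virtualization is probably not available there and the reproduction below is unlikely to work.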
  
  Credits to Guillaume Boutry for providing scripts automating most of the
  reproduction steps.
  
  Let's create two Noble VMs using LXD:
  
  $ lxc launch ubuntu:noble --vm --config limits.cpu=4 --config limits.memory=8GiB -d root,size=80GiB libvirt-1
  $ lxc launch ubuntu:noble --vm --config limits.cpu=4 --config limits.memory=8GiB -d root,size=80GiB libvirt-2
  
  You will need to generate an SSH keypair for the "ubuntu" user on
  libvirt-1 and install the public key on libvirt-2 so that "ssh
  libvirt-2.lxd" works.  The rest of this test plan assumes you have done
  that.
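
One way to do this is sketched below. It assumes the default "ubuntu" user exists on both VMs. append_key is a hypothetical helper that is not part of the original test plan, and the ssh-keygen and lxc commands in the comments are only one possible approach.

```shell
# Hypothetical helper: add a public key to an authorized_keys file
# unless that exact line is already present.
append_key() {
    pubkey_file="$1"
    auth_file="$2"
    auth_dir=$(dirname "$auth_file")
    mkdir -p "$auth_dir"
    chmod 700 "$auth_dir"
    touch "$auth_file"
    chmod 600 "$auth_file"
    # -x: whole-line match, -F: fixed string, -f: read pattern from file
    grep -qxFf "$pubkey_file" "$auth_file" || cat "$pubkey_file" >> "$auth_file"
}

# Possible usage (not run here):
#   on libvirt-1, as ubuntu:
#     ssh-keygen -t ed25519 -N "" -f ~/.ssh/id_ed25519
#   copy ~/.ssh/id_ed25519.pub to libvirt-2, for example with
#   "lxc file pull" and "lxc file push" from the machine running LXD
#   on libvirt-2, as ubuntu:
#     append_key id_ed25519.pub ~/.ssh/authorized_keys
```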
  
  Inside libvirt-1:
  
  # apt update
  # apt install -y libvirt-daemon-system uuid
  # echo "host_uuid = \"00000000-0000-0000-0000-$(printf "%012x" "${RANDOM}")\"" >> /etc/libvirt/libvirtd.conf
  # systemctl restart libvirtd.service
  # su - ubuntu
  $ cd /tmp
  $ wget http://cloud-images.ubuntu.com/noble/current/noble-server-cloudimg-amd64.img
  $ sudo chown libvirt-qemu:kvm noble-server-cloudimg-amd64.img
  $ cd
  $ cat > domain.xml << _EOF_
  <domain type="kvm">
    <uuid>$(uuidgen)</uuid>
    <name>test-domain</name>
    <memory>1048576</memory>
    <vcpu>2</vcpu>
    <os>
      <type arch="x86_64" machine="pc">hvm</type>
      <boot dev="hd"/>
    </os>
    <features>
      <acpi/>
      <apic/>
      <vmcoreinfo/>
    </features>
    <clock offset="utc">
      <timer name="pit" tickpolicy="delay"/>
      <timer name="rtc" tickpolicy="catchup"/>
      <timer name="hpet" present="no"/>
    </clock>
    <cpu mode="host-model" match="exact">
      <topology sockets="2" cores="1" threads="1"/>
    </cpu>
    <devices>
      <disk type="file" device="disk">
        <driver name="qemu" type="qcow2" cache="none"/>
        <source file="/tmp/noble-server-cloudimg-amd64.img"/>
        <target dev="vda" bus="virtio"/>
      </disk>
      <video>
        <model type="qxl"/>
      </video>
      <rng model="virtio">
        <backend model="random">/dev/urandom</backend>
      </rng>
      <controller type="usb" index="0" model="none"/>
      <memballoon model="virtio">
        <stats period="10"/>
      </memballoon>
    </devices>
  </domain>
  _EOF_
  $ virsh define domain.xml
  $ virsh start test-domain
  $ virsh dumpxml --migratable test-domain > migratable.xml
  
  Inside libvirt-2:
  
  # apt update
  # apt install -y libvirt-daemon-system
  # cd /tmp
  # wget http://cloud-images.ubuntu.com/noble/current/noble-server-cloudimg-amd64.img
  # chown libvirt-qemu:kvm noble-server-cloudimg-amd64.img
  # cd
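
Before migrating, it may be worth checking that the two hosts report different host UUIDs. libvirt is understood to reject a migration when source and destination report the same UUID, which is presumably why host_uuid is set explicitly on libvirt-1. A small sketch, assuming the host UUID appears in a <uuid> element of the "virsh capabilities" output; host_uuid_from_caps is an illustrative name.

```shell
# Hypothetical helper: print the first <uuid> value found in
# capabilities XML read from standard input.
host_uuid_from_caps() {
    sed -n 's:.*<uuid>\([^<]*\)</uuid>.*:\1:p' | head -n 1
}

# Possible usage on libvirt-1 (not run here):
#   virsh -c qemu:///system capabilities | host_uuid_from_caps
#   ssh libvirt-2.lxd virsh -c qemu:///system capabilities | host_uuid_from_caps
```

The two commands should print different values.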
  
  Now, back to libvirt-1, we are ready to test the migration:
  
  $ virsh migrate test-domain qemu+ssh://libvirt-2.lxd/system --live \
      --persistent --undefinesource --copy-storage-inc --migrate-disks vda \
      --persistent-xml migratable.xml --xml migratable.xml
  
  On Noble, you should see the following error:
  
  error: unsupported configuration: Target CPU feature count 28 does not
  match source 96
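
To get a rough idea of where the two counts come from, the CPU feature elements in the XML files can be counted. The sketch below assumes features appear as <feature ... name='...'/> elements, as in libvirt's domain XML. The function names are illustrative, and the results may not match libvirt's internal counts exactly.

```shell
# Hypothetical helpers: count CPU <feature .../> elements in a domain
# XML file. The trailing space in the pattern keeps the <features>
# block (acpi, apic, ...) from being counted.
count_cpu_features() {
    grep -c '<feature ' "$1"
}

# Count only the features whose name starts with "vmx".
count_vmx_features() {
    grep -c "<feature .*name=['\"]vmx" "$1"
}

# Possible usage on libvirt-1 (not run here):
#   virsh dumpxml test-domain > live.xml
#   count_cpu_features live.xml
#   count_cpu_features migratable.xml
#   count_vmx_features live.xml
```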
  
  [ Where problems could occur ]
  
- TBD.
+ As described below (in the "Other information" section), this SRU is
+ different between Noble and Oracular.
+ 
+ For Noble, the chances of regression are higher because it involves
+ updating a sizeable patch series before actually backporting the patches
+ to fix the bug.  Feature-wise, this update should not change anything,
+ and a review has been performed to make sure that, to the best of our
+ knowledge, no user-facing changes are introduced.
+ 
+ For Oracular, all that was needed was to backport the patches
+ that fix the issue.
+ 
+ The patches themselves are not complex and have been part of RHEL's
+ libvirt for a while now, without any regressions.  There is always the
+ possibility that some unwanted regression is introduced, but our
+ internal migration testsuite has not caught any problems.
+ 
+ [ Other information ]
+ 
+ For Noble, this SRU involves two steps:
+ 
+ 1) Updating an existing patch series (which was backported in order to
+ fix bug #2051754).  This is needed because the patch series was
+ backported directly from the patches posted to the upstream mailing list.
+ The series has since been accepted and pushed to the upstream git
+ repository, and although it is exactly the same feature-wise, there were
+ some minor cosmetic changes to function names which can affect future
+ backports that touch the same code (as is the case here).
+ 
+ 2) Actually backporting the patches that fix the issue.
+ 
+ Oracular was simpler because the patchset from step (1) was already
+ present at release time.
  
  [ Original Description ]
  
  This issue is reproduced consistently with the snap-openstack-hypervisor
  built from
  https://git.launchpad.net/ubuntu/+source/libvirt@ubuntu/noble-updates
  (with patches applied).
  
  When creating a nova instance, live migrating between two hosts always fails because of:
  error: unsupported configuration: Target CPU feature count 44 does not match source 109
  
  Command that reproduces a Nova migration using libvirt client (and
  reproduces the same error):
  
  virsh migrate instance-00000002 qemu+tls://juju-596fd1-1.lxd/system \
      --live --p2p --persistent --undefinesource --copy-storage-inc \
      --migrate-disks vda --xml migratable.xml --persistent-xml migratable.xml \
      --bandwidth 0
  
  Attached to this bug you will find:
  - instance.xml: domain dumped through virsh
  - migratable.xml: domain dumped through virsh using --migratable (same flags as nova updated xml)
  - libvirtd.log: libvirt daemon debug logs showcasing why it refused to migrate
  
  As you can see in libvirtd.log, the method
  virDomainDefCheckABIStabilityFlags fails because the source has 65
  additional VMX features that are not found on the destination.
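
The figure of 65 is the difference between the two counts in the error message (109 - 44). A small sketch that extracts this gap from such a message follows; feature_gap is an illustrative helper name.

```shell
# Hypothetical helper: given a libvirt "feature count" error message,
# print the source count minus the target count.
feature_gap() {
    # Pull the two integers out of the message as "target source".
    set -- $(printf '%s\n' "$1" | sed -n \
        's/.*feature count \([0-9][0-9]*\) does not match source \([0-9][0-9]*\).*/\1 \2/p')
    # Fail if the message did not have the expected shape.
    [ "$#" -eq 2 ] || return 1
    echo $(( $2 - $1 ))
}

# Possible usage:
#   feature_gap "Target CPU feature count 44 does not match source 109"
```

Applied to the message from the test plan (target 28, source 96), the same calculation gives a gap of 68.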
  
  (Both hypervisors are hosted on LXD VMs on the same physical machine,
  i.e. with the same CPU flags)

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/2083986

Title:
  Live migration fails because VMX features are missing on target cpu
  definition

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/libvirt/+bug/2083986/+subscriptions


-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
