Public bug reported:

Description
===========
If an instance has had an attached volume volume-updated twice in a "round-trip" - i.e., volume-update $vol1 $vol2, then volume-update $vol2 $vol1 - it cannot be live-migrated.

Steps to reproduce
==================
1. Create two iscsi volumes.
   # cinder create --name test_vol1 --volume-type iscsi 1
   # cinder create --name test_vol2 --volume-type iscsi 1
   (--volume-type iscsi isn't mandatory - in my devstack environment there is no iscsi volume-type, but that doesn't stop me from reproducing this bug)
2. Boot an instance.
   # nova boot --flavor 1 --image $imageid --nic net-id=$netid testvm1
3. Attach one iscsi volume to testvm1.
   # nova volume-attach testvm1 $test_vol1
4. Do volume-update to swap the volume to the 2nd one. (1st volume-update)
   # nova volume-update testvm1 $test_vol1 $test_vol2
5. Do volume-update again to swap the volume back to the 1st one. (2nd volume-update)
   # nova volume-update testvm1 $test_vol2 $test_vol1
6. Live migrate the instance to another compute node.
   # nova live-migration testvm1

Expected result
===============
Live migration succeeds.

Actual result
=============
Live migration fails with:

Apr 27 10:32:14 multi9h-3 nova-compute: File "/usr/lib64/python2.7/site-packages/libvirt.py", line 1939, in migrateToURI3
Apr 27 10:32:14 multi9h-3 nova-compute: if ret == -1: raise libvirtError ('virDomainMigrateToURI3() failed', dom=self)
Apr 27 10:32:14 multi9h-3 nova-compute: libvirtError: missing source information for device vdb

Environment
===========
This was originally reported [1] in Red Hat OSP 9 (Mitaka) and is reproducible on devstack master as well.

Additional information
======================
There are two things going on here.

1. When performing the volume-update, the libvirt driver calls virDomainBlockRebase without the VIR_DOMAIN_BLOCK_REBASE_COPY_DEV flag [2], meaning the device XML changes from <source dev=/dev/iscsi/lun> to <source file=/dev/iscsi/lun>.
This is a problem because /dev/iscsi/lun isn't a regular file, and it causes the above error, but only after the "round-trip" volume-update. Why? Because:

2. The serial number isn't updated when doing volume-update, and there's a bit of live-migration code [3] that checks for serial numbers before updating the XML. If the serial numbers don't match, the XML isn't updated, and libvirt doesn't notice that /dev/iscsi/lun isn't a file.

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1446446
[2] http://libvirt.org/html/libvirt-libvirt-domain.html#virDomainBlockRebase
[3] https://github.com/openstack/nova/blob/master/nova/virt/libvirt/migration.py#L158

** Affects: nova
     Importance: Undecided
         Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1691195

Title:
  Can't live-migrate after "round-trip" volume-update

Status in OpenStack Compute (nova):
  New

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1691195/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to     : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp
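
For readers trying to follow why only the "round-trip" triggers the failure, the interaction between the two points under "Additional information" can be sketched as a toy model. This is not nova's code: the Disk class and every function name below are hypothetical, and the failure condition is an assumption derived from the description (the error surfaces only when the serial matches, so the XML gets updated, while the source is still recorded as a file).

```python
from dataclasses import dataclass


@dataclass
class Disk:
    serial: str       # volume ID written into the domain XML at attach time
    attached: str     # volume ID actually backing the disk right now
    source_attr: str  # "dev" (block device) or "file" (regular file)


def attach(volume_id: str) -> Disk:
    # A fresh attach records the volume ID as the serial and uses <source dev=...>.
    return Disk(serial=volume_id, attached=volume_id, source_attr="dev")


def volume_update(disk: Disk, new_volume_id: str) -> None:
    # Point 1: without VIR_DOMAIN_BLOCK_REBASE_COPY_DEV the XML ends up with
    # <source file=...>. Point 2: the serial is left unchanged.
    disk.attached = new_volume_id
    disk.source_attr = "file"


def migration_updates_xml(disk: Disk) -> bool:
    # Point 2: the live-migration code only updates the disk XML when the
    # serial in the XML matches the volume being migrated.
    return disk.serial == disk.attached


def live_migration_fails(disk: Disk) -> bool:
    # Assumption of this model: the error appears only when the XML is
    # updated while the source is still recorded as a regular file.
    return migration_updates_xml(disk) and disk.source_attr == "file"


def run_scenario(*updates: str) -> bool:
    # Attach vol1, apply the given volume-updates in order, then report
    # whether the modelled live migration would fail.
    disk = attach("vol1")
    for volume_id in updates:
        volume_update(disk, volume_id)
    return live_migration_fails(disk)


if __name__ == "__main__":
    print("no volume-update:", run_scenario())
    print("one volume-update:", run_scenario("vol2"))
    print("round-trip:", run_scenario("vol2", "vol1"))
```

In this model a single volume-update leaves the serial (vol1) out of step with the attached volume (vol2), so the XML is never touched and the bad source type stays hidden. Swapping back makes the serial match again, and only then does the problem show up.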