Apologies for the indirect quote; some of the earlier posts got deleted before I noticed the thread.

On 09/21/2015 03:43 AM, Koniszewski, Pawel wrote:
-----Original Message-----
From: Daniel P. Berrange [mailto:berra...@redhat.com]

There was a proposal in nova to allow the 'pause' operation to be invoked
while migration was happening. This would turn a live migration into a
coma-migration, thereby ensuring it succeeds. I can't remember if this
merged or not, as I can't find the review offhand, but it's important to
have this ASAP IMHO, as when evacuating VMs from a host, admins need a knob
they can use to force successful evacuation, even at the cost of pausing the
guest temporarily.

It's not strictly "live" migration, but for the same reason of pushing VMs off a host for maintenance it would be nice to have some way of migrating suspended instances. (As brought up in http://lists.openstack.org/pipermail/openstack-dev/2015-September/075042.html)
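
For reference, forcing convergence by pausing boils down to something like
the following at the libvirt level. This is only a rough sketch with
libvirt-python (not the nova patch in question); the instance name, URI and
polling interval are made up, and error handling is omitted.

import time
import libvirt

conn = libvirt.open('qemu:///system')
dom = conn.lookupByName('instance-00000042')   # hypothetical instance

# Watch the migration job; if the remaining data stops shrinking, pause
# the vCPUs so the guest stops dirtying memory and the copy can finish.
prev_remaining = None
while True:
    stats = dom.jobStats()
    remaining = stats.get('memory_remaining', 0)
    if remaining == 0:
        break                      # migration finished (or no job running)
    if prev_remaining is not None and remaining >= prev_remaining:
        dom.suspend()              # pause the guest; migration now converges
        break
    prev_remaining = remaining
    time.sleep(5)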

In libvirt upstream we now have the ability to filter which disks are
migrated during block migration. We need to leverage that new feature to
fix the long-standing problems of block migration when non-local images are
attached - e.g. cinder volumes. We definitely want this in Mitaka.

Agreed, this would be a very useful addition.
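
For anyone curious what the libvirt side of that looks like, it's roughly
the sketch below (assumes libvirt >= 1.2.17; the instance, destination and
device names are hypothetical). Only the local root disk is copied, while a
Cinder-backed volume attached as vdb is simply re-attached on the
destination rather than copied over the wire.

import libvirt

conn = libvirt.open('qemu:///system')
dom = conn.lookupByName('instance-00000042')

params = {
    # Copy only the local root disk; shared/cinder-backed disks are skipped.
    libvirt.VIR_MIGRATE_PARAM_MIGRATE_DISKS: ['vda'],
}
flags = (libvirt.VIR_MIGRATE_LIVE |
         libvirt.VIR_MIGRATE_PEER2PEER |
         libvirt.VIR_MIGRATE_NON_SHARED_INC)

dom.migrateToURI3('qemu+tcp://dest-host/system', params, flags)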

We should look at what we need to do to isolate the migration data network
from the main management network. Currently we live migrate over whatever
network is associated with the compute host's primary hostname / IP address.
This is not necessarily the fastest NIC on the host. We ought to be able
to record an alternative hostname / IP address against each compute host to
indicate the desired migration interface.

Yes, this would be good to have upstream. We've added this sort of thing locally (though with a hardcoded naming scheme) to allow migration over 10G links with management over 1G links.
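
As an illustration only (not our actual code), the idea at the libvirt
level is roughly the sketch below: derive a migration-specific hostname
from the compute host name via a hypothetical "-10g" naming convention and
hand it to libvirt as the migration data URI, while the control channel
still goes to the management address.

import libvirt

def migration_hostname(compute_host):
    # e.g. "compute02" -> "compute02-10g", resolving to the 10G interface
    return compute_host + '-10g'

conn = libvirt.open('qemu:///system')
dom = conn.lookupByName('instance-00000042')

dest = 'compute02'
params = {
    # Migration data stream connects to the 10G-side hostname.
    libvirt.VIR_MIGRATE_PARAM_URI: 'tcp://%s' % migration_hostname(dest),
}
dom.migrateToURI3('qemu+tcp://%s/system' % dest, params,
                  libvirt.VIR_MIGRATE_LIVE | libvirt.VIR_MIGRATE_PEER2PEER)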

There is also work on post-copy migration in QEMU. Normally with live
migration, the guest doesn't start executing on the target host until
migration has transferred all data. There are many workloads where that
doesn't work, as the guest is dirtying data too quickly. With post-copy you
can start running the guest on the target at any time, and when it faults
on a missing page it will be pulled from the source host. This is
slightly more fragile, as you risk losing the guest entirely if the source
host dies before migration finally completes, but it does guarantee that
migration will succeed no matter what workload is in the guest. This is
probably Nxxxx cycle material.

It seems to me that the ideal solution would be to start with pre-copy migration and then, if that doesn't converge within the specified downtime value, have the option to just cut over to the destination and do a post-copy migration of the remaining data.
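
Something like the sketch below is what I have in mind, though it assumes
a libvirt new enough to actually expose QEMU's post-copy support (a
post-copy migration flag plus a "start post-copy" call), which isn't there
today; the names, URI and timeout are made up.

import time
import threading
import libvirt

conn = libvirt.open('qemu:///system')
dom = conn.lookupByName('instance-00000042')

flags = (libvirt.VIR_MIGRATE_LIVE |
         libvirt.VIR_MIGRATE_PEER2PEER |
         libvirt.VIR_MIGRATE_POSTCOPY)   # allow switching to post-copy later

def migrate():
    # Starts as a normal pre-copy migration; blocks until it completes.
    dom.migrateToURI3('qemu+tcp://dest-host/system', {}, flags)

worker = threading.Thread(target=migrate)
worker.start()

deadline = time.time() + 60              # give pre-copy 60s to converge
while worker.is_alive():
    if time.time() > deadline:
        try:
            # Cut over: run the guest on the destination now and pull the
            # remaining pages on demand from the source.
            dom.migrateStartPostCopy(0)
        except libvirt.libvirtError:
            pass                          # migration may have just finished
        break
    time.sleep(2)
worker.join()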

Chris
