In working on a recent Nova migration bug

  https://bugs.launchpad.net/nova/+bug/1414065
I had cause to refactor the way the nova libvirt driver monitors live
migration completion/failure/progress. This refactor has opened the door
for doing more intelligent active management of the live migration process.

As it stands today, we launch live migration, with a possible bandwidth
limit applied, and just pray that it succeeds eventually. It might take
until the end of the universe and we'll happily wait that long. This is
pretty dumb really and I think we really ought to do better. The problem
is that I'm not really sure what "better" should mean, except for ensuring
it doesn't run forever.

As a demo, I pushed a quick proof of concept showing how we could easily
just abort live migration after, say, 10 minutes

  https://review.openstack.org/#/c/151665/

There are a number of possible things to consider though...

First, how to detect when live migration isn't going to succeed:

 - Could do a crude timeout, eg allow 10 minutes to succeed or else.

 - Look at the data transfer stats (memory transferred, memory remaining
   to transfer, disk transferred, disk remaining to transfer) to determine
   if it is making forward progress.

 - Leave it up to the admin / user to decide if it has gone on long enough.

The first is easy, while the second is harder but probably more reliable
and useful for users (the first sketch at the end of this mail shows
roughly what combining the two could look like).

Second is a question of what to do when it looks to be failing:

 - Cancel the migration - leave it running on the source. Not good if
   the admin is trying to evacuate a host.

 - Pause the VM - make it complete as a non-live migration. Not good if
   the guest workload doesn't like being paused.

 - Increase the bandwidth permitted. There is a built-in rate limit in
   QEMU, overridable via nova.conf. Could argue that the admin should
   just set their desired limit in nova.conf and be done with it, but
   perhaps there's a case for increasing it in special circumstances.
   eg for an emergency evacuation of a host it is better to waste
   bandwidth & complete the job, but for non-urgent scenarios better to
   limit bandwidth & accept failure ?

 - Increase the maximum downtime permitted. This is the small time window
   when the guest switches from source to dest. Too small and it'll never
   switch, too large and it'll suffer unacceptable interruption.

We could do some of these things automatically based on some policy, or
leave them up to the cloud admin / tenant user via new APIs (the second
sketch below shows the libvirt calls each of these maps onto).

Third, there's the question of other QEMU features we could make use of
to stop problems in the first place (third sketch below):

 - Auto-converge flag - if you set this, QEMU throttles back the CPUs so
   the guest cannot dirty RAM pages as quickly. This is nicer than pausing
   the CPUs altogether, but could still be an issue for guests which have
   strong performance requirements.

 - Page compression flag - if you set this, QEMU compresses pages to
   reduce the data that has to be sent. This is basically trading off
   network bandwidth vs CPU burn. Probably a win unless you are already
   highly overcommitted on CPU on the host.

Fourth, there's a question of whether we should give the tenant user or
cloud admin further APIs for influencing migration:

 - Add an explicit API for cancelling migration ?

 - Add APIs for setting tunables like downtime, bandwidth on the fly ?

 - Or drive some of the tunables like downtime, bandwidth, or policies
   like cancel vs pause from flavour or image metadata properties ?
   (the last sketch below is a strawman of what that could look like)

 - Allow operations like evacuate to specify a live migration policy,
   eg switch to non-live migration after 5 minutes ?
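To make the detection side concrete, here's a very rough sketch of combining
a crude timeout with a forward-progress check, written straight against the
libvirt python bindings rather than as a Nova patch. The thresholds, the use
of jobStats(), and the "abort on stall" behaviour are just assumptions for
illustration - it is not what the PoC review above does:

    import time

    import libvirt

    def watch_migration(dom, max_runtime=600, stall_limit=60):
        """Poll a migrating domain; abort the job if it stalls or overruns.

        max_runtime / stall_limit are illustrative values, not existing
        Nova tunables. Needs a libvirt new enough to provide jobStats().
        """
        start = time.time()
        last_remaining = None
        last_progress = start

        while True:
            stats = dom.jobStats()

            # No active job any more - the migration finished (or failed)
            if stats.get("type",
                         libvirt.VIR_DOMAIN_JOB_NONE) == libvirt.VIR_DOMAIN_JOB_NONE:
                return True

            remaining = (stats.get("memory_remaining", 0) +
                         stats.get("disk_remaining", 0))

            # Any drop in outstanding data counts as forward progress
            if last_remaining is None or remaining < last_remaining:
                last_remaining = remaining
                last_progress = time.time()

            now = time.time()
            if now - start > max_runtime or now - last_progress > stall_limit:
                dom.abortJob()   # cancel; the guest keeps running on the source
                return False

            time.sleep(5)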
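For the second list, the individual knobs all already exist at the libvirt
level, so the Nova side is mostly a question of policy and API. Roughly
speaking (the policy names and numbers here are made up for illustration,
they are not existing Nova config):

    def apply_failing_migration_policy(dom, policy):
        """Illustrative only - none of these policy names exist in Nova."""
        if policy == "cancel":
            # Abandon the migration; the guest carries on on the source host
            dom.abortJob()
        elif policy == "pause":
            # Stop the guest dirtying RAM, so the remaining data transfers
            # and it completes as a non-live migration
            dom.suspend()
        elif policy == "boost-bandwidth":
            # Raise QEMU's transfer rate cap (value is in MiB/s)
            dom.migrateSetMaxSpeed(500, 0)
        elif policy == "relax-downtime":
            # Permit a longer switch-over window (value is in milliseconds)
            dom.migrateSetMaxDowntime(500, 0)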
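And the third list is just a matter of extra migration flags, assuming
libvirt & QEMU versions new enough to support them (VIR_MIGRATE_COMPRESSED
and VIR_MIGRATE_AUTO_CONVERGE are relatively recent additions). In Nova
terms they'd presumably be candidates for the existing live_migration_flag
setting in nova.conf rather than being hardcoded, but the shape of it is:

    import libvirt

    def live_migrate(dom, dest_uri):
        """Start a live migration with the extra QEMU features enabled.

        dest_uri is the destination libvirtd URI, eg qemu+tcp://dest/system
        """
        flags = (libvirt.VIR_MIGRATE_LIVE |
                 libvirt.VIR_MIGRATE_PEER2PEER |
                 libvirt.VIR_MIGRATE_UNDEFINE_SOURCE |
                 # Throttle vCPUs if the guest dirties RAM faster than we
                 # can transfer it
                 libvirt.VIR_MIGRATE_AUTO_CONVERGE |
                 # Compress pages before sending: trades CPU for bandwidth
                 libvirt.VIR_MIGRATE_COMPRESSED)

        dom.migrateToURI2(dest_uri, None, None, flags, None, 0)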
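Finally, if we did drive any of this from flavour or image metadata, I'd
imagine something along these lines on the compute side. The property names
are entirely invented, purely to show the shape of the idea:

    # Hypothetical flavour extra_specs / image properties - none of these
    # exist in Nova today, they are purely a strawman
    MIGRATION_DEFAULTS = {
        "migration:max_downtime_ms": 500,      # switch-over window
        "migration:max_bandwidth_mbs": 0,      # 0 == leave QEMU's default cap
        "migration:timeout_action": "cancel",  # or "pause"
        "migration:max_runtime_secs": 600,     # act after this long
    }

    def migration_tunables(flavor_extra_specs, image_properties):
        """Merge per-flavour and per-image overrides over the defaults."""
        tunables = dict(MIGRATION_DEFAULTS)
        for overrides in (flavor_extra_specs, image_properties):
            for key in MIGRATION_DEFAULTS:
                if key in overrides:
                    tunables[key] = overrides[key]
        return tunables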
The current code is so crude and there's a hell of a lot of options we
can take. I'm just not sure which is the best direction for us to go in.
What kind of things would be the biggest win from operators' or tenants'
POV ?

Regards,
Daniel

-- 
|: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org -o- http://virt-manager.org :|
|: http://autobuild.org -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|