I'll second much of what Rob said: an API that indicated how many live-migrations (l-m) were in progress would be good. An API that reported the progress (and start time) of a given l-m would be great. An API to cancel a given l-m would also be great. I think this is preferable to an auto timeout (though it would give us the tools we need to implement an auto timeout ourselves.)
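
To make the auto timeout point concrete, here is a rough sketch of how an operator-side timeout could sit on top of such an API. Everything in it is hypothetical: the Migration record and the list_migrations / cancel_migration callables are stand-ins for whatever Nova might eventually expose, not existing calls.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Callable, Iterable, List


@dataclass
class Migration:
    """Hypothetical record that a 'list live-migrations' API might return."""
    instance_id: str
    started_at: datetime
    progress_pct: float  # 0-100, as a hypothetical progress API might report


def live_migration_count(migrations: Iterable[Migration]) -> int:
    """How many live-migrations are currently in flight."""
    return sum(1 for _ in migrations)


def find_overdue(migrations: Iterable[Migration],
                 now: datetime,
                 max_age: timedelta) -> List[Migration]:
    """Return the migrations that have been running longer than max_age."""
    return [m for m in migrations if now - m.started_at > max_age]


def enforce_timeout(list_migrations: Callable[[], Iterable[Migration]],
                    cancel_migration: Callable[[str], None],
                    now: datetime,
                    max_age: timedelta) -> List[str]:
    """Cancel every overdue migration and return the affected instance ids.

    list_migrations and cancel_migration are assumed wrappers around the
    (not yet existing) query and cancel APIs discussed in this thread.
    """
    cancelled = []
    for m in find_overdue(list_migrations(), now, max_age):
        cancel_migration(m.instance_id)
        cancelled.append(m.instance_id)
    return cancelled
```

A monitoring job could call enforce_timeout periodically with whatever max_age suits the site, which keeps the policy in the operator's hands rather than in a Nova config option.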

I like the idea of trying auto-convergence (and agree it should be a flavor feature and likely not the default.) I suspect this one needs some testing. It may be fine to automatically do this if it doesn't actually throttle the VM some 90-99% of the time. (Presumably this could also increase the max downtime between cutover as well as throttling the VM.)

Thanks Daniel/Rob,
-dave

fyi: I'm an operator/devel on the Time Warner Cable openstack cloud.

On Sun, Feb 1, 2015 at 12:24 PM, Robert Collins <robe...@robertcollins.net> wrote:

> On 31 January 2015 at 05:47, Daniel P. Berrange <berra...@redhat.com>
> wrote:
> > In working on a recent Nova migration bug
> >
> >   https://bugs.launchpad.net/nova/+bug/1414065
> >
> > I had cause to refactor the way the nova libvirt driver monitors live
> > migration completion/failure/progress. This refactor has opened the
> > door for doing more intelligent active management of the live migration
> > process.
> ...
> > What kind of things would be the biggest win from Operators' or tenants'
> > POV ?
>
> Awesome. Couple thoughts from my perspective. Firstly, there's a bunch
> of situation dependent tuning. One thing Crowbar does really nicely is
> that you specify the host layout in broad abstract terms - e.g. 'first
> 10G network link' and so on : some of your settings above like whether
> to compress page are going to be heavily dependent on the bandwidth
> available (I doubt that compression is a win on a 100G link for
> instance, and would be suspect at 10G even). So it would be nice if
> there was a single dial or two to set and Nova would auto-calculate
> good defaults from that (with appropriate overrides being available).
>
> Operationally avoiding trouble is better than being able to fix it, so
> I quite like the idea of defaulting the auto-converge option on, or
> perhaps making it controllable via flavours, so that operators can
> offer (and identify!) those particularly performance sensitive
> workloads rather than having to guess which instances are special and
> which aren't.
>
> Being able to cancel the migration would be good. Relatedly being able
> to restart nova-compute while a migration is going on would be good
> (or put differently, a migration happening shouldn't prevent a deploy
> of Nova code: interlocks like that make continuous deployment much
> harder).
>
> If we can't already, I'd like as a user to be able to see that the
> migration is happening (allows diagnosis of transient issues during
> the migration). Some ops folk may want to hide that of course.
>
> I'm not sure that automatically rolling back after N minutes makes
> sense : if the impact on the cluster is significant then 1 minute vs
> 10 doesn't intrinsically matter: what matters more is preventing too
> many concurrent migrations, so that would be another feature that I
> don't think we have yet: don't allow more than some N inbound and M
> outbound live migrations to a compute host at any time, to prevent IO
> storms. We may want to log with NOTIFICATION migrations that are still
> progressing but appear to be having trouble completing. And of course
> an admin API to query all migrations in progress to allow API driven
> health checks by monitoring tools - which gives the power to manage
> things to admins without us having to write a probably-too-simple
> config interface.
>
> HTH,
> Rob
>
> --
> Robert Collins <rbtcoll...@hp.com>
> Distinguished Technologist
> HP Converged Cloud
>
> _______________________________________________
> OpenStack-operators mailing list
> OpenStack-operators@lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
>