Re: [Qemu-devel] [PATCH 00/46] Postcopy implementation

Dr. David Alan Gilbert Thu, 10 Jul 2014 18:35:24 -0700

* Andrea Arcangeli (aarca...@redhat.com) wrote:
> On Thu, Jul 10, 2014 at 02:37:43PM +0100, Dr. David Alan Gilbert wrote:
> > * Eric Blake (ebl...@redhat.com) wrote:
> > > Is there any need for an
> > > event telling libvirt that enough pre-copy has occurred to make a
> > > postcopy worthwhile?
> > 
> > I'm not sure that qemu knows much more than management does at that
> > point; any such decision you can make based on an arbitrary cut off
> > (i.e. migration is taking too long) or you could consider something
> > based on some of the other stats that migration already exposes
> > (like the dirty pages stats); if we've got any more stats that you
> > need we can always expose them.
> >
> > Agreed; although we can just do that independently of this big patch set.
> 
> It can be independent yes, but I think such event is needed (and once
> we add such event I hope we can get rid of the polling libvirt is
> doing for pure precopy too).
> 
> I think for very large guests what should happen is a single _lazy_
> pass of precopy and then immediately postcopy.
> 
> That's why I think an event that notifies libvirt when it should issue
> the postcopy command is good, to be able to implement the single
> _lazy_ pass and nothing more than that.
> 
> qemu should stop precopy and the source guest just before sending the
> event, so then libvirt can assign all storage to the destination just
> before issuing the postcopy command. By the time the event has been
> raised by qemu, the guest in the source qemu must never run
> anymore. So it is actually the same event needed in pure precopy too
> (except when using precopy+postcopy the "precopy complete" event will
> fire much sooner). We'll still need a parameter to precopy to tell
> qemu when precopy should stop.


That's an interesting, different type of event; I think we probably
already have that first-pass information, but it's not part of the
'state' (i.e. the started/completed/cancelled enum).

> The single precopy lazy pass would consist of clearing the dirty
> bitmap, starting precopy, then if any page is found dirty by the time
> precopy tries to send it, we skip it. We only send those pages in
> precopy that haven't been modified yet by the time we reach them in
> precopy.
> 
> Pages heavily modified will be sent purely through
> postcopy. Ultimately postcopy will be a page sorting feature to
> massively decrease the downtime latency, and to reduce to 2*ramsize
> the maximum amount of data transferred on the network without having
> to slow down the guest artificially. We'll also know exactly the
> maximum time in advance that it takes to migrate a large host no
> matter the load in it (2*ramsize divided by the network bandwidth
> available at the migration time). It'll be totally deterministic, no
> black magic slowdowns anymore.

There is a trade-off; killing the precopy does reduce the network bandwidth
used, but the other side of it is that you would incur more postcopy round
trips, so your average latency will probably increase.
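
To make the trade-off concrete, here is a rough sketch of the single lazy
pass described above, not code from the patch set. It assumes the pass reaches
page N at step N, and that first_write_step (a hypothetical input) maps a page
to the step at which the guest first dirties it. The function names are
illustrative only. Each page is transferred at most twice, which is where the
2*ramsize bound comes from.

```python
def lazy_precopy_pass(num_pages, first_write_step):
    """Return (pages sent by precopy, pages left for postcopy).

    first_write_step: dict mapping page index -> step of the guest's first
    write to that page during the pass (hypothetical model input).
    """
    precopy, postcopy = [], []
    for step in range(num_pages):  # the pass reaches page `step` at time `step`
        written_at = first_write_step.get(step)
        if written_at is not None and written_at <= step:
            # Already dirty when the pass gets here: skip it, postcopy sends it.
            postcopy.append(step)
        else:
            precopy.append(step)
            if written_at is not None:
                # Dirtied after precopy sent it: postcopy must send it again.
                postcopy.append(step)
    return precopy, postcopy


def worst_case_seconds(ram_bytes, bandwidth_bytes_per_sec):
    """Upper bound on transfer time if every page is sent twice."""
    return 2 * ram_bytes / bandwidth_bytes_per_sec


precopy, postcopy = lazy_precopy_pass(4, {1: 0, 2: 3})
total_transfers = len(precopy) + len(postcopy)
```

In this toy run page 1 is skipped by precopy and page 2 is sent twice; the
more pages end up in the postcopy list, the more of them may have to be
fetched on demand, which is the round-trip cost mentioned above.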

Dave
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
