postcopy: magic value for postcopy channel

Daniel P. Berrangé Thu, 07 Nov 2024 09:48:45 -0800

On Thu, Nov 07, 2024 at 05:35:06PM +0530, Prasad Pandit wrote:
> On Wed, 6 Nov 2024 at 18:41, Fabiano Rosas <faro...@suse.de> wrote:
> > What we're thinking is having an initial exchange of information between
> > src & dst as soon as migration starts and that would sync the
> > capabilities and parameters between both sides. Which would then be
> > followed by a channel establishment phase that would open each necessary
> > channel (according to caps) in order, removing the current ambiguity.
> >
> 
> * Isn't that how it works? IIUC, libvirtd(8) sends migration command
> options to the destination and based on that the destination prepares
> for the multifd and/or postcopy migration. In case of 'Postcopy' the
> source sends 'postcopy advise' to the destination to indicate that
> postcopy might follow at the end of precopy. Also, in the discussion
> above Peter mentioned that libvirtd(8) may exchange list of features
> between source and destination to facilitate QMP clients.
> 
> * What is the handshake doing differently? (just trying to understand)


Libvirt does what it does because it has had no other choice,
not because it was good or desirable.

This kind of handshake really does not belong in libvirt. A number
of exposed migration protocol feature knobs should be considered
private to QEMU only.

Doing the handshake in libvirt has the very negative consequence that
every time QEMU wants to provide a new feature in migration, it needs
to be plumbed up through libvirt, and often applications above, and
those 3rd party projects need to be told when & where to use the new
features. The 3rd party developers have their own project dev
priorities so may not get around to enabling the new migration
features for years, if ever, undermining the work of QEMU's migration
maintainers.

As examples...

If we had QEMU self-negotiation of features 10 years ago, everywhere
would already be using multifd out of the box. QEMU would have been
able to self-negotiate use of the new "multifd" protocol, and QEMU
would be well on its way to being able to delete the old single-
threaded migration code.
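
To make the idea concrete, here is a minimal sketch of such a
self-negotiation in Python. It is illustrative only: the function name,
the feature strings and the preference list are hypothetical and do not
correspond to anything in QEMU (which is written in C). Each side
advertises what it supports, and the result is the common subset in
preference order.

```python
# Hypothetical sketch: none of these names exist in QEMU.
def negotiate(src_features, dst_features, preference):
    """Return the features both sides support, most preferred first."""
    common = set(src_features) & set(dst_features)
    return [feat for feat in preference if feat in common]


# Assumed example: a new source talking to an older destination.
src = {"multifd", "postcopy", "legacy-stream"}
dst = {"multifd", "legacy-stream"}
preference = ["multifd", "postcopy", "legacy-stream"]

print(negotiate(src, dst, preference))
```

With something of this shape, a new feature only has to be added to
QEMU's own list; nothing above QEMU needs to know it exists.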

Similarly, post-copy would have been far easier for apps: QEMU would
auto-negotiate a channel for the post-copy async page fetching, and
all migrations would be running with the post-copy feature available.
All that libvirt & apps would have needed was an API to initiate the
switch to post-copy mode.

Or the hacks QEMU has put in place, where we peek at incoming data
on some channels to identify the channel type, would not exist.
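
As a rough sketch of the alternative, every channel could announce
itself with a small fixed header, so the destination never has to guess
from the payload. The wire format below (magic bytes, type codes, field
sizes) is entirely made up for illustration and is not QEMU's actual
protocol:

```python
import struct

# Hypothetical header: 4-byte magic, 1-byte channel type, 1-byte index.
HELLO = struct.Struct("!4sBB")
MAGIC = b"MIGC"  # made-up value, for illustration only
CHANNEL_TYPES = {0: "main", 1: "multifd", 2: "postcopy-preempt"}


def encode_hello(channel_type, index=0):
    """Build the header a source would send first on a new channel."""
    return HELLO.pack(MAGIC, channel_type, index)


def decode_hello(data):
    """Identify a channel from its header instead of peeking at payload."""
    magic, channel_type, index = HELLO.unpack(data[:HELLO.size])
    if magic != MAGIC:
        raise ValueError("not a migration channel header")
    if channel_type not in CHANNEL_TYPES:
        raise ValueError("unknown channel type %d" % channel_type)
    return CHANNEL_TYPES[channel_type], index


print(decode_hello(encode_hello(2)))
```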


TL;DR: once QEMU can self-negotiate features for migration itself,
the implementation burden for libvirt & applications is greatly
reduced. QEMU migration maintainers will control their own destiny:
able to deliver improvements to users much more quickly, to delete
obsolete features more quickly, and to make migration *automatically*
enable new features & pick the optimal defaults based on their own
expert knowledge, rather than waiting for 3rd parties to pay
attention years later.

Some things will still need work & decisions in libvirt & apps,
but this burden should be reduced over the long term.
Ultimately everyone will win.

With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|

Re: [PATCH 2/5] migration/postcopy: magic value for postcopy channel
