On Thu, Nov 07, 2024 at 05:35:06PM +0530, Prasad Pandit wrote:
> On Wed, 6 Nov 2024 at 18:41, Fabiano Rosas <faro...@suse.de> wrote:
> > What we're thinking is having an initial exchange of information between
> > src & dst as soon as migration starts and that would sync the
> > capabilities and parameters between both sides. Which would then be
> > followed by a channel establishment phase that would open each necessary
> > channel (according to caps) in order, removing the current ambiguity.
> >
>
> * Isn't that how it works? IIUC, libvirtd(8) sends migration command
> options to the destination and based on that the destination prepares
> for the multifd and/or postcopy migration. In case of 'Postcopy' the
> source sends 'postcopy advise' to the destination to indicate that
> postcopy might follow at the end of precopy. Also, in the discussion
> above Peter mentioned that libvirtd(8) may exchange list of features
> between source and destination to facilitate QMP clients.
>
> * What is the handshake doing differently? (just trying to understand)

Libvirt does what it does because it has had no other choice, not
because it was good or desirable.

This kind of handshake really does not belong in libvirt. A number
of the exposed migration protocol feature knobs should be considered
private to QEMU only.

Exposing them has the very negative consequence that every time QEMU
wants to provide a new feature in migration, it needs to be plumbed up
through libvirt, and often the applications above, and those 3rd party
projects need to be told when & where to use the new features. The 3rd
party developers have their own project dev priorities, so may not get
around to enabling the new migration features for years, if ever,
undermining the work of QEMU's migration maintainers.

As examples...

If we had QEMU self-negotiation of features 10 years ago, everyone
would already be using multifd out of the box. QEMU would have been
able to self-negotiate use of the new "multifd" protocol, and QEMU
would be well on its way to being able to delete the old
single-threaded migration code.

Similarly post-copy would have been way easier for apps: QEMU would
auto-negotiate a channel for the post-copy async page fetching. All
migrations would be running with the post-copy feature available. All
that libvirt & apps would have needed was an API to initiate the switch
to post-copy mode.

Or the hacks QEMU has put in place, where we peek at incoming data on
some channels to identify the channel type, would not exist.

TL;DR: once QEMU can self-negotiate features for migration itself,
the implementation burden for libvirt & applications is greatly
reduced. QEMU migration maintainers will control their own destiny,
able to deliver improvements to users much more quickly, able to
delete obsolete features more quickly, and able to make migration
*automatically* enable new features & pick the optimal defaults based
on their own expert knowledge, not waiting for 3rd parties to pay
attention years later.
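
To make the idea concrete, here is a rough sketch of what such a
self-negotiation could look like. It is purely illustrative Python, not
QEMU code: the `Hello` message, the `negotiate` helper, the capability
strings and the channel names are all hypothetical, and the real wire
format would be whatever the migration maintainers design. The point is
only that both sides agree on the feature set and on a fixed channel
order up front, so nothing has to be plumbed through libvirt and the
destination never has to peek at a channel to guess its type.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hello:
    """Hypothetical first message each side sends when migration starts."""
    caps: frozenset            # features this QEMU build supports
    multifd_channels: int = 0  # how many multifd channels it can handle


def negotiate(src: Hello, dst: Hello):
    """Return (agreed capabilities, ordered list of channels to open).

    Assumption for this sketch: a feature is used only when both sides
    advertise it, and numeric parameters fall back to the smaller value.
    """
    caps = src.caps & dst.caps

    # Channels are opened in this exact order, so the destination knows
    # what each incoming connection is without inspecting its data.
    plan = ["main"]
    if "multifd" in caps:
        n = min(src.multifd_channels, dst.multifd_channels)
        plan += [f"multifd-{i}" for i in range(n)]
    if "postcopy" in caps:
        plan.append("postcopy")
    return sorted(caps), plan


if __name__ == "__main__":
    # A newer source talking to an older destination: the unknown
    # feature is silently dropped, no management app involvement needed.
    src = Hello(frozenset({"multifd", "postcopy", "new-feature"}), 4)
    dst = Hello(frozenset({"multifd", "postcopy"}), 2)
    print(negotiate(src, dst))
```

With something of this shape in place, adding a new capability only
means teaching QEMU a new string; older peers simply never agree to it.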

Some things will still need work & decisions in libvirt & apps, but
this burden should be reduced over the long term. Ultimately everyone
will win.

With regards,
Daniel
-- 
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|