On Fri, Aug 16, 2024 at 11:34:10AM -0400, Peter Xu wrote:
> On Fri, Aug 16, 2024 at 04:16:50PM +0100, Daniel P. Berrangé wrote:
> > On Fri, Aug 16, 2024 at 11:06:10AM -0400, Peter Xu wrote:
> > > On Thu, Aug 15, 2024 at 04:55:20PM -0400, Steven Sistare wrote:
> > > > On 8/13/2024 3:46 PM, Peter Xu wrote:
> > > > > On Tue, Aug 06, 2024 at 04:56:18PM -0400, Steven Sistare wrote:
> > > > > > > The flipside, however, is that localhost migration via 2
> > > > > > > separate QEMU processes has issues where both QEMUs want to be
> > > > > > > opening the very same file, and only 1 of them can ever have
> > > > > > > them open.
> > > > > 
> > > > > I thought we used to have similar issue with block devices, but I
> > > > > assume it's solved for years (and whoever owns it will take proper
> > > > > file lock, IIRC, and QEMU migration should properly serialize the
> > > > > time window on who's going to take the file lock).
> > > > > 
> > > > > Maybe this is about something else?
> > > > 
> > > > I don't have an example where this fails.
> > > > 
> > > > I can cause "Failed to get "write" lock" errors if two qemu instances
> > > > open the same block device, but the error is suppressed if you add
> > > > the -incoming argument, due to this code:
> > > > 
> > > >   blk_attach_dev()
> > > >     if (runstate_check(RUN_STATE_INMIGRATE))
> > > >       blk->disable_perm = true;
> > > 
> > > Yep, this one is pretty much expected.
> > > 
> > > > 
> > > > > > Indeed, and "files" includes unix domain sockets.
> > > > 
> > > > More on this -- the second qemu to bind a unix domain socket for
> > > > listening wins, and the first qemu loses it (because second qemu
> > > > unlinks and recreates the socket path before binding on the
> > > > assumption that it is stale).
> > > > 
> > > > One must use a different name for the socket for second qemu, and
> > > > clients that wish to connect must be aware of the new port.
> > > > 
> > > > > > Network ports also conflict.
> > > > > > cpr-exec avoids such problems, and is one of the advantages of
> > > > > > the method that I forgot to promote.
> > > > > 
> > > > > I was thinking that's fine, as the host ports should be the backend
> > > > > of the VM ports only anyway so they don't need to be identical on
> > > > > both sides?
> > > > > 
> > > > > IOW, my understanding is it's the guest IP/ports/... which should
> > > > > still be stable across migrations, where the host ports can be
> > > > > different as long as the host ports can forward guest port messages
> > > > > correctly?
> > > > 
> > > > Yes, one must use a different host port number for the second qemu,
> > > > and clients that wish to connect must be aware of the new port.
> > > > 
> > > > That is my point -- cpr-transfer requires fiddling with such things.
> > > > cpr-exec does not.
> > > 
> > > Right, and my understanding is all these facilities are already there,
> > > so no new code should be needed on reconnect issues if to support
> > > cpr-transfer in Libvirt or similar management layers that supports
> > > migrations.
> > 
> > Note Libvirt explicitly blocks localhost migration today because
> > solving all these clashing resource problems is a huge can of worms
> > and it can't be made invisible to the user of libvirt in any practical
> > way.
> 
> Ahhh, OK.  I'm pretty surprised by this, as I thought at least kubevirt
> supported local migration somehow on top of libvirt.

Since kubevirt runs inside a container, "localhost" migration
is effectively migrating between 2 completely separate OS installs
(containers), that happen to be on the same physical host. IOW, it
is a cross-host migration from Libvirt & QEMU's POV, and there are
no clashing resources to worry about.
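
To make the same-host clash concrete, here is a minimal Python sketch of the
unix domain socket case described earlier in the thread. It is not QEMU code.
It assumes only what was stated above: a listener unlinks a path it believes
is stale, then binds. The names listen_unlink_first(), demo() and
"monitor.sock" are made up for illustration.

```python
import os
import socket
import tempfile


def listen_unlink_first(path):
    """Bind a listening unix socket, removing any existing path first.

    Hypothetical helper: it mimics the "assume the path is stale, unlink,
    then bind" behaviour described in the thread.
    """
    try:
        os.unlink(path)
    except FileNotFoundError:
        pass
    srv = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    srv.bind(path)
    srv.listen(1)
    return srv


def demo():
    """Return (second_got_client, first_got_client) for one new client."""
    tmpdir = tempfile.mkdtemp()
    path = os.path.join(tmpdir, "monitor.sock")

    first = listen_unlink_first(path)    # stands in for the first process
    second = listen_unlink_first(path)   # second process takes over the path

    # A client that only knows the path connects once.
    client = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    client.connect(path)

    # The second listener owns the path now, so it sees the connection.
    second.settimeout(1.0)
    try:
        conn, _ = second.accept()
        conn.close()
        second_got_client = True
    except socket.timeout:
        second_got_client = False

    # The first listener is still open, but nothing can reach it by path.
    first.settimeout(0.2)
    try:
        conn, _ = first.accept()
        conn.close()
        first_got_client = True
    except socket.timeout:
        first_got_client = False

    for s in (client, first, second):
        s.close()
    os.unlink(path)
    os.rmdir(tmpdir)
    return second_got_client, first_got_client


if __name__ == "__main__":
    print(demo())
```

The first listener keeps a valid file descriptor but becomes unreachable by
path, which is why two processes sharing one filesystem need distinct socket
names, while two containers with separate filesystems do not.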

> Does it mean that cpr-transfer is a no-go in this case at least for Libvirt
> to consume it (as cpr-* is only for local host migrations so far)?  Even if
> all the rest issues we're discussing with cpr-exec, is that the only way to
> go for Libvirt, then?

cpr-exec is certainly appealing from the POV of avoiding the clashing
resources problem in libvirt.

It has its own issues though, because libvirt runs all QEMU processes with
seccomp filters that block 'execve', as we consider QEMU to be untrustworthy
and thus don't want to allow it to exec anything!

I don't know which is the lesser evil from libvirt's POV.

Personally I see security controls as an overriding requirement for
everything.

With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|

