On Fri, Mar 07, 2025 at 04:06:17PM -0300, Fabiano Rosas wrote:
> > I never tried vsock, would it be used in any use case?
> >
> 
> I don't know, I'm going by what's in the code.
> 
> > It seems to have been introduced by accident in 72a8192e225cea, but I'm not sure.
> > Maybe there's something I missed.
> 
> The code has always had some variation of:
> 
> static bool transport_supports_multi_channels(const char *uri)
> {
>     return strstart(uri, "tcp:", NULL) || strstart(uri, "unix:", NULL) ||
>            strstart(uri, "vsock:", NULL);
> }
> 
> Introduced by b7acd65707 ("migration: allow multifd for socket protocol
> only").

Looks like a copy-paste error..  I found it, should be 9ba3b2baa1
("migration: add vsock as data channel support").

https://lore.kernel.org/all/2bc0e226-ee71-330a-1bcd-bd9d09750...@huawei.com/
https://kvmforum2019.sched.com/event/Tmzh/zero-next-generation-virtualization-platform-for-huawei-cloud-jinsong-liu-zhichao-huang-huawei
https://e.huawei.com/sa/material/event/HC/e37b9c4c33e14e869bb1183fab468fed

So if I read it right.. the VM in this case runs inside a container or
something, and it talks to an "agent" on a PCIe device which understands
the virtio-vsock protocol.  So maybe vsock just performs better than the
other ways of digging that tunnel for the container.

In that case, mentioning vsock is at least ok.

[...]

> >> +After the channels have been put into a wait state by the sync
> >> +functions, the client code may continue to transmit additional data by
> >> +issuing ``multifd_send()`` once again.
> >> +
> >> +Note:
> >> +
> >> +- the RAM migration does, effectively, a global synchronization by
> >> +  chaining a call to ``multifd_send_sync_main()`` with the emission of a
> >> +  flag on the main migration channel (``RAM_SAVE_FLAG_MULTIFD_FLUSH``)
> >
> > ... or RAM_SAVE_FLAG_EOS ... depending on the machine type.
> >
> 
> Eh.. big compatibility mess. I'd rather not mention it.

It's not strictly a compatibility mess.  IIUC, it used to be designed to
always work with EOS.  I think at that time Juan was still focused on
making it work rather than on perf tuning, but then we found it can be a
major perf issue if we flush too soon.  And if we flush only once per
round, it may not always pair with an EOS.  That's why we needed a new
message.
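
To make that concrete, the decision roughly looks like the sketch below.
Only multifd_send_sync_main(), migrate_multifd(),
migrate_multifd_flush_after_each_section(), RAM_SAVE_FLAG_EOS and
RAM_SAVE_FLAG_MULTIFD_FLUSH are the real identifiers; the stream type,
helpers and flag values are stand-ins, so treat this as an illustration
rather than the actual ram.c code:

/*
 * Sketch only: models how the source decides between the legacy
 * per-section sync (paired with RAM_SAVE_FLAG_EOS, old machine types)
 * and the once-per-round sync that emits RAM_SAVE_FLAG_MULTIFD_FLUSH.
 */
#include <inttypes.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Placeholder values, not the real ram.c constants */
#define RAM_SAVE_FLAG_EOS            0x10
#define RAM_SAVE_FLAG_MULTIFD_FLUSH  0x200

typedef struct { const char *name; } StreamStub;   /* stand-in for QEMUFile */

static bool migrate_multifd(void) { return true; }
static bool migrate_multifd_flush_after_each_section(void) { return false; }
static void multifd_send_sync_main(void) { puts("sync all multifd channels"); }
static void stream_put_flag(StreamStub *f, uint64_t flag)
{
    printf("emit flag 0x%" PRIx64 " on %s\n", flag, f->name);
}

/* New machine types: flush once per dirty-bitmap round. */
static void flush_after_round(StreamStub *f)
{
    if (migrate_multifd() && !migrate_multifd_flush_after_each_section()) {
        multifd_send_sync_main();
        stream_put_flag(f, RAM_SAVE_FLAG_MULTIFD_FLUSH);
    }
}

/* Old machine types: flush at every section end, piggybacking on the
 * EOS marker that is emitted anyway. */
static void end_of_section(StreamStub *f)
{
    if (migrate_multifd() && migrate_multifd_flush_after_each_section()) {
        multifd_send_sync_main();
    }
    stream_put_flag(f, RAM_SAVE_FLAG_EOS);
}

int main(void)
{
    StreamStub f = { "main migration channel" };
    flush_after_round(&f);
    end_of_section(&f);
    return 0;
}

The point being: with the once-per-round flush there is no EOS marker to
piggyback on, hence the dedicated RAM_SAVE_FLAG_MULTIFD_FLUSH message.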

But hey, you're writing a doc that helps everyone.  You deserve to decide
whether you want to mention it here or not. :)

IIRC we updated our compat rule so that we only maintain each machine type
for 6 years.  It means the whole per-iteration + EOS stuff can be removed
in 3.5 years or so - we did that work in July 2022.  So it isn't that
important to mention either.

> 
> > Maybe we should also add a sentence on the relationship of
> > MULTIFD_FLAG_SYNC and RAM_SAVE_FLAG_MULTIFD_FLUSH (or RAM_SAVE_FLAG_EOS ),
> > in that they should always be sent together, and only if so would it
> > provide ordering of multifd messages and what happens in the main migration
> > thread.
> >
> 
> The problem is that RAM_SAVE_FLAGs are a ram.c thing. In theory the need
> for RAM_SAVE_FLAG_MULTIFD_FLUSH is just because the RAM migration is
> driven by the source machine via the flags that are put on the
> stream. IOW, this is a RAM migration design, not a multifd design. The
> multifd design is (could be, we decide) that once sync packets are sent,
> _something_ must do the following:
> 
>     for (i = 0; i < thread_count; i++) {
>         trace_multifd_recv_sync_main_wait(i);
>         qemu_sem_wait(&multifd_recv_state->sem_sync);
>     }
> 
> ... which is already part of multifd_recv_sync_main(), but that just
> _happens to be_ called by ram.c when it sees the
> RAM_SAVE_FLAG_MULTIFD_FLUSH flag on the stream; that's not a multifd
> design requirement. The ram.c code could for instance do the sync when
> some QEMU_VM_SECTION_EOS (or whatever it's called) appears.

I still think it should be done in RAM code only.  One major goal (if not
the only goal..) is to order different versions of the same pages, and
that is something only the RAM module cares about, not migration in
general.

From that POV, using a QEMU_VM_* message for this is kind of the wrong
layer - those should work for migration in general.

That said, I agree we do violate it from time to time; for example, we
have a bunch of subcmds (MIG_CMD_POSTCOPY*) just for postcopy, under
QEMU_VM_COMMAND.  But IIUC those are either kind of ancient (so we need to
stick with them now.. postcopy has been there for 10 years) or they need
some ping-pong messages, in which case QEMU_VM_COMMAND is the easiest..
IMHO we should still try to stick with the layering where possible.
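
For completeness, the recv-side pairing being discussed boils down to
something like the sketch below.  Only multifd_recv_sync_main(),
MULTIFD_FLAG_SYNC and RAM_SAVE_FLAG_MULTIFD_FLUSH are the real names;
the semaphore plumbing and the flag values are stand-ins, so again this
is an illustration, not the actual ram.c/multifd.c code:

/*
 * Sketch only: ram.c sees the flush flag on the main stream and only
 * then waits for every multifd recv channel to report that it handled
 * a packet carrying MULTIFD_FLAG_SYNC (i.e. it is drained and parked).
 */
#include <semaphore.h>
#include <stdio.h>

#define THREAD_COUNT                 4
#define RAM_SAVE_FLAG_MULTIFD_FLUSH  0x200   /* placeholder value */
#define MULTIFD_FLAG_SYNC            0x1     /* placeholder value */

static sem_t sem_sync[THREAD_COUNT];

/* A recv thread calls this when the packet it just handled had
 * MULTIFD_FLAG_SYNC set. */
static void recv_thread_handled_packet(int id, unsigned flags)
{
    if (flags & MULTIFD_FLAG_SYNC) {
        sem_post(&sem_sync[id]);
    }
}

/* Main migration thread: block until every channel hit the sync point. */
static void multifd_recv_sync_main(void)
{
    for (int i = 0; i < THREAD_COUNT; i++) {
        sem_wait(&sem_sync[i]);
    }
    printf("all %d channels synced\n", THREAD_COUNT);
}

/* The trigger lives in RAM code: it is a flag on the main stream, i.e.
 * a RAM-migration choice, not a multifd requirement. */
static void ram_load_saw_flag(unsigned long flag)
{
    if (flag & RAM_SAVE_FLAG_MULTIFD_FLUSH) {
        multifd_recv_sync_main();
    }
}

int main(void)
{
    for (int i = 0; i < THREAD_COUNT; i++) {
        sem_init(&sem_sync[i], 0, 0);
    }
    /* Pretend all channels received their sync packet... */
    for (int i = 0; i < THREAD_COUNT; i++) {
        recv_thread_handled_packet(i, MULTIFD_FLAG_SYNC);
    }
    /* ...then the main stream carries the flush flag. */
    ram_load_saw_flag(RAM_SAVE_FLAG_MULTIFD_FLUSH);
    return 0;
}

The split is the important bit: multifd only guarantees the channels are
parked once the sync packets are processed; *when* the main thread decides
to wait for that is the RAM code's call.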

-- 
Peter Xu

