Peter Xu <pet...@redhat.com> writes:

> On Thu, Jul 11, 2024 at 11:12:09AM -0300, Fabiano Rosas wrote:
>> What about the QEMUFile traffic? There's an iov in there. I have been
>> thinking of replacing some of qemu-file.c guts with calls to
>> multifd. Instead of several qemu_put_byte() we could construct an iov
>> and give it to multifd for transferring, call multifd_sync at the end and
>> get rid of the QEMUFile entirely. I don't have that completely laid out
>> at the moment, but I think it should be possible. I get concerned about
>> making assumptions about the types of data we're ever going to want to
>> transmit. I bet someone thought in the past that multifd would never be
>> used for anything other than ram.
>
> Hold on a bit.. there're two things I want to clarify with you.
>
> Firstly, qemu_put_byte() has buffering on f->buf[]. Directly changing them
> to iochannels may regress performance. I never checked, but I would assume
> some buffering will be needed for small chunks of data even with iochannels.
>
> Secondly, why does multifd have anything to do with this? What you're
> talking about is more like the rework of the qemufile->iochannel thing to
> me, and IIUC that doesn't yet involve multifd. For many such conversions,
> it'll still be operating on the main channel, which is not the multifd
> channels. What matters might be about what's in your mind to be put over
> multifd channels there.
>
>>
>> >
>> > I wonder why handshake needs to be done per-thread. I was naturally
>> > thinking the handshake should happen sequentially, talking over everything
>> > including multifd.
>>
>> Well, it would still be thread-based. Just that it would be 1 thread and
>> it would not be managed by multifd. I don't see the point. We could make
>> everything be multifd-based. Any piece of data that needs to reach the
>> other side of the migration could be sent through multifd, no?
>
> Hmm.... yes we can. But what do we gain from it, if we know it'll be a few
> MBs in total? There ain't a lot of huge stuff to move, it seems to me.
>
>>
>> Also, when you say "per-thread", that's the model we're trying to get
>> away from. There should be nothing "per-thread", the threads just
>> consume the data produced by the clients. Anything "per-thread" that is
>> not strictly related to the thread model should go away. For instance,
>> p->page_size, p->page_count, p->write_flags, p->flags, etc. None of
>> these should be in MultiFDSendParams. That thing should be (say)
>> MultifdChannelState and contain only the semaphores and control flags
>> for the threads.
>>
>> It would be nice if we could once and for all have a model that can
>> dispatch data transfers without having to fiddle with threading all the
>> time. Any time someone wants to do something different in the migration
>> code, there goes a random qemu_create_thread() flying around.
>
> That's exactly what I want to avoid. Not all things will need a thread,
> only performance-relevant ones.
>
> So now we have multifd threads; they're for IO throughput: if we want to
> push a fast NIC, that's the only way to go. Anything that wants to push
> that NIC should use multifd.
>
> Then it turns out we want more concurrency; it's about VFIO save()/load()
> of the kernel drivers and it can block. Same for other devices that can
> take time to save()/load() if it can happen concurrently in the future. I
> think that's the reason why I suggested the VFIO solution to provide a
> generic concept of thread pool so it serves a generic purpose and can be
> reused in the future.
>
> I hope that'll stop anyone else on migration from creating yet another
> thread randomly, and I definitely don't like that either. I would _suspect_
> the next one to come as such is TDX.. I remember at least in the very
> initial proposal years ago, TDX migration involves its own "channel" to
> migrate; migration.c may not even know where that channel is. We'll see.
>
> [...]
>
>> > One thing to mention is that with a union we will probably need to get
>> > rid of multifd_send_state->pages already.
>>
>> Hehe, please don't do this like "oh, by the way...". This is a major
>> pain point. I've been complaining about that "holding of client data"
>> since the first time I read that code. So if you're going to propose
>> something, it needs to account for that.
>
> The client puts something into a buffer (SendData), then it delivers it to
> multifd (which silently switches the buffer). Once enqueued, the client
> assumes the buffer is sent and reusable again.
>
> It looks pretty common to me, so what is the concern with this procedure?
> What's the "holding of client data" issue?
>
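Just to make sure we're talking about the same thing, the enqueue-and-swap
you describe would look roughly like the standalone sketch below
(multifd_enqueue, multifd_slot and the fixed-size payload are made up for
illustration, not the actual QEMU code):

/* Minimal sketch of the enqueue-and-swap model; illustrative names only. */
#include <stdio.h>
#include <stdint.h>

typedef struct {
    uint32_t num;
    uint64_t offset[128];      /* fixed-size payload, nothing to allocate */
} SendData;

static SendData spare;                       /* owned by the multifd core */
static SendData *multifd_slot = &spare;

/* The client hands in a filled payload and immediately gets an empty one
 * back; a channel thread would consume whatever sits in multifd_slot. */
static void multifd_enqueue(SendData **client_data)
{
    SendData *filled = *client_data;

    *client_data = multifd_slot;   /* client can reuse this one right away */
    multifd_slot = filled;         /* pretend a channel thread sends this */
}

int main(void)
{
    SendData initial = {0};
    SendData *p = &initial;

    p->num = 1;
    p->offset[0] = 0x1000;
    multifd_enqueue(&p);           /* p now points at a reusable payload */
    printf("enqueued payload, num=%u\n", multifd_slot->num);
    return 0;
}
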
v2 is ready, but unfortunately this approach doesn't work. When client A
takes the payload, it fills it with its data, which may include allocating
memory; MultiFDPages_t does that for the offset array. This means we need a
round of free/malloc at every packet sent, for every client and for every
allocation they decide to do.
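A rough standalone illustration of the problem (again made-up types, not the
actual code): because the payload slot is shared by all clients, whatever a
client gets back may have last carried a different union member, so anything
it needs allocated, like the offset array, has to be torn down and set up
again for every single packet:

#include <stdlib.h>
#include <stdint.h>

typedef enum { PAYLOAD_NONE, PAYLOAD_RAM } PayloadType;

typedef struct {
    uint32_t num;
    uint64_t *offset;              /* allocated memory owned by the payload */
} PagesPayload;

typedef struct {
    PayloadType type;
    union {
        PagesPayload ram;
        /* other clients' payload types would live here too */
    } u;
} SendData;

/* Whatever the slot held before (possibly another client's data) has to be
 * torn down before this client can use it... */
static void send_data_clear(SendData *d)
{
    if (d->type == PAYLOAD_RAM) {
        free(d->u.ram.offset);                      /* one free per packet... */
    }
    d->type = PAYLOAD_NONE;
}

/* ...and the RAM client then has to allocate its offset array again. */
static void ram_client_prepare(SendData *d, uint32_t page_count)
{
    send_data_clear(d);
    d->type = PAYLOAD_RAM;
    d->u.ram.num = 0;
    d->u.ram.offset = malloc(page_count * sizeof(uint64_t)); /* ...one malloc */
}

int main(void)
{
    SendData slot = { .type = PAYLOAD_NONE };

    /* simulate a few packets: each one sets the offset array up again */
    for (int i = 0; i < 3; i++) {
        ram_client_prepare(&slot, 128);
    }
    send_data_clear(&slot);
    return 0;
}

As far as I can tell, today the MultiFDPages_t is allocated once during
multifd setup and reused for the whole migration, and that is exactly what
this swap scheme loses.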