Peter Xu <pet...@redhat.com> writes:

> On Tue, Oct 08, 2024 at 03:28:30PM -0300, Fabiano Rosas wrote:
>> >>> +    /* Close cpr socket to tell source that we are listening */
>> >>> +    cpr_state_close();
>> >> 
>> >> Would it be possible to use some explicit reply message to mark this?  
>> >
>> > In theory yes, but I fear that using a return channel with message parsing 
>> > and
>> > dispatch adds more code than it is worth.
>> 
>> I think this approach is fine for now, but I wonder whether we could
>> reuse the current return path (RP) by starting it earlier and take
>> benefit from it already having the message passing infrastructure in
>> place. I'm actually looking ahead to the migration handshake thread[1],
>> which could be thought to have some similarity with the early cpr
>> channel. So having a generic channel in place early on to handle
>> handshake, CPR, RP, etc. could be a good idea.
>
> The current design relies on CPR stage happens before device realize()s, so
> I assume migration channel (including RP) isn't easily applicable at as
> early as this stage.

Well, what is the dependency for the RP? If we can send CPR state, we
can send QEMU_VM_COMMAND, no?

>
> However I think dest qemu can directly write back to the cpr_uri channel
> instead if we want and then follow a protocol simple enough (even though
> it'll be separate from the migration stream protocol).
>
> What worries me more (besides using HUP as of now..) is cpr_state_save() is
> currently synchronous and can block the main iothread.  It means if cpr
> destination is not properly setup, it can hang the main thread (including
> e.g. QMP monitor) at qio_channel_socket_connect_sync().  Ideally we
> shouldn't block the main thread.
>
> If async-mode can be done, it might be even easier, e.g. if we do
> cpr_state_save() in a thread, after qemu_put*() we can directly qemu_get*()
> in the same context with the pairing return qemufile.
>
> But maybe we can do it in two steps, merging HUP first.  Then when a better
> protocol (plus async mode) ready, one can boost QEMU_CPR_FILE_VERSION.
> I'll see how Steve wants to address it.

I agree HUP is fine at the moment.

>
>> 
>> Anyway, I'm probing on this a bit so I can start drafting something. I
>> got surprised that we don't even have the capability bits in the stream
>> in a useful way (currently, configuration_validate_capabilities() does
>> kind of nothing).
>> 
>> 1- https://wiki.qemu.org/ToDo/LiveMigration#Migration_handshake
>
> Happy to know this. I was thinking whether I should work on this even
> earlier, so if you're looking at that it'll be great.

As of half an hour ago =) We could put a feature branch up and work
together, if you have more concrete thoughts on how this would look like
let me know.

>
> The major pain to me is the channel establishment part where we now have
> all kinds of channels, so we should really fix that sooner (e.g., we hope
> to enable multifd + postcopy very soon, that requires multifd and preempt
> channels appear in the same time).  It was reasonable the vfio/multifd
> series tried to fix it.


Reply via email to