Peter Xu <pet...@redhat.com> writes: > On Tue, Oct 08, 2024 at 03:28:30PM -0300, Fabiano Rosas wrote: >> >>> + /* Close cpr socket to tell source that we are listening */ >> >>> + cpr_state_close(); >> >> >> >> Would it be possible to use some explicit reply message to mark this? >> > >> > In theory yes, but I fear that using a return channel with message parsing >> > and >> > dispatch adds more code than it is worth. >> >> I think this approach is fine for now, but I wonder whether we could >> reuse the current return path (RP) by starting it earlier and take >> benefit from it already having the message passing infrastructure in >> place. I'm actually looking ahead to the migration handshake thread[1], >> which could be thought to have some similarity with the early cpr >> channel. So having a generic channel in place early on to handle >> handshake, CPR, RP, etc. could be a good idea. > > The current design relies on CPR stage happens before device realize()s, so > I assume migration channel (including RP) isn't easily applicable at as > early as this stage.
Well, what is the dependency for the RP? If we can send CPR state, we can send QEMU_VM_COMMAND, no? > > However I think dest qemu can directly write back to the cpr_uri channel > instead if we want and then follow a protocol simple enough (even though > it'll be separate from the migration stream protocol). > > What worries me more (besides using HUP as of now..) is cpr_state_save() is > currently synchronous and can block the main iothread. It means if cpr > destination is not properly setup, it can hang the main thread (including > e.g. QMP monitor) at qio_channel_socket_connect_sync(). Ideally we > shouldn't block the main thread. > > If async-mode can be done, it might be even easier, e.g. if we do > cpr_state_save() in a thread, after qemu_put*() we can directly qemu_get*() > in the same context with the pairing return qemufile. > > But maybe we can do it in two steps, merging HUP first. Then when a better > protocol (plus async mode) ready, one can boost QEMU_CPR_FILE_VERSION. > I'll see how Steve wants to address it. I agree HUP is fine at the moment. > >> >> Anyway, I'm probing on this a bit so I can start drafting something. I >> got surprised that we don't even have the capability bits in the stream >> in a useful way (currently, configuration_validate_capabilities() does >> kind of nothing). >> >> 1- https://wiki.qemu.org/ToDo/LiveMigration#Migration_handshake > > Happy to know this. I was thinking whether I should work on this even > earlier, so if you're looking at that it'll be great. As of half an hour ago =) We could put a feature branch up and work together, if you have more concrete thoughts on how this would look like let me know. > > The major pain to me is the channel establishment part where we now have > all kinds of channels, so we should really fix that sooner (e.g., we hope > to enable multifd + postcopy very soon, that requires multifd and preempt > channels appear in the same time). It was reasonable the vfio/multifd > series tried to fix it.