On Thu, Aug 07, 2025 at 10:41:17AM +0800, yong.hu...@smartx.com wrote:
> diff --git a/migration/multifd.c b/migration/multifd.c
> index b255778855..aca0aeb341 100644
> --- a/migration/multifd.c
> +++ b/migration/multifd.c
> @@ -1228,6 +1228,16 @@ void multifd_recv_sync_main(void)
>              }
>          }
>          trace_multifd_recv_sync_main_signal(p->id);
> +        do {
> +            if (qemu_sem_timedwait(&multifd_recv_state->sem_sync, 10000) == 0) {
> +                break;
> +            }
> +            if (qemu_in_coroutine()) {
> +                aio_co_schedule(qemu_get_current_aio_context(),
> +                                qemu_coroutine_self());
> +                qemu_coroutine_yield();
> +            }
> +        } while (1);

I still think either yank or fixing migrate_cancel is the way to go, but
when staring at this change.. I don't think I understand this patch at all.

It timedwait()s on the sem_sync that we just consumed.  Do you at least need
to remove the ones above this piece of code to not hang forever?

    for (i = 0; i < thread_count; i++) {
        trace_multifd_recv_sync_main_wait(i);
        qemu_sem_wait(&multifd_recv_state->sem_sync);
    }

>          qemu_sem_post(&p->sem_sync);
>      }
>      trace_multifd_recv_sync_main(multifd_recv_state->packet_num);
> -- 
> 2.27.0
> 

-- 
Peter Xu