"manish.mishra" <manish.mis...@nutanix.com> wrote: > On 26/04/23 3:58 pm, Juan Quintela wrote: >> "manish.mishra" <manish.mis...@nutanix.com> wrote: >>> multifd_send_sync_main() posts request on the multifd channel >>> but does not call sem_wait() on channels_ready semaphore, making >>> the channels_ready semaphore count keep increasing. >>> As a result, sem_wait() on channels_ready in multifd_send_pages() >>> is always non-blocking hence multifd_send_pages() keeps searching >>> for a free channel in a busy loop until a channel is freed. >>> >>> Signed-off-by: manish.mishra <manish.mis...@nutanix.com> >>> --- >>> migration/multifd.c | 3 ++- >>> 1 file changed, 2 insertions(+), 1 deletion(-) >>> >>> diff --git a/migration/multifd.c b/migration/multifd.c >>> index cce3ad6988..43d26e7012 100644 >>> --- a/migration/multifd.c >>> +++ b/migration/multifd.c >>> @@ -615,6 +615,7 @@ int multifd_send_sync_main(QEMUFile *f) >>> trace_multifd_send_sync_main_signal(p->id); >>> + qemu_sem_wait(&multifd_send_state->channels_ready); >>> qemu_mutex_lock(&p->mutex); >>> if (p->quit) { >> We need this, but I think it is better to put it on the second loop. >> >>> @@ -919,7 +920,7 @@ int multifd_save_setup(Error **errp) >>> multifd_send_state = g_malloc0(sizeof(*multifd_send_state)); >>> multifd_send_state->params = g_new0(MultiFDSendParams, thread_count); >>> multifd_send_state->pages = multifd_pages_init(page_count); >>> - qemu_sem_init(&multifd_send_state->channels_ready, 0); >>> + qemu_sem_init(&multifd_send_state->channels_ready, thread_count); >>> qatomic_set(&multifd_send_state->exiting, 0); >>> multifd_send_state->ops = multifd_ops[migrate_multifd_compression()]; >> I think this bit is wrong. >> We should not set the channels ready until the thread is ready and >> channel is created. 
>>
>> What do you think about this patch:
>>
>> From bcb0ef9b97b858458c403d2e4dc9e0dbd96721b3 Mon Sep 17 00:00:00 2001
>> From: Juan Quintela <quint...@redhat.com>
>> Date: Wed, 26 Apr 2023 12:20:36 +0200
>> Subject: [PATCH] multifd: Fix the number of channels ready
>>
>> We don't wait in the sem when we are doing a sync_main. Make it wait
>> there. To make things clearer, we mark the channel ready at the
>> begining of the thread loop.
>>
>> This causes a busy loop in multifd_send_page().
>> Found-by: manish.mishra <manish.mis...@nutanix.com>
>>
>> Signed-off-by: Juan Quintela <quint...@redhat.com>
>> ---
>> migration/multifd.c | 3 ++-
>> 1 file changed, 2 insertions(+), 1 deletion(-)
>>
>> diff --git a/migration/multifd.c b/migration/multifd.c
>> index 903df2117b..e625e8725e 100644
>> --- a/migration/multifd.c
>> +++ b/migration/multifd.c
>> @@ -635,6 +635,7 @@ int multifd_send_sync_main(QEMUFile *f)
>> for (i = 0; i < migrate_multifd_channels(); i++) {
>> MultiFDSendParams *p = &multifd_send_state->params[i];
>> + qemu_sem_wait(&multifd_send_state->channels_ready);
>> trace_multifd_send_sync_main_wait(p->id);
>> qemu_sem_wait(&p->sem_sync);
>> @@ -668,6 +669,7 @@ static void *multifd_send_thread(void *opaque)
>> p->num_packets = 1;
>> while (true) {
>> + qemu_sem_post(&multifd_send_state->channels_ready);
>
>
> This has one issue though, if we mark channel_ready here itself, channel is
> actually not ready so we can still busy loop?

Before:

while (true) {
    ....
    sem_post(channels_ready)
}

And you want to add to the initialization a counter equal to the number
of channels.

Now:

while (true) {
    sem_post(channels_ready)
    ....
}

It is semantically the same, but marking it ready at that point means
that when we set it to 1, we know that the channel and thread are ready
for action.

> May be we can do one thing let the sem_post in while loop at same
> position itself. But we can do another post just before start

I can't see how this can make any difference.

> of this while loop, as that will be called only once it should do work
> of initialising count equal to multiFD channels?

Yeap. But I can't see what difference we have here.

Later, Juan.
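
The counting argument in this thread can be checked with a small model. What follows is only a sketch under stated assumptions: the semaphore is reduced to a plain integer, everything runs in a single thread, and the function names (ready_post_at_end, ready_post_at_start, surplus_after_syncs) are hypothetical and do not exist in QEMU. It compares the two loop layouts once every channel is idle again, and shows how the count drifts when the sync path never waits on channels_ready.

```c
#include <assert.h>

/*
 * Toy, single-threaded model of the channels_ready accounting.
 * The semaphore is a plain counter; none of this is QEMU code.
 */

/* Post at the END of the loop, semaphore initialised to nchan.
 * Returns the number of posts seen once each channel finished `jobs` jobs. */
static int ready_post_at_end(int nchan, int jobs)
{
    assert(nchan > 0 && jobs >= 0);
    int ready = nchan;                /* models sem_init(..., thread_count) */
    for (int c = 0; c < nchan; c++) {
        for (int j = 0; j < jobs; j++) {
            /* ... wait for work, send it ... */
            ready++;                  /* models sem_post(channels_ready) */
        }
    }
    return ready;
}

/* Post at the START of the loop, semaphore initialised to 0.
 * A channel that finished `jobs` jobs is blocked in iteration jobs + 1,
 * so it has already posted jobs + 1 times. */
static int ready_post_at_start(int nchan, int jobs)
{
    assert(nchan > 0 && jobs >= 0);
    int ready = 0;                    /* models sem_init(..., 0) */
    for (int c = 0; c < nchan; c++) {
        for (int j = 0; j <= jobs; j++) {
            ready++;                  /* models sem_post(channels_ready) */
            /* ... wait for work; send it if there is any ... */
        }
    }
    return ready;
}

/* Excess count on channels_ready after `syncs` sync rounds, measured
 * when all channels are idle. Each round hands one request to every
 * channel, and every channel posts once when it is done. */
static int surplus_after_syncs(int nchan, int syncs, int main_waits)
{
    assert(nchan > 0 && syncs >= 0);
    int ready = nchan;                /* all channels idle at the start */
    for (int s = 0; s < syncs; s++) {
        for (int c = 0; c < nchan; c++) {
            if (main_waits) {
                ready--;              /* models sem_wait(channels_ready) */
            }
            ready++;                  /* channel completes and posts */
        }
    }
    return ready - nchan;
}
```

With 4 channels and 3 jobs each, both layouts end at 16 posts, which is the sense in which they are semantically the same. With 4 channels and 5 sync rounds, the model leaves a surplus of 20 when the sync path does not wait and 0 when it does, matching the ever-increasing count described in the original report.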