Juan Quintela <quint...@redhat.com> writes: > Fabiano Rosas <faro...@suse.de> wrote: >> This doubly linked list is common for all the multifd and migration >> threads so we need to avoid concurrent access. >> >> Add a mutex to protect the data from concurrent access. This fixes a >> crash when removing two MigrationThread objects from the list at the >> same time during cleanup of multifd threads. >> >> To avoid destroying the mutex before the last element has been >> removed, move calls to qmp_migration_thread_remove so they run before >> multifd_save_cleanup joins the threads. >> >> Fixes: 671326201d ("migration: Introduce interface query-migrationthreads") >> Signed-off-by: Fabiano Rosas <faro...@suse.de> > > I agree with Peter here. Why don't you have to protect the walking? >
Oversight on my part. >> --- >> migration/migration.c | 5 ++++- >> migration/multifd.c | 3 ++- >> migration/threadinfo.c | 19 ++++++++++++++++++- >> migration/threadinfo.h | 5 +++-- >> 4 files changed, 27 insertions(+), 5 deletions(-) >> >> diff --git a/migration/migration.c b/migration/migration.c >> index e731fc98a1..b3b8345eb2 100644 >> --- a/migration/migration.c >> +++ b/migration/migration.c >> @@ -1146,6 +1146,7 @@ static void migrate_fd_cleanup(MigrationState *s) >> qemu_mutex_lock_iothread(); >> >> multifd_save_cleanup(); >> + qmp_migration_threads_cleanup(); > > I think I will spare this one as the mutex is static, so we are not > winning any memory back. > Ok >> } >> >> trace_migration_thread_after_loop(); >> + qmp_migration_threads_remove(thread); >> migration_iteration_finish(s); > > I can understand moving it here, but why before migration_iteration_finish? > Because migration_iteration_finish schedules migrate_fd_cleanup and it calls qmp_migration_threads_cleanup. So I wanted to be sure that the removal happens before destroying the mutex. >> object_unref(OBJECT(s)); >> rcu_unregister_thread(); >> - qmp_migration_threads_remove(thread); >> return NULL; >> } >> + qmp_migration_threads_remove(thread); >> + >> qemu_mutex_lock(&p->mutex); >> p->running = false; >> qemu_mutex_unlock(&p->mutex); >> >> rcu_unregister_thread(); >> - qmp_migration_threads_remove(thread); >> trace_multifd_send_thread_end(p->id, p->num_packets, >> p->total_normal_pages); >> >> return NULL; > > Here it looks like the right place. > Yep, we shouldn't really put any new code after that p->running = false. Because multifd_save_cleanup will happily start cleaning up everything while this thread is still running if it sees p->running == false. > >> +#include "qemu/osdep.h" >> +#include "qemu/queue.h" >> +#include "qemu/lockable.h" >> #include "threadinfo.h" > > Ouch, it missed Markus cleanup. Thanks. > > For the rest it looks good. > > Later, Juan.