On Fri, Jan 19, 2024 at 08:39:18PM -0300, Fabiano Rosas wrote: > We're currently allowing the process_incoming_migration_bh bottom-half > to run without holding a reference to the 'current_migration' object, > which leads to a segmentation fault if the BH is still live after > migration_shutdown() has dropped the last reference to > current_migration. > > In my system the bug manifests as migrate_multifd() returning true > when it shouldn't and multifd_load_shutdown() calling > multifd_recv_terminate_threads() which crashes due to an uninitialized > multifd_recv_state. > > Fix the issue by holding a reference to the object when scheduling the > BH and dropping it before returning from the BH. The same is already > done for the cleanup_bh at migrate_fd_cleanup_schedule(). > > Signed-off-by: Fabiano Rosas <[email protected]> > --- > migration/migration.c | 2 ++ > 1 file changed, 2 insertions(+) > > diff --git a/migration/migration.c b/migration/migration.c > index 219447dea1..cf17b68e57 100644 > --- a/migration/migration.c > +++ b/migration/migration.c > @@ -648,6 +648,7 @@ static void process_incoming_migration_bh(void *opaque) > MIGRATION_STATUS_COMPLETED); > qemu_bh_delete(mis->bh); > migration_incoming_state_destroy(); > + object_unref(OBJECT(migrate_get_current())); > } > > static void coroutine_fn > @@ -713,6 +714,7 @@ process_incoming_migration_co(void *opaque) > } > > mis->bh = qemu_bh_new(process_incoming_migration_bh, mis); > + object_ref(OBJECT(migrate_get_current())); > qemu_bh_schedule(mis->bh); > return; > fail: > -- > 2.35.3 >
I know I missed something, but I'd better ask: use-after-free needs to happen only after migration_shutdown() / qemu_cleanup(), am I right? If so, shouldn't qemu_main_loop() already returned? Then how could any BH keep running (including migration's) without qemu_main_loop()? -- Peter Xu
