On Wed, Jan 17, 2024 at 03:13:20PM -0300, Fabiano Rosas wrote: > >> @@ -3242,8 +3243,11 @@ static int ram_save_iterate(QEMUFile *f, void > >> *opaque) > >> out: > >> if (ret >= 0 > >> && migration_is_setup_or_active(migrate_get_current()->state)) { > >> - if (migrate_multifd() && > >> migrate_multifd_flush_after_each_section()) { > >> - ret = > >> multifd_send_sync_main(rs->pss[RAM_CHANNEL_PRECOPY].pss_channel); > >> + if (migrate_multifd() && > >> + (migrate_multifd_flush_after_each_section() || > >> + migrate_fixed_ram())) { > >> + ret = multifd_send_sync_main( > >> + rs->pss[RAM_CHANNEL_PRECOPY].pss_channel); > > > > Why you want this one? ram_save_iterate() can be called tens for each > > second iiuc. > > > > AIUI, this is a requirement for live migration, so that we're not > sending the new version of the page while the old version is still in > transit. > > > There's one more? ram_save_complete(): > > > > if (migrate_multifd() && !migrate_multifd_flush_after_each_section()) { > > qemu_put_be64(f, RAM_SAVE_FLAG_MULTIFD_FLUSH); > > } > > > > IIUC that's the one you referred to at [1] above, not sure why you modified > > the code in ram_save_iterate() instead. > > > > I mentioned it in the commit message as well: > > " - between iterations, to avoid a slow channel being overrun by a fast > channel in the subsequent iteration;"
IMHO you only need to flush all threads in find_dirty_block(). That's when the "real iteration" happens (IOW, when the "real iteration" is defined to be a full walk across all guest memories, rather than one single call to ram_save_iterate()). Multifd used to do too many flushes, and that's why we had the new migrate_multifd_flush_after_each_section() and it's a bit of a mess if you see.. To be super safe, you can also flush at ram_save_complete(), but I doubt its necessity, and this same question applies to generic multifd: the multifd_send_sync_main() in the ram_save_complete() can be redundant, afaiu. However we can leave that for later to not add even more dependency to fixed-ram. If that is justified, we can remove the sync_main for both socket / file then. -- Peter Xu