Peter Xu <pet...@redhat.com> writes:

> On Tue, Feb 20, 2024 at 07:41:26PM -0300, Fabiano Rosas wrote:
>> The fixed-ram migration can be performed live or non-live, but it is
>> always asynchronous, i.e. the source machine and the destination
>> machine are not migrating at the same time. We only need some pieces
>> of the multifd sync operations.
>>
>> multifd_send_sync_main()
>> ------------------------
>> Issued by the ram migration code on the migration thread, causes the
>> multifd send channels to synchronize with the migration thread and
>> makes the sending side emit a packet with the MULTIFD_FLUSH flag.
>>
>> With fixed-ram we want to maintain the sync on the sending side
>> because that provides ordering between the rounds of dirty pages when
>> migrating live.
>>
>> MULTIFD_FLUSH
>> -------------
>> On the receiving side, the presence of the MULTIFD_FLUSH flag on a
>> packet causes the receiving channels to start synchronizing with the
>> main thread.
>>
>> We're not using packets with fixed-ram, so there's no MULTIFD_FLUSH
>> flag and therefore no channel sync on the receiving side.
>>
>> multifd_recv_sync_main()
>> ------------------------
>> Issued by the migration thread when the ram migration flag
>> RAM_SAVE_FLAG_MULTIFD_FLUSH is received, causes the migration thread
>> on the receiving side to start synchronizing with the recv
>> channels. Due to compatibility, this is also issued when
>> RAM_SAVE_FLAG_EOS is received.
>>
>> For fixed-ram we only need to synchronize the channels at the end of
>> migration to avoid doing cleanup before the channels have finished
>> their IO.
>>
>> Make sure the multifd syncs are only issued at the appropriate
>> times. Note that due to pre-existing backward compatibility issues, we
>> have the multifd_flush_after_each_section property that enables an
>> older behavior of synchronizing channels more frequently (and
>> inefficiently). Fixed-ram should always run with that property
>> disabled (default).
>
> What if the user enables multifd_flush_after_each_section=true?
>
> IMHO we don't necessarily need to attach the fixed-ram loading flush to any
> flag in the stream. For fixed-ram IIUC all the loads will happen in one
> shot of ram_load() anyway when parsing the ramblock list, so.. how about we
> decouple the fixed-ram load flush from the stream by always do a sync in
> ram_load() unconditionally?

I would like to, but it's not possible because ram_load() is called once
per section, i.e. once for each EOS flag on the stream. We'll have at
least two calls to ram_load(): one due to qemu_savevm_state_iterate() and
another due to qemu_savevm_state_complete_precopy().

The fact that fixed-ram can use just one load doesn't change the fact
that we perform more than one "save". So, unfortunately, we'll need to
use the FLUSH flag in this case.

>
> @@ -4368,6 +4367,15 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
>              ret = ram_load_precopy(f);
>          }
>      }
> +
> +    /*
> +     * Fixed-ram migration may queue load tasks to multifd threads; make
> +     * sure they're all done.
> +     */
> +    if (migrate_fixed_ram() && migrate_multifd()) {
> +        multifd_recv_sync_main();
> +    }
> +
>      trace_ram_load_complete(ret, seq_iter);
>
>      return ret;
>
> Then ram_load() always guarantees synchronous loading of pages, and
> fixed-ram will completely ignore multifd flushes (then we also skip it for
> the ram_save_complete() like what this patch does for the rest).
>
>>
>> Signed-off-by: Fabiano Rosas <faro...@suse.de>
>> ---
>>  migration/ram.c | 19 ++++++++++++++++---
>>  1 file changed, 16 insertions(+), 3 deletions(-)
>>
>> diff --git a/migration/ram.c b/migration/ram.c
>> index 5932e1b8e1..c7050f6f68 100644
>> --- a/migration/ram.c
>> +++ b/migration/ram.c
>> @@ -1369,8 +1369,11 @@ static int find_dirty_block(RAMState *rs, PageSearchStatus *pss)
>>                  if (ret < 0) {
>>                      return ret;
>>                  }
>> -                qemu_put_be64(f, RAM_SAVE_FLAG_MULTIFD_FLUSH);
>> -                qemu_fflush(f);
>> +
>> +                if (!migrate_fixed_ram()) {
>> +                    qemu_put_be64(f, RAM_SAVE_FLAG_MULTIFD_FLUSH);
>> +                    qemu_fflush(f);
>> +                }
>>              }
>>              /*
>>               * If memory migration starts over, we will meet a dirtied page
>> @@ -3112,7 +3115,8 @@ static int ram_save_setup(QEMUFile *f, void *opaque)
>>          return ret;
>>      }
>>
>> -    if (migrate_multifd() && !migrate_multifd_flush_after_each_section()) {
>> +    if (migrate_multifd() && !migrate_multifd_flush_after_each_section()
>> +        && !migrate_fixed_ram()) {
>>          qemu_put_be64(f, RAM_SAVE_FLAG_MULTIFD_FLUSH);
>>      }
>>
>> @@ -4253,6 +4257,15 @@ static int ram_load_precopy(QEMUFile *f)
>>              break;
>>          case RAM_SAVE_FLAG_EOS:
>>              /* normal exit */
>> +            if (migrate_fixed_ram()) {
>> +                /*
>> +                 * The EOS flag appears multiple times on the
>> +                 * stream. Fixed-ram needs only one sync at the
>> +                 * end. It will be done on the flush flag above.
>> +                 */
>> +                break;
>> +            }
>> +
>>              if (migrate_multifd() &&
>>                  migrate_multifd_flush_after_each_section()) {
>>                  multifd_recv_sync_main();
>> --
>> 2.35.3
>>
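
To make the point about ram_load() being called once per section a bit
more concrete, here is a toy model, not QEMU code. All toy_* names and
the example stream are made up for illustration; the only assumption
taken from the thread is that each section ends with an EOS flag and
that fixed-ram wants a single sync, driven by the one FLUSH flag at the
end. It compares how many syncs an unconditional sync per ram_load()
call would issue against syncing on the FLUSH flag only.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the stream flags; not QEMU's real values. */
enum toy_flag {
    TOY_PAGE,   /* some page data */
    TOY_FLUSH,  /* models RAM_SAVE_FLAG_MULTIFD_FLUSH */
    TOY_EOS,    /* models RAM_SAVE_FLAG_EOS, which ends one section */
};

/* Two sections: one from the iterate phase, one from completion. */
static const enum toy_flag toy_stream[] = {
    TOY_PAGE, TOY_PAGE, TOY_EOS,
    TOY_PAGE, TOY_FLUSH, TOY_EOS,
};
#define TOY_STREAM_LEN (sizeof(toy_stream) / sizeof(toy_stream[0]))

/* Count how many times `flag` appears in the stream. */
static int toy_count_flag(const enum toy_flag *stream, size_t len,
                          enum toy_flag flag)
{
    int count = 0;

    assert(stream != NULL);
    for (size_t i = 0; i < len; i++) {
        if (stream[i] == flag) {
            count++;
        }
    }
    return count;
}

/*
 * Unconditional sync at the end of every ram_load()-like call: the
 * loader is entered once per section, so this is one sync per EOS.
 */
static int toy_syncs_per_section(const enum toy_flag *stream, size_t len)
{
    return toy_count_flag(stream, len, TOY_EOS);
}

/* Sync only when the flush flag is seen; EOS is ignored. */
static int toy_syncs_on_flush(const enum toy_flag *stream, size_t len)
{
    return toy_count_flag(stream, len, TOY_FLUSH);
}
```

With the example stream, the per-section variant syncs once for each of
the two sections, while the flush-driven variant syncs only once, at
the end.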