Hello Peter,

On Fri, 28 Feb 2025 at 19:13, Peter Xu <pet...@redhat.com> wrote:
> We should be able to do multifd's flush and sync before VM
> stopped in postcopy_start()..
>
> What I actually think the easiest is to do flush and sync once in
> postcopy_start() as mentioned above.  I think that'll serialize with
> postcopy messages later on the main channel, making sure all channels are
> flushed before moving to postcopy work.
* Is there some specific benefit to calling multifd_ram_flush_and_sync()
  from or before postcopy_start()? Maybe I am missing it.

* I'll try to explain why adding a migration command makes sense:

  - I did call multifd_ram_flush_and_sync() before calling
    postcopy_start() in migration_iteration_run(). After flushing the
    multifd_send queue, all it does is send the
    RAM_SAVE_FLAG_MULTIFD_FLUSH flag on the main channel. On the
    destination side, qemu_loadvm_state_main() does not understand that
    message, because it looks for a 'section_type'; and when it is not
    able to identify the section, it leads to an error:

        "Expected vmdescription section, but got %d", section_type(=0)

  - I.e. multifd_ram_flush_and_sync() is not reusable by itself. To make
    it work, we need to add a (RAM) section header to the message,
    because then the stream goes to ram_load_precopy(), which calls
    multifd_recv_sync_main().

  - But sending a lone RAM section header from migration_iteration_run(),
    or even from postcopy_start(), does not seem right. That is out of
    place, because neither migration_iteration_run() nor postcopy_start()
    has anything to do with RAM sections.

  - I think 'flush' and 'sync' ought to be small individual
    functions/operations that are section agnostic. We should be able to
    call 'flush' and 'sync' from anywhere in the code without
    side effects. So tying 'flush' and 'sync' to RAM sections does not
    seem right, because we need to be able to call them from other parts
    too, like before calling postcopy_start(), or maybe some other code
    paths.

  - Another observation: when multifd and postcopy are enabled together
    and the guest VM is writing to its RAM, multifd_ram_flush_and_sync()
    is called only during setup, not after that:
  =====
  2025-03-02 18:13:26.425+0000: initiating migration
  2025-03-02T18:13:26.508809Z qemu-system-x86_64: multifd_ram_flush_and_sync: called
  2025-03-02 18:15:20.070+0000: shutting down, reason=migrated

  2025-03-03 11:26:41.340+0000: initiating migration
  2025-03-03T11:26:41.415625Z qemu-system-x86_64: info: multifd_ram_flush_and_sync: called
  2025-03-03 11:30:22.766+0000: shutting down, reason=migrated
  =====

  - If we untie 'flush and sync' from RAM sections and make it a general
    migration command, we shall be able to call it from anywhere else.

> If you want to stick with shutdown channels, I'm OK.  But then we at least
> need one message to say "the recv side finished joining all recv threads".
> That needs to be an ACK sent from dest to src, not src to dest.

* But earlier we discussed that 'flush and sync' is enough for that, no?
  Because multifd_ram_flush_and_sync() only notifies the destination to
  sync the multifd_recv threads; it does not receive any acknowledgement
  back.

* And the multifd_recv_sync_main() function on the destination blocks the
  'main' thread until all multifd_recv threads (mig/dst/recv_x) have
  exited; only then does it proceed to accept the incoming new postcopy
  connection.

Thank you.
---
  - Prasad