On Mon, Mar 03, 2025 at 05:13:56PM +0530, Prasad Pandit wrote:
> Hello Peter,
> 
> On Fri, 28 Feb 2025 at 19:13, Peter Xu <pet...@redhat.com> wrote:
> > We should be able to do multifd's flush and sync before the VM is
> > stopped in postcopy_start()..
> >
> > What I actually think is easiest is to do the flush and sync once in
> > postcopy_start(), as mentioned above.  I think that will serialize with
> > the postcopy messages later on the main channel, making sure all channels
> > are flushed before moving on to the postcopy work.
> 
> * Is there some specific benefit to calling
> 'multifd_ram_flush_and_sync()' from, or before, postcopy_start()? Maybe
> I am missing it.
> 
> * I'll try to explain why adding a migration command makes sense:
> 
>    - I did call 'multifd_ram_flush_and_sync()' before calling
> postcopy_start() in migration_iteration_run(). After flushing the
> 'multifd_send' queue, all it does is send the
> 'RAM_SAVE_FLAG_MULTIFD_FLUSH' flag on the main channel. On the
> destination side the 'qemu_loadvm_state_main()' function does not
> understand that message, because it looks for a 'section_type'; and
> when it cannot identify the section, it fails with an error:
> 
>           "Expected vmdescription section, but got %d" (section_type = 0)
> 
>    - i.e. multifd_ram_flush_and_sync() is not reusable by itself. To
> make it work, we need to add a (RAM) section header to the message,
> because then it will be routed to ram_load_precopy(), which calls
> multifd_recv_sync_main().
> 
>    - But sending a lone RAM section header from
> migration_iteration_run() or even from postcopy_start() does not seem
> right. It is out of place, because neither migration_iteration_run()
> nor postcopy_start() has anything to do with RAM sections.
> 
>    - I think 'flush' and 'sync' ought to be small individual
> functions/operations that are section agnostic. We should be able to
> call 'flush' and 'sync' from anywhere in the code without
> side-effects. So tying 'flush' and 'sync' to RAM sections does not
> seem right, because we need to be able to call 'flush' and 'sync' from
> other parts too, like before calling postcopy_start() or maybe from
> other code paths.

We need the header.  Maybe the easiest option for now is to add one more
hook, e.g. qemu_savevm_state_complete_precopy_prepare(), and only implement
it for RAM for now.
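
To illustrate the idea (a rough sketch only -- the hook name comes from the
suggestion above, and the callback field and wiring around it are
hypothetical, not actual QEMU code): the generic savevm layer would emit
the section header and footer, so the RAM handler only does the multifd
flush-and-sync inside a section the destination already knows how to
parse, and the message ends up in ram_load_precopy() as usual.

    /*
     * Hypothetical sketch.  'save_complete_precopy_prepare' is an assumed
     * new SaveVMHandlers callback that only RAM would implement; the
     * header/footer helpers stand in for whatever savevm.c uses to frame
     * a RAM section on the wire.
     */
    static int qemu_savevm_state_complete_precopy_prepare(QEMUFile *f)
    {
        SaveStateEntry *se;

        QTAILQ_FOREACH(se, &savevm_state.handlers, entry) {
            if (!se->ops || !se->ops->save_complete_precopy_prepare) {
                continue;   /* only RAM implements this hook for now */
            }
            save_section_header(f, se, QEMU_VM_SECTION_PART);
            se->ops->save_complete_precopy_prepare(f, se->opaque);
            save_section_footer(f, se);
        }
        return 0;
    }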

> 
>    - Another observation is: when multifd and postcopy are enabled
> together and the guest VM is writing to its RAM,
> multifd_ram_flush_and_sync() is called only during setup, not after
> that.
> =====
> 2025-03-02 18:13:26.425+0000: initiating migration
> 2025-03-02T18:13:26.508809Z qemu-system-x86_64:
> multifd_ram_flush_and_sync: called
> 2025-03-02 18:15:20.070+0000: shutting down, reason=migrated
> 
> 2025-03-03 11:26:41.340+0000: initiating migration
> 2025-03-03T11:26:41.415625Z qemu-system-x86_64: info:
> multifd_ram_flush_and_sync: called
> 2025-03-03 11:30:22.766+0000: shutting down, reason=migrated
> =====
> 
>    - If we untie 'flush and sync' from RAM sections and make it a
> general migration command, we will be able to call it from anywhere
> else.
> 
> > If you want to stick with shutdown channels, I'm OK.  But then we at least
> > need one message to say "the recv side finished joining all recv threads".
> > That needs to be an ACK sent from dest to src, not src to dest.
> 
> * But earlier we discussed 'flush and sync' is enough for that, no?

Yes, I think that's OK, but this patch didn't do that.

+            multifd_send_flush();
+            multifd_send_sync_main(MULTIFD_SYNC_LOCAL);
+            qemu_savevm_send_multifd_recv_sync(s->to_dst_file);

I don't think this sends RAM_SAVE_FLAG_MULTIFD_FLUSH.  IIUC you need the
complete multifd_ram_flush_and_sync(), and then the new message is not needed.
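
For illustration, a minimal sketch of what that could look like in
postcopy_start(), before the postcopy commands go out on the main channel.
This is an assumption about the eventual shape, not the final patch:
whether multifd_ram_flush_and_sync() takes the QEMUFile as a parameter or
emits RAM_SAVE_FLAG_MULTIFD_FLUSH internally depends on how the helper is
reworked.

    /*
     * Sketch only: flush the multifd send queues, sync all channels, and
     * put RAM_SAVE_FLAG_MULTIFD_FLUSH on the main channel so the
     * destination runs multifd_recv_sync_main() before postcopy begins.
     */
    if (migrate_multifd()) {
        ret = multifd_ram_flush_and_sync(ms->to_dst_file);
        if (ret < 0) {
            goto fail;   /* error path of postcopy_start() */
        }
    }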

> Because 'multifd_ram_flush_and_sync' only notifies the destination to
> sync multifd_recv threads. It does not receive any acknowledgement
> back.

If my understanding is correct, we don't need an ACK.  It will make sure
that, by the time postcopy starts, all pages have been flushed and consumed
on the destination.

Instead of me preparing the patch and the whole commit message, please take
your time to think it through and justify it; if you also think it works,
put the explanation into the commit message and then we can go with it.

> 
> * And the multifd_recv_sync_main() function on the destination blocks the
> 'main' thread until all multifd_recv threads (mig/dst/recv_x) have
> exited; only then does it proceed to accept the incoming new postcopy
> connection.

I don't think it makes sure the threads have exited.  AFAIU it does not join
the threads, but syncs with them on the per-thread sync messages.
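
As a simplified sketch of that pattern (illustrative only, with made-up
names -- not the actual multifd code): the main thread waits for every recv
thread to signal that it has reached its sync point, then releases each of
them again; no pthread_join() is involved, so the threads keep running
afterwards.

    /* Illustrative per-thread sync, not the real multifd implementation;
     * the type and field names below are made up for this sketch. */
    typedef struct {
        QemuSemaphore main_sem;   /* posted by the recv thread at its sync point */
        QemuSemaphore thread_sem; /* waited on by the recv thread to continue */
    } RecvChannelSync;

    static void recv_sync_main_sketch(RecvChannelSync *ch, int nr_channels)
    {
        /* 1) wait for every recv thread to reach its sync point ... */
        for (int i = 0; i < nr_channels; i++) {
            qemu_sem_wait(&ch[i].main_sem);
        }
        /* 2) ... then release them all; the threads are never joined */
        for (int i = 0; i < nr_channels; i++) {
            qemu_sem_post(&ch[i].thread_sem);
        }
    }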

Thanks,

> 
> 
> Thank you.
> ---
>   - Prasad
> 

-- 
Peter Xu

