Peter Xu <pet...@redhat.com> writes:

> On Tue, Feb 20, 2024 at 07:41:26PM -0300, Fabiano Rosas wrote:
>> The fixed-ram migration can be performed live or non-live, but it is
>> always asynchronous, i.e. the source machine and the destination
>> machine are not migrating at the same time. We only need some pieces
>> of the multifd sync operations.
>> 
>> multifd_send_sync_main()
>> ------------------------
>>   Issued by the ram migration code on the migration thread, causes the
>>   multifd send channels to synchronize with the migration thread and
>>   makes the sending side emit a packet with the MULTIFD_FLUSH flag.
>> 
>>   With fixed-ram we want to maintain the sync on the sending side
>>   because that provides ordering between the rounds of dirty pages when
>>   migrating live.
>> 
>> MULTIFD_FLUSH
>> -------------
>>   On the receiving side, the presence of the MULTIFD_FLUSH flag on a
>>   packet causes the receiving channels to start synchronizing with the
>>   main thread.
>> 
>>   We're not using packets with fixed-ram, so there's no MULTIFD_FLUSH
>>   flag and therefore no channel sync on the receiving side.
>> 
>> multifd_recv_sync_main()
>> ------------------------
>>   Issued by the migration thread when the ram migration flag
>>   RAM_SAVE_FLAG_MULTIFD_FLUSH is received, causes the migration thread
>>   on the receiving side to start synchronizing with the recv
>>   channels. Due to compatibility, this is also issued when
>>   RAM_SAVE_FLAG_EOS is received.
>> 
>>   For fixed-ram we only need to synchronize the channels at the end of
>>   migration to avoid doing cleanup before the channels have finished
>>   their IO.
>> 
>> Make sure the multifd syncs are only issued at the appropriate
>> times. Note that due to pre-existing backward compatibility issues, we
>> have the multifd_flush_after_each_section property that enables an
>> older behavior of synchronizing channels more frequently (and
>> inefficiently). Fixed-ram should always run with that property
>> disabled (default).
>
> What if the user enables multifd_flush_after_each_section=true?
>
> IMHO we don't necessarily need to attach the fixed-ram loading flush to any
> flag in the stream.  For fixed-ram IIUC all the loads will happen in one
> shot of ram_load() anyway when parsing the ramblock list, so.. how about we
> decouple the fixed-ram load flush from the stream by always doing a sync in
> ram_load() unconditionally?

I would like to, but it's not possible because ram_load() is called once
per section, i.e. once for each EOS flag on the stream. We'll have at
least two calls to ram_load(): one due to qemu_savevm_state_iterate()
and another due to qemu_savevm_state_complete_precopy().

The fact that fixed-ram can use just one load doesn't change the fact
that we perform more than one "save". So, unfortunately, we'll need to
use the FLUSH flag in this case.
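
To make the point above concrete, here is a toy model. It is not QEMU
code: every toy_* name is hypothetical, and the stream layout used in
the note below (one iterate section, then a completion section carrying
a single FLUSH before its EOS) is an assumption for illustration only.
It counts how many channel syncs happen when the sync is done at the
end of every ram_load() call versus only when a FLUSH flag is seen.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the stream flags; not the QEMU values. */
enum toy_flag { TOY_FLAG_PAGES, TOY_FLAG_FLUSH, TOY_FLAG_EOS };

static int toy_sync_count;

/* Stand-in for multifd_recv_sync_main(): only counts invocations. */
static void toy_recv_sync_main(void)
{
    toy_sync_count++;
}

/*
 * Models one ram_load() call: consume flags until EOS ends the section.
 * Returns the number of flags consumed from the stream.
 */
static size_t toy_ram_load(const enum toy_flag *stream, size_t len,
                           bool sync_on_flush, bool sync_on_return)
{
    size_t i = 0;

    assert(stream != NULL);

    while (i < len) {
        enum toy_flag flag = stream[i++];

        if (flag == TOY_FLAG_FLUSH && sync_on_flush) {
            toy_recv_sync_main();
        }
        if (flag == TOY_FLAG_EOS) {
            break; /* end of this section */
        }
    }

    /* The "always sync in ram_load()" variant syncs once per call. */
    if (sync_on_return) {
        toy_recv_sync_main();
    }
    return i;
}

/* Models the loader: one toy_ram_load() call per section on the stream. */
static int toy_load_stream(const enum toy_flag *stream, size_t len,
                           bool sync_on_flush, bool sync_on_return)
{
    size_t pos = 0;

    toy_sync_count = 0;
    while (pos < len) {
        pos += toy_ram_load(stream + pos, len - pos,
                            sync_on_flush, sync_on_return);
    }
    return toy_sync_count;
}
```

With the assumed stream { PAGES, EOS, PAGES, FLUSH, EOS },
toy_load_stream(stream, 5, false, true) returns 2 (one sync per
section), while toy_load_stream(stream, 5, true, false) returns 1 (a
single sync, driven by the FLUSH flag).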

>
> @@ -4368,6 +4367,15 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
>              ret = ram_load_precopy(f);
>          }
>      }
> +
> +    /*
> +     * Fixed-ram migration may queue load tasks to multifd threads; make
> +     * sure they're all done.
> +     */
> +    if (migrate_fixed_ram() && migrate_multifd()) {
> +        multifd_recv_sync_main();
> +    }
> +
>      trace_ram_load_complete(ret, seq_iter);
>  
>      return ret;
>
> Then ram_load() always guarantees synchronous loading of pages, and
> fixed-ram will completely ignore multifd flushes (then we also skip it for
> the ram_save_complete() like what this patch does for the rest).
>
>> 
>> Signed-off-by: Fabiano Rosas <faro...@suse.de>
>> ---
>>  migration/ram.c | 19 ++++++++++++++++---
>>  1 file changed, 16 insertions(+), 3 deletions(-)
>> 
>> diff --git a/migration/ram.c b/migration/ram.c
>> index 5932e1b8e1..c7050f6f68 100644
>> --- a/migration/ram.c
>> +++ b/migration/ram.c
>> @@ -1369,8 +1369,11 @@ static int find_dirty_block(RAMState *rs, PageSearchStatus *pss)
>>                  if (ret < 0) {
>>                      return ret;
>>                  }
>> -                qemu_put_be64(f, RAM_SAVE_FLAG_MULTIFD_FLUSH);
>> -                qemu_fflush(f);
>> +
>> +                if (!migrate_fixed_ram()) {
>> +                    qemu_put_be64(f, RAM_SAVE_FLAG_MULTIFD_FLUSH);
>> +                    qemu_fflush(f);
>> +                }
>>              }
>>              /*
>>               * If memory migration starts over, we will meet a dirtied page
>> @@ -3112,7 +3115,8 @@ static int ram_save_setup(QEMUFile *f, void *opaque)
>>          return ret;
>>      }
>>  
>> -    if (migrate_multifd() && !migrate_multifd_flush_after_each_section()) {
>> +    if (migrate_multifd() && !migrate_multifd_flush_after_each_section()
>> +        && !migrate_fixed_ram()) {
>>          qemu_put_be64(f, RAM_SAVE_FLAG_MULTIFD_FLUSH);
>>      }
>>  
>> @@ -4253,6 +4257,15 @@ static int ram_load_precopy(QEMUFile *f)
>>              break;
>>          case RAM_SAVE_FLAG_EOS:
>>              /* normal exit */
>> +            if (migrate_fixed_ram()) {
>> +                /*
>> +                 * The EOS flag appears multiple times on the
>> +                 * stream. Fixed-ram needs only one sync at the
>> +                 * end. It will be done on the flush flag above.
>> +                 */
>> +                break;
>> +            }
>> +
>>              if (migrate_multifd() &&
>>                  migrate_multifd_flush_after_each_section()) {
>>                  multifd_recv_sync_main();
>> -- 
>> 2.35.3
>> 
