Peter Xu <pet...@redhat.com> writes:

> On Thu, Aug 22, 2024 at 02:05:30PM -0300, Fabiano Rosas wrote:
>> Peter Xu <pet...@redhat.com> writes:
>> 
>> > On Thu, Aug 22, 2024 at 12:03:47PM -0400, Peter Xu wrote:
>> >> On Thu, Aug 01, 2024 at 09:35:14AM -0300, Fabiano Rosas wrote:
>> >> > Separate the multifd sync from flushing the client data to the
>> >> > channels. These two operations are closely related but not strictly
>> >> > necessary to be executed together.
>> >> > 
>> >> > The multifd sync is intrinsic to how multifd works. The multiple
>> >> > channels operate independently and may finish IO out of order in
>> >> > relation to each other. This applies also between the source and
>> >> > destination QEMU.
>> >> > 
>> >> > Flushing the data that is left in the client-owned data structures
>> >> > (e.g. MultiFDPages_t) prior to sync is usually the right thing to do,
>> >> > but that is particular to how the ram migration is implemented with
>> >> > several passes over dirty data.
>> >> > 
>> >> > Make these two routines separate, allowing future code to call the
>> >> > sync by itself if needed. This also allows the usage of
>> >> > multifd_ram_send to be isolated to ram code.
>> >> 
>> >> What's the usage of sync but not flush here?
>> >
>> > Oh I think I see your point.. I think flush+sync is always needed, it's
>> > just that RAM may not always be the one to flush, but something else.
>> > Makes sense then.
>> >
>> 
>> I'm thinking of "flush" here as a last multifd_send() before sync. We
>> need multiple multifd_send() along the migration to send the data, but
>> we might not need this extra flush. It could be that there's nothing to
>> flush and the code guarantees it:
>> 
>>  <populate MultiFDSendData>
>>  multifd_send()
>>  sync
>> 
>> Where RAM currently does:
>> 
>>  multifd_queue_page()
>>  multifd_queue_page()
>>  multifd_queue_page()
>>  ...
>>  multifd_queue_page()
>>  multifd_send()
>>  sync
>> 
>> Today there is a multifd_send() inside multifd_queue_page() and the
>> amount sent depends on the ram.c code. At the time sync gets called,
>> there could be data queued but not yet sent. Another client (not ram)
>> could just produce data in a deterministic manner and match that with
>> calls to multifd_send().
>
> I hope I read it alright.. I suppose you meant we have chance to do:
>
>   ram_send()
>   vfio_send()
>   flush()
>
> Instead of:
>
>   ram_send()
>   flush()
>   vfio_send()
>   flush()
>
> Am I right?

Not really. I'm saying that RAM doesn't always send the data, that's why
it needs a final flush before sync:

multifd_queue_page()
    if (multifd_queue_empty(pages)) {
        multifd_enqueue(pages, offset);
    }
    
    if (multifd_queue_full(pages)) {
        multifd_send_pages()   <-- this might not happen
    }
    multifd_enqueue()

multifd_send_sync_main()
    if (pages->num) { <-- data left unsent
       multifd_send() <-- flush
    }
    <sync routine>

>
>> 
>> > If you want, you may touch up the commit message to clarify that.  E.g. I
>> > still don't see any use case that we want to sync without a flush, that
>> > part might be a bit ambiguous.
>> >
>> > If my understanding is correct, take this:
>> >
>> > Reviewed-by: Peter Xu <pet...@redhat.com>
>> 

Reply via email to