On Thu, May 08, 2025 at 10:57:19AM -0300, Fabiano Rosas wrote:
> Prasad Pandit <ppan...@redhat.com> writes:
> 
> > From: Prasad Pandit <p...@fedoraproject.org>
> >
> > During multifd migration, zero pages are are written if
> > they are migrated more than ones.
> 
> s/ones/once/
> 
> >
> > This may result in a migration hang issue when Multifd
> > and Postcopy are enabled together.
> >
> > When Postcopy is enabled, always write zero pages as and
> > when they are migrated.
> >
> > Signed-off-by: Prasad Pandit <p...@fedoraproject.org>
> 
> This patch should come before 1/3, otherwise it'll break bisect.

We could squash the two together, IMHO.

> 
> > ---
> >  migration/multifd-zero-page.c | 22 ++++++++++++++++++++--
> >  1 file changed, 20 insertions(+), 2 deletions(-)
> >
> > v10: new patch, not present in v9 or earlier versions.
> >
> > diff --git a/migration/multifd-zero-page.c b/migration/multifd-zero-page.c
> > index dbc1184921..9bfb3ef803 100644
> > --- a/migration/multifd-zero-page.c
> > +++ b/migration/multifd-zero-page.c
> > @@ -85,9 +85,27 @@ void multifd_recv_zero_page_process(MultiFDRecvParams *p)
> >  {
> >      for (int i = 0; i < p->zero_num; i++) {
> >          void *page = p->host + p->zero[i];
> > -        if (ramblock_recv_bitmap_test_byte_offset(p->block, p->zero[i])) {
> > +
> > +        /*
> > +         * During multifd migration zero page is written to the memory
> > +         * only if it is migrated more than ones.
> 
> s/ones/once/
> 
> > +         *
> > +         * It becomes a problem when both Multifd & Postcopy options are
> > +         * enabled. If the zero page which was skipped during multifd phase,
> > +         * is accessed during the Postcopy phase of the migration, a page
> > +         * fault occurs. But this page fault is not served because the
> > +         * 'receivedmap' says the zero page is already received. Thus the
> > +         * migration hangs.

A more accurate version could be: "the thread accessing the page may hang".
As discussed previously, IIUC in most cases it won't hang the migration when
the page is accessed in vcpu context, and the vcpu will move on again once
all pages are migrated (which triggers the uffd unregistrations).

> > +         *
> > +         * When Postcopy is enabled, always write the zero page as and when
> > +         * it is migrated.
> > +         *
> 
> extra blank line here^
> 
> > +         */
> 
> nit: Inconsistent use of capitalization for the feature names. I'd keep
> it all lowercase.
> 
> > +        if (migrate_postcopy_ram() ||
> > +            ramblock_recv_bitmap_test_byte_offset(p->block, p->zero[i])) {
> >              memset(page, 0, multifd_ram_page_size());
> > -        } else {
> > +        }
> > +        if (!ramblock_recv_bitmap_test_byte_offset(p->block, p->zero[i])) {
> >              ramblock_recv_bitmap_set_offset(p->block, p->zero[i]);
> >          }

Nitpick below: we could avoid checking the bitmap twice, and maybe moving it
around a bit makes it easier to read.

Meanwhile, while at it: for postcopy, if we want, we don't need to set the
whole page to zeros; we can just fault the page in, e.g. with a single store
instruction.  Summary:

void multifd_recv_zero_page_process(MultiFDRecvParams *p)
{
    bool received;

    for (int i = 0; i < p->zero_num; i++) {
        void *page = p->host + p->zero[i];

        received = ramblock_recv_bitmap_test_byte_offset(p->block, p->zero[i]);
        if (!received) {
            ramblock_recv_bitmap_set_offset(p->block, p->zero[i]);
        }

        if (received) {
            /* If it has an older version, we must clear the whole page */
            memset(page, 0, multifd_ram_page_size());
        } else if (migrate_postcopy_ram()) {
            /*
             * If postcopy is enabled, we must fault in the page because
             * XXX (please fill in..).  Here we don't necessarily need to
             * zero the whole page because we know it must be pre-filled
             * with zeros anyway.
             */
            *(uint8_t *)page = 0;
        }
    }
}

We could also use MADV_POPULATE_WRITE, but I'm not sure which one is faster,
and this might still be easier to follow anyway..
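
For reference, a rough sketch of what the madvise() variant of that branch
could look like (untested; it assumes <sys/mman.h> is included and a host
kernel >= 5.14 for MADV_POPULATE_WRITE, and keeps the single byte store as a
fallback):

        } else if (migrate_postcopy_ram()) {
            /*
             * Pre-fault the page so postcopy won't see a missing-page
             * fault for it later.  MADV_POPULATE_WRITE needs Linux >= 5.14;
             * fall back to the single byte store if it fails.
             */
            if (madvise(page, multifd_ram_page_size(), MADV_POPULATE_WRITE)) {
                *(uint8_t *)page = 0;
            }
        }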

-- 
Peter Xu

