> > This patch should come before 1/3, otherwise it'll break bisect.
> We could squash the two together, IMHO.

* It is adjusting the specific optimisation behaviour for the use case of
  when Multifd and Postcopy are enabled together. I think it's better as a
  separate patch. It'll help to see how that optimization changed/evolved
  over time.

> > s/ones/once/
> >
> > > + *
> > > + * It becomes a problem when both Multifd & Postcopy options are
> > > + * enabled. If the zero page which was skipped during multifd phase,
> > > + * is accessed during the Postcopy phase of the migration, a page
> > > + * fault occurs. But this page fault is not served because the
> > > + * 'receivedmap' says the zero page is already received. Thus the
> > > + * migration hangs.
>
> More accurate version could be: "the thread accessing the page may hang".
> As discussed previously, in most cases IIUC it won't hang migration when
> accessed in vcpu contexts, and will move again when all pages migrated
> (triggers uffd unregistrations).

* Okay.

> Meanwhile when at it.. for postcopy if we want we don't need to set all
> zeros.. just fault it in either using one inst.  Summary:
>
> void multifd_recv_zero_page_process(MultiFDRecvParams *p)
> {
>     bool received;
>
>     for (int i = 0; i < p->zero_num; i++) {
>         void *page = p->host + p->zero[i];
>
>         received = ramblock_recv_bitmap_test_byte_offset(p->block,
>                                                          p->zero[i]);
>         if (!received) {
>             ramblock_recv_bitmap_set_offset(p->block, p->zero[i]);
>         }

* Okay.

>         if (received) {
>             /* If it has an older version, we must clear the whole page */
>             memset(page, 0, multifd_ram_page_size());
>         } else if (migrate_postcopy_ram()) {
>             /*
>              * If postcopy is enabled, we must fault in the page because
>              * XXX (please fill in..).  Here we don't necessarily need to
>              * zero the whole page because we know it must be pre-filled
>              * with zeros anyway.
>              */
>             *(uint8_t *)page = 0;
>
> We could also use MADV_POPULATE_WRITE but not sure which one is faster, and
> this might still be easier to follow anyway..
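
* To make sure I read the quoted logic correctly, here is a minimal
  standalone sketch of it, outside QEMU. Everything in it is an assumption
  for illustration: 'struct toy_block', 'toy_recv_zero_page()' and
  TOY_PAGE_SIZE are hypothetical names, the 'receivedmap' is modelled as a
  plain bool array, and the buffer is ordinary memory, so the one-byte
  write only stands in for the access that would fault the page in on a
  real destination.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_PAGE_SIZE 4096u    /* assumption: 4KB pages */

/* Hypothetical stand-in for a RAM block and its 'receivedmap'. */
struct toy_block {
    uint8_t *host;     /* start of the block's memory */
    bool *received;    /* one flag per page, models the receivedmap */
    size_t npages;
};

/* Handle one zero-page entry for page 'idx', following the quoted logic. */
static void toy_recv_zero_page(struct toy_block *b, size_t idx,
                               bool postcopy_enabled)
{
    uint8_t *page;
    bool was_received;

    assert(b != NULL && idx < b->npages);
    page = b->host + idx * TOY_PAGE_SIZE;

    was_received = b->received[idx];
    b->received[idx] = true;

    if (was_received) {
        /* An older version of the page exists: clear the whole page. */
        memset(page, 0, TOY_PAGE_SIZE);
    } else if (postcopy_enabled) {
        /*
         * Never received: touch a single byte only. The page is assumed
         * to be pre-filled with zeros, so no full memset is done here.
         */
        *(volatile uint8_t *)page = 0;
    }
}
```

  With this, a page that was received earlier is cleared in full, while a
  page seen for the first time gets only the single-byte write, and only
  when the postcopy flag is set.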

* Not sure how this is to work. During the Multifd phase (Postcopy not yet
  running), when migrate_postcopy_ram() returns true, are we to raise a
  page fault here?

* Could we zero-initialise the destination guest memory when migration
  starts, and not migrate the zero pages from the source at all? i.e. we
  mark the page received in the 'receivedmap' as is done now, but a page
  fault should also not happen for that guest address, because the memory
  was already zero-initialised at the beginning.

  I think there might be some scope to send zero-page entries piggy-backed
  with non-zero pages, whose contents are migrated anyway.

* Say there are 10 pages (4KB each, Total: 40KB). Of these 10 pages:

      Non-zero pages: 1, 2, 4, 7, 9, 10
      Zero pages:     3, 5-6 and 8

* We only migrate/send non-zero pages from the source to the destination.
  When non-zero page-4 is migrated, an entry/hint of page-3 being a zero
  page is piggy-backed with it. When non-zero page-7 is sent, an entry/hint
  of pages 5-6 being zero pages is sent with it. Similarly, a hint of
  page-8 being a zero page is sent along with page-9.

(thinking aloud)

Thank you.
---
  - Prasad
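
P.S. A rough sketch of the piggy-back idea, only to make the 10-page
example concrete. It is not based on any existing QEMU structure:
'struct page_hint' and 'build_hints()' are hypothetical names, pages are
numbered from 1 as in the example, and each non-zero page carries at most
one run of zero pages, namely the run immediately before it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-page header: a non-zero page plus a zero-page hint. */
struct page_hint {
    size_t page;        /* 1-based number of the non-zero page being sent */
    size_t zero_first;  /* first page of the preceding zero run, 0 if none */
    size_t zero_count;  /* number of zero pages in that run */
};

/*
 * is_zero[i] describes page i + 1. Emits one hint per non-zero page into
 * 'out' (which must have room for 'npages' entries) and returns how many
 * were written. Zero pages after the last non-zero page have no carrier;
 * their count is reported through 'trailing_zeros'.
 */
static size_t build_hints(const bool *is_zero, size_t npages,
                          struct page_hint *out, size_t *trailing_zeros)
{
    size_t n = 0, run_start = 0, run_len = 0;

    assert(is_zero != NULL && out != NULL && trailing_zeros != NULL);

    for (size_t i = 0; i < npages; i++) {
        size_t page = i + 1;

        if (is_zero[i]) {
            /* Extend (or start) the current run of zero pages. */
            if (run_len == 0) {
                run_start = page;
            }
            run_len++;
            continue;
        }

        /* Non-zero page: attach the zero run collected so far, if any. */
        out[n].page = page;
        out[n].zero_first = run_len ? run_start : 0;
        out[n].zero_count = run_len;
        n++;
        run_len = 0;
    }

    *trailing_zeros = run_len;
    return n;
}
```

For the example above this gives six entries: page-4 carries the run
starting at page-3 (length 1), page-7 carries pages 5-6 (length 2), and
page-9 carries page-8 (length 1). One gap the sketch makes visible: zero
pages after the last non-zero page have nothing to ride on, so they would
need a separate final hint.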