On Wed, Jul 21, 2021 at 11:27:58AM +0200, David Hildenbrand wrote:
> Currently, when someone (i.e., the VM) accesses discarded parts inside a
> RAMBlock with a RamDiscardManager managing the corresponding mapped memory
> region, postcopy will request migration of the corresponding page from the
> source. The source, however, will never answer, because it refuses to
> migrate such pages with undefined content ("logically unplugged"): the
> pages are never dirty, and get_queued_page() will consequently skip
> processing these postcopy requests.
>
> Especially reading discarded ("logically unplugged") ranges is supposed to
> work in some setups (for example with current virtio-mem), although it
> barely ever happens: still, not placing a page would currently stall the
> VM, as it cannot make forward progress.
>
> Let's check the state via the RamDiscardManager (the state e.g., of
> virtio-mem is migrated during precopy) and avoid sending a request
> that will never get answered. Place a fresh zero page instead to keep
> the VM working. This is the same behavior that would happen
> automatically without userfaultfd being active, when accessing virtual
> memory regions without populated pages -- "populate on demand".
>
> Signed-off-by: David Hildenbrand <da...@redhat.com>
> ---
>  migration/postcopy-ram.c | 25 +++++++++++++++++++++----
>  migration/ram.c          | 23 +++++++++++++++++++++++
>  migration/ram.h          |  1 +
>  3 files changed, 45 insertions(+), 4 deletions(-)
>
> diff --git a/migration/postcopy-ram.c b/migration/postcopy-ram.c
> index 2e9697bdd2..673f60ce2b 100644
> --- a/migration/postcopy-ram.c
> +++ b/migration/postcopy-ram.c
> @@ -671,6 +671,23 @@ int postcopy_wake_shared(struct PostCopyFD *pcfd,
>      return ret;
>  }
>  
> +static int postcopy_request_page(MigrationIncomingState *mis, RAMBlock *rb,
> +                                 ram_addr_t start, uint64_t haddr)
> +{
> +    /*
> +     * Discarded pages (via RamDiscardManager) are never migrated. On unlikely
> +     * access, place a zeropage, which will also set the relevant bits in the
> +     * recv_bitmap accordingly, so we won't try placing a zeropage twice.
> +     */
> +    if (ramblock_page_is_discarded(rb, start)) {
> +        bool received = ramblock_recv_bitmap_test_byte_offset(rb, start);

Will "received" ever be set with the current code base?  I thought
virtio-mem forbids plug/unplug during the whole lifecycle of migration.

> +
> +        return received ? 0 : postcopy_place_page_zero(mis, (void *)haddr, rb);

(Now we can fill in pages from two threads... but it looks thread-safe.)

Meanwhile, if such accesses are highly unwanted, maybe it's worth an
error_report_once() so the admin could see something?

> +    }
> +
> +    return migrate_send_rp_req_pages(mis, rb, start, haddr);
> +}
> +
>  /*
>   * Callback from shared fault handlers to ask for a page,
>   * the page must be specified by a RAMBlock and an offset in that rb
> @@ -690,7 +707,7 @@ int postcopy_request_shared_page(struct PostCopyFD *pcfd, RAMBlock *rb,
>                                          qemu_ram_get_idstr(rb), rb_offset);
>          return postcopy_wake_shared(pcfd, client_addr, rb);
>      }
> -    migrate_send_rp_req_pages(mis, rb, aligned_rbo, client_addr);
> +    postcopy_request_page(mis, rb, aligned_rbo, client_addr);
>      return 0;
>  }
>  
> @@ -984,8 +1001,8 @@ retry:
>               * Send the request to the source - we want to request one
>               * of our host page sizes (which is >= TPS)
>               */
> -            ret = migrate_send_rp_req_pages(mis, rb, rb_offset,
> -                                            msg.arg.pagefault.address);
> +            ret = postcopy_request_page(mis, rb, rb_offset,
> +                                        msg.arg.pagefault.address);
>              if (ret) {
>                  /* May be network failure, try to wait for recovery */
>                  if (ret == -EIO && postcopy_pause_fault_thread(mis)) {
>                      /* We got reconnected somehow, try to continue */
>                      goto retry;
>                  } else {
>                      /* This is a unavoidable fault */
> -                    error_report("%s: migrate_send_rp_req_pages() get %d",
> +                    error_report("%s: postcopy_request_page() get %d",
>                                   __func__, ret);
>                      break;
>                  }
> diff --git a/migration/ram.c b/migration/ram.c
> index 4bf74ae2e1..d7505f5368 100644
> --- a/migration/ram.c
> +++ b/migration/ram.c
> @@ -876,6 +876,29 @@ static uint64_t ramblock_dirty_bitmap_exclude_discarded_pages(RAMBlock *rb)
>      return cleared_bits;
>  }
>  
> +/*
> + * Check if a page falls into a discarded range as managed by a
> + * RamDiscardManager responsible for the mapped memory region of the RAMBlock.
> + * Pages inside discarded ranges are never migrated and postcopy has to
> + * place zeropages instead.
> + *
> + * Note: The result is only stable while migration (precopy/postcopy).
> + */
> +bool ramblock_page_is_discarded(RAMBlock *rb, ram_addr_t offset)
> +{
> +    if (rb->mr && memory_region_has_ram_discard_manager(rb->mr)) {
> +        RamDiscardManager *rdm = memory_region_get_ram_discard_manager(rb->mr);
> +        MemoryRegionSection section = {
> +            .mr = rb->mr,
> +            .offset_within_region = offset,
> +            .size = int128_get64(TARGET_PAGE_SIZE),

rb->page_size?  I think it should also work with TARGET_PAGE_SIZE in this
specific case, but maybe it's still better to use what we have.

> +        };
> +
> +        return !ram_discard_manager_is_populated(rdm, &section);
> +    }
> +    return false;
> +}
> +
>  /* Called with RCU critical section */
>  static void ramblock_sync_dirty_bitmap(RAMState *rs, RAMBlock *rb)
>  {
> diff --git a/migration/ram.h b/migration/ram.h
> index 4833e9fd5b..6e7f12e556 100644
> --- a/migration/ram.h
> +++ b/migration/ram.h
> @@ -72,6 +72,7 @@ void ramblock_recv_bitmap_set_range(RAMBlock *rb, void *host_addr, size_t nr);
>  int64_t ramblock_recv_bitmap_send(QEMUFile *file,
>                                    const char *block_name);
>  int ram_dirty_bitmap_reload(MigrationState *s, RAMBlock *rb);
> +bool ramblock_page_is_discarded(RAMBlock *rb, ram_addr_t offset);
>
>  /* ram cache */
>  int colo_init_ram_cache(void);
> --
> 2.31.1
>

--
Peter Xu
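
To make the error_report_once() suggestion above a bit more concrete, here is
a rough, untested sketch of how postcopy_request_page() could look with such
a warning added -- the message wording and the exact place of the call are
only an example, not a requirement:

static int postcopy_request_page(MigrationIncomingState *mis, RAMBlock *rb,
                                 ram_addr_t start, uint64_t haddr)
{
    /*
     * Discarded pages (via RamDiscardManager) are never migrated. On unlikely
     * access, place a zeropage, which will also set the relevant bits in the
     * recv_bitmap accordingly, so we won't try placing a zeropage twice.
     */
    if (ramblock_page_is_discarded(rb, start)) {
        bool received = ramblock_recv_bitmap_test_byte_offset(rb, start);

        /* Example only: make the unexpected access visible to the admin */
        error_report_once("%s: placing zeropage for discarded page in RAMBlock %s",
                          __func__, qemu_ram_get_idstr(rb));

        return received ? 0 : postcopy_place_page_zero(mis, (void *)haddr, rb);
    }

    return migrate_send_rp_req_pages(mis, rb, start, haddr);
}

error_report_once() only fires on the first occurrence, which should be enough
to make such accesses visible without spamming the log.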