On Tue, Jan 24, 2023 at 04:36:20PM -0500, Peter Xu wrote: > On Tue, Jan 24, 2023 at 01:20:37PM +0000, Dr. David Alan Gilbert wrote: > > > @@ -3970,7 +3984,8 @@ int ram_load_postcopy(QEMUFile *f, int channel) > > > break; > > > } > > > tmp_page->target_pages++; > > > - matches_target_page_size = block->page_size == > > > TARGET_PAGE_SIZE; > > > + matches_target_page_size = > > > + migration_ram_pagesize(block) == TARGET_PAGE_SIZE; > > > /* > > > * Postcopy requires that we place whole host pages > > > atomically; > > > * these may be huge pages for RAMBlocks that are backed by > > > > Hmm do you really want this change? > > Yes that's intended. I want to reuse the same logic here when receiving > small pages from huge pages, just like when we're receiving small pages on > non-hugetlb mappings. > > matches_target_page_size majorly affects two things: > > 1) For a small zero page, whether we want to pre-set the page_buffer, or > simply use postcopy_place_page_zero(): > > case RAM_SAVE_FLAG_ZERO: > ch = qemu_get_byte(f); > /* > * Can skip to set page_buffer when > * this is a zero page and (block->page_size == TARGET_PAGE_SIZE). > */ > if (ch || !matches_target_page_size) { > memset(page_buffer, ch, TARGET_PAGE_SIZE); > } > > 2) For normal page, whether we need to use a page buffer or we can > directly reuse the page buffer in QEMUFile: > > if (!matches_target_page_size) { > /* For huge pages, we always use temporary buffer */ > qemu_get_buffer(f, page_buffer, TARGET_PAGE_SIZE); > } else { > /* > * For small pages that matches target page size, we > * avoid the qemu_file copy. Instead we directly use > * the buffer of QEMUFile to place the page. Note: we > * cannot do any QEMUFile operation before using that > * buffer to make sure the buffer is valid when > * placing the page. > */ > qemu_get_buffer_in_place(f, (uint8_t **)&place_source, > TARGET_PAGE_SIZE); > } > > Here: > > I want 1) to reuse postcopy_place_page_zero(). For the doublemap case, > it'll reuse postcopy_tmp_zero_page() (because qemu_ram_is_uf_zeroable() > will return false for such a ramblock). > > I want 2) to reuse qemu_get_buffer_in_place(), so we avoid a copy process > for the small page which is faster (even if it's hugetlb backed, now we can > reuse the qemufile buffer safely).
Since at it, one more thing worth mentioning is I didn't actually know whether the original code is always correct when target and host small psizes don't match.. This is the original line: matches_target_page_size = block->page_size == TARGET_PAGE_SIZE; The problem is we're comparing block page size against target page size, however block page size should be in host page size granule: RAMBlock *qemu_ram_alloc_internal() { new_block->page_size = qemu_real_host_page_size(); IOW, I am not sure whether postcopy will run at all in that case. For example, when we run an Alpha emulator upon x86_64, we can have target psize 8K while host psize 4K. The migration protocol should be TARGET_PAGE_SIZE based. It means, for postcopy when receiving a single page for Alpha VM being migrated, maybe we should call UFFDIO_COPY (or UFFDIO_CONTINUE; doesn't matter here) twice because one guest page contains two host pages. I'm not sure whether I get all these right.. if so, we have two options: a) Forbid postcopy as a whole when detecting qemu_real_host_page_size() != TARGET_PAGE_SIZE. b) Implement postcopy for that case I'd go with a) even if it's an issue because it means no one is migrating that thing in postcopy way in the past N years, so it justifies that maybe b) doesn't worth it. -- Peter Xu