Re: [PATCH 1/1] migration: skip poisoned memory pages on "ram saving" phase

William Roche Wed, 06 Sep 2023 14:30:55 -0700

On 9/6/23 17:16, Peter Xu wrote:


Just a note..

Probably fine for now to reuse block page size, but IIUC the right thing to
do is to fetch it from the signal info (in QEMU's sigbus_handler()) of
kernel_siginfo.si_addr_lsb.

At least for x86 I think that stores the "shift" of covered poisoned page
(one needs to track the Linux handling of VM_FAULT_HWPOISON_LARGE for a
huge page, though.. not aware of any man page for that).  It'll then work
naturally when Linux huge pages will start to support sub-huge-page-size
poisoning someday.  We can definitely leave that for later.


I totally agree with that!
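
For reference, a minimal sketch of what reading that "shift" could look like on Linux. This is not QEMU's sigbus_handler(); the names poison_range, poison_range_from(), sigbus_action() and install_sigbus_action() are hypothetical and only illustrate the idea. It assumes a Linux/glibc siginfo_t that exposes si_addr_lsb for BUS_MCEERR_AR / BUS_MCEERR_AO.

```c
#define _GNU_SOURCE
#include <signal.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical record of one reported memory error (not a QEMU type). */
struct poison_range {
    uintptr_t start;   /* start of the poisoned range, aligned to its size */
    size_t    len;     /* size of the range: 1 << si_addr_lsb */
};

/*
 * Turn a faulting address plus the shift reported in si_addr_lsb into
 * the range the kernel considers poisoned (4 KiB for lsb == 12,
 * 2 MiB for lsb == 21, and so on).
 */
struct poison_range poison_range_from(uintptr_t addr, unsigned lsb)
{
    struct poison_range r;

    r.len = (size_t)1 << lsb;
    r.start = addr & ~((uintptr_t)r.len - 1);
    return r;
}

static volatile sig_atomic_t poison_seen;
static struct poison_range last_poison;

/* SA_SIGINFO handler: only records the range, does nothing else. */
void sigbus_action(int sig, siginfo_t *si, void *ctx)
{
    (void)sig;
    (void)ctx;

    if (si->si_code == BUS_MCEERR_AR || si->si_code == BUS_MCEERR_AO) {
        last_poison = poison_range_from((uintptr_t)si->si_addr,
                                        (unsigned)si->si_addr_lsb);
        poison_seen = 1;
    }
}

/* Install the handler; returns 0 on success, -1 on error like sigaction(). */
int install_sigbus_action(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof(sa));
    sa.sa_sigaction = sigbus_action;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGBUS, &sa, NULL);
}
```

With a range recorded this way, the migration code would not need to assume the block page size, which is the point made above.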

--- a/migration/ram.c
+++ b/migration/ram.c
@@ -1145,7 +1145,8 @@ static int save_zero_page_to_file(PageSearchStatus *pss, QEMUFile *file,
      uint8_t *p = block->host + offset;
      int len = 0;

-    if (buffer_is_zero(p, TARGET_PAGE_SIZE)) {
+    if ((kvm_enabled() && kvm_hwpoisoned_page(block, (void *)offset)) ||


Can we move this out of zero page handling?  Zero detection is not
guaranteed to always be the 1st thing to do when processing a guest page.
Currently it'll already skip either rdma or when compression enabled, so
it'll keep crashing there.

Perhaps at the entry of ram_save_target_page_legacy()?
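
The ordering concern can be illustrated with a small stand-alone model. This is a sketch only, not code from migration/ram.c: page_ctx, page_action and classify_page() are hypothetical names, the `poisoned` flag stands in for the result of the patch's kvm_hwpoisoned_page(), and what to actually do with a skipped page is deliberately left open.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical outcome of inspecting one guest page. */
enum page_action {
    PAGE_SKIP_POISONED,     /* page contents must never be read */
    PAGE_SEND_COMPRESSED,   /* compression path, no zero detection */
    PAGE_SEND_ZERO,
    PAGE_SEND_NORMAL
};

/* Hypothetical per-page state (not a QEMU structure). */
struct page_ctx {
    const uint8_t *data;    /* page contents; unsafe to read if poisoned */
    size_t len;
    bool poisoned;          /* stand-in for kvm_hwpoisoned_page() */
    bool compression;       /* migration compression enabled */
};

static bool all_zero(const uint8_t *p, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        if (p[i] != 0) {
            return false;
        }
    }
    return true;
}

/*
 * The poison check comes first, so every later path (compression,
 * zero detection, plain copy) is reached only for readable pages.
 */
enum page_action classify_page(const struct page_ctx *pg)
{
    if (pg->poisoned) {
        return PAGE_SKIP_POISONED;
    }
    if (pg->compression) {
        return PAGE_SEND_COMPRESSED;
    }
    if (all_zero(pg->data, pg->len)) {
        return PAGE_SEND_ZERO;
    }
    return PAGE_SEND_NORMAL;
}
```

If the check lived inside zero detection instead, the compression branch in this model would be taken before it and the poisoned page would still be read.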

Right, as expected, using migration compression with poisoned pages
crashes even with this fix...


The difficulty I see with placing the poisoned page verification at the
entry of ram_save_target_page_legacy() is what to do to skip the
poisoned page(s) found, if any?

Should I continue to treat them as zero pages written with
save_zero_page_to_file()? Or should I consider the case where
compression is in use and add new code that compresses an empty page
with save_compress_page()?


And what about an RDMA memory region impacted by a memory error?
This is an important aspect.

Does anyone know how this situation is dealt with? And how it should be
handled in QEMU?


--
Thanks,
William.
