Re: [PATCH v1 2/2] virtio-balloon: disallow postcopy with VIRTIO_BALLOON_F_FREE_PAGE_HINT

David Hildenbrand Thu, 08 Jul 2021 01:19:44 -0700

On 07.07.21 21:57, Michael S. Tsirkin wrote:

On Wed, Jul 07, 2021 at 09:47:31PM +0200, David Hildenbrand wrote:

On 07.07.21 21:19, Michael S. Tsirkin wrote:

On Wed, Jul 07, 2021 at 09:14:00PM +0200, David Hildenbrand wrote:

On 07.07.21 21:05, Michael S. Tsirkin wrote:

On Wed, Jul 07, 2021 at 04:06:55PM +0200, David Hildenbrand wrote:

Postcopy never worked properly with 'free-page-hint=on', as there are
at least two issues:


1) With postcopy, the guest will never receive a VIRTIO_BALLOON_CMD_ID_DONE
      and consequently won't release free pages back to the OS once
      migration finishes.

      The issue is that for postcopy, we won't do a final bitmap sync while
      the guest is stopped on the source and
      virtio_balloon_free_page_hint_notify() will only call
      virtio_balloon_free_page_done() on the source during
      PRECOPY_NOTIFY_CLEANUP, after the VM state was already migrated to
      the destination.

2) Once the VM touches a page on the destination that has been excluded
      from migration on the source via qemu_guest_free_page_hint() while
      postcopy is active, that thread will stall until postcopy finishes
      and all threads are woken up. (with older Linux kernels that won't
      retry faults when woken up via userfaultfd, we might actually get a
      SEGFAULT)

      The issue is that the source will refuse to migrate any pages that
      are not marked as dirty in the dirty bmap -- for example, because the
      page might just have been sent. Consequently, the faulting thread will
      stall, waiting for the page to be migrated -- which could take quite
      a while and result in guest OS issues.


OK so if source gets a request for a page which is not dirty
it does not respond immediately? Why not just teach it to
respond? It would seem that if destination wants a page we
should just give it to the destination ...


The source does not know if a page has already been sent (e.g., via the
background migration thread that moves all data over) vs. the page has not
been send because the page was hinted. This is the part where we'd need
additional tracking on the source to actually know that.

We must not send a page twice, otherwise bad things can happen when placing
pages that already have been migrated, because that scenario can easily
happen with ordinary postcopy (page has already been sent and we're dealing
with a stale request from the destination).


OK let me get this straight

A. source sends page
B. destination requests page
C. destination changes page
D. source sends page
E. destination overwrites page

this is what you are worried about right?


IIRC E. is with recent kernels:

E. placing the page fails with -EEXIST and postcopy migration fails

However, the man page (man ioctl_userfaultfd) doesn't describe what is
actually supposed to happen when double-placing. Could be that it's
"undefined behavior".

I did not try, though.


This is how it works today:

A. source sends page and marks it clean
B. destination requests page
C. destination receives page and places it
D. source ignores request as page is clean


If it's actually -EEXIST then we could just resend it
and teach destination to ignore -EEXIST errors right?


I think checking the received bitmap would be more robust.


Will actually make things a bit more robust: destination
handles its own consistency instead of relying on source.

TBH, I don't think having multiple copies of the same page in flight isneither a very good design, nor robust.

In an idea world, the destination would make sure to send a page onlyonce and the source would expect to receive a page only once. This iscurrently the case except for free page hinting, where a page might notbe sent as we're relying on the dirty bitmap to also track what has beenalready sent.

The destination does handle consistently right now by bailing out if itreceives the page twice (e.g., -EEXIST). In addition, we could consultthe received bitmap to make sure we really only receive stuff onceinstead of relying on undocumented userfaultfd behavior. On the source,we'd ideally have a "sent bitmap", but I'll really avoid introducing newbitmaps because that can't be the ultimate solution (dirty, clean,received, ...).

I just found the comment that describes the current design:migration/ram.c:get_queued_page()

"We're sending this page, and since it's postcopy nothing else willdirty it, and we must make sure it doesn't get sent again even if thisqueue request was received after the background search already sent it."


the fix is to mark page clean in A.
then in D to not send page if it's clean?

And the problem with hinting is this:

A. page is marked clean
B. destination requests page
C. destination changes page
D. source sends page <- does not happen, page is clean!
E. destination overwrites page


Simplified it's

A. page is marked clean by hinting code
B. destination requests page
D. source ignores request as page is clean
E. destination stalls until postcopy unregisters uffd


Some thoughts

1. We do have a a recv bitmap where we track received pages on the
destination (e.g., ramblock_recv_bitmap_test()), however we only use it to
avoid sending duplicate requests to the hypervisor AFAIKs, and don't check
it when placing pages.

2. Changing the migration behavior unconditionally on the source will break
migration to old QEMU binaries that cannot handle this change.


We can always make this depend on new machine types.


Yes, we could if we want to go down that path.

3. I think the current behavior is in place to make debugging easier. If
only a single instance of a page will ever be migrated from source to
destination, there cannot be silent data corruption. Further, we avoid
migrating unnecessarily pages twice.


Likely does not matter much for performance, it seems unlikely that
the race is all that common.

I cannot really tell, I'd guess it can happen but shouldn't happen toooften.

For now I'm going to keep it simple and ignore free page hints whilepostcopy has been enabled as suggested in the other discussion. Thatshould be the easy fix and if someone wants to optimize this scenarios,we can think about a proper way to make it fly (as I said, nobody seemedto really have cared in the past as it's mostly broken right now).


Thanks for the helpful discussion!

--
Thanks,

David / dhildenb

Re: [PATCH v1 2/2] virtio-balloon: disallow postcopy with VIRTIO_BALLOON_F_FREE_PAGE_HINT

Reply via email to