* Peter Xu (pet...@redhat.com) wrote:
> v6:
> - fix page mask to use ramblock psize [Dave]
>
> v5:
> - added one test patch for easier debugging for migration-test
> - added one fix patch [1] for another postcopy race
> - fixed a bug that could trigger when host/guest page size differs
>
> v4:
> - use "void */ulong" instead of "uint64_t" where proper in patch 3/4 [Dave]
>
> v3:
> - fix build on 32bit hosts & rebase
> - remove r-bs for the last 2 patches for Dave due to the changes
>
> v2:
> - add r-bs for Dave
> - add patch "migration: Properly destroy variables on incoming side" as patch 1
> - destroy page_request_mutex in migration_incoming_state_destroy() too [Dave]
> - use WITH_QEMU_LOCK_GUARD in two places where we can [Dave]
>
> We've seen conditional guest hangs on destination VM after postcopy recovered.
> However the hang will resolve itself after a few minutes.
>
> The problem is: after a postcopy recovery, the prioritized postcopy queue on
> the source VM is actually missing.  So all the faulted threads before the
> postcopy recovery happened will keep halted until (accidentally) the page got
> copied by the background precopy migration stream.
>
> The solution is to also refresh this information after postcopy recovery.  To
> achieve this, we need to maintain a list of faulted addresses on the
> destination node, so that we can resend the list when necessary.  This work is
> done via patch 2-5.
>
> With that, the last thing we need to do is to send this extra information to
> source VM after recovered.  Very luckily, this synchronization can be
> "emulated" by sending a bunch of page requests (although these pages have been
> sent previously!) to source VM just like when we've got a page fault.  Even in
> the 1st version of the postcopy code we'll handle duplicated pages well.  So
> this fix does not even need a new capability bit and it'll work smoothly on
> old QEMUs when we migrate from them to the new QEMUs.
>
> Please review, thanks.

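For anyone reading along, the bookkeeping described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from the series: `FaultTracker`, `tracker_fault()`, `tracker_page_received()`, `tracker_resync()`, `count_request()` and the fixed-size array are all hypothetical names and choices made up for the sketch, and the real patches keep their own data structure under `page_request_mutex`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_PENDING 64 /* hypothetical cap, for this sketch only */

/* Hypothetical stand-in for "send one page request to the source VM". */
typedef void (*send_req_fn)(uintptr_t page_addr, void *opaque);

typedef struct {
    uintptr_t pages[MAX_PENDING]; /* page-aligned addresses still waiting */
    size_t count;
    uintptr_t page_mask;          /* ~(page_size - 1) */
} FaultTracker;

static void tracker_init(FaultTracker *t, size_t page_size)
{
    /* page_size must be a non-zero power of two */
    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
    t->count = 0;
    t->page_mask = ~((uintptr_t)page_size - 1);
}

/* Returns the index of 'page', or t->count if it is not tracked. */
static size_t tracker_find(const FaultTracker *t, uintptr_t page)
{
    size_t i;
    for (i = 0; i < t->count; i++) {
        if (t->pages[i] == page) {
            break;
        }
    }
    return i;
}

/* Fault path: remember the page and request it once. */
static void tracker_fault(FaultTracker *t, uintptr_t addr,
                          send_req_fn send, void *opaque)
{
    uintptr_t page = addr & t->page_mask;

    if (tracker_find(t, page) != t->count) {
        return; /* already requested, nothing to do */
    }
    assert(t->count < MAX_PENDING); /* sketch: no overflow handling */
    t->pages[t->count++] = page;
    send(page, opaque);
}

/* Page arrived from the source: stop tracking it. */
static void tracker_page_received(FaultTracker *t, uintptr_t addr)
{
    size_t i = tracker_find(t, addr & t->page_mask);

    if (i != t->count) {
        t->pages[i] = t->pages[--t->count]; /* swap-remove */
    }
}

/* After recovery: resend every request that is still outstanding. */
static size_t tracker_resync(const FaultTracker *t,
                             send_req_fn send, void *opaque)
{
    for (size_t i = 0; i < t->count; i++) {
        send(t->pages[i], opaque);
    }
    return t->count;
}

/* Example callback: only counts how many requests were sent. */
static void count_request(uintptr_t page_addr, void *opaque)
{
    (void)page_addr;
    (*(int *)opaque)++;
}
```

The resync step sends nothing new on the wire: it replays ordinary page requests, which is why the cover letter can say no new capability bit is needed as long as the source tolerates duplicated requests.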
Queued

Dave

>
> Peter Xu (6):
>   migration: Pass incoming state into qemu_ufd_copy_ioctl()
>   migration: Introduce migrate_send_rp_message_req_pages()
>   migration: Maintain postcopy faulted addresses
>   migration: Sync requested pages after postcopy recovery
>   migration/postcopy: Release fd before going into 'postcopy-pause'
>   migration-test: Only hide error if !QTEST_LOG
>
>  migration/migration.c        | 55 ++++++++++++++++++++++++++++++----
>  migration/migration.h        | 21 ++++++++++++-
>  migration/postcopy-ram.c     | 25 ++++++++++++----
>  migration/savevm.c           | 57 ++++++++++++++++++++++++++++++++++++
>  migration/trace-events       |  3 ++
>  tests/qtest/migration-test.c |  6 +++-
>  6 files changed, 154 insertions(+), 13 deletions(-)
>
> --
> 2.26.2
>
>
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK