On Fri, Nov 06, 2015 at 01:43:42PM +0000, Dr. David Alan Gilbert wrote:
> * Dr. David Alan Gilbert (dgilb...@redhat.com) wrote:
> > * Bharata B Rao (bharata....@gmail.com) wrote:
> > > On Fri, Nov 6, 2015 at 2:39 PM, Dr. David Alan Gilbert
> > > <dgilb...@redhat.com> wrote:
> > > > * Bharata B Rao (bhar...@linux.vnet.ibm.com) wrote:
> > > >> On Thu, Nov 05, 2015 at 06:10:27PM +0000, Dr. David Alan Gilbert (git)
> > > >> wrote:
> > > >> > From: "Dr. David Alan Gilbert" <dgilb...@redhat.com>
> > > >> >
> > > >> > This is the 9th cut of my version of postcopy.
> > > >> >
> > > >> > The userfaultfd linux kernel code is now in the upstream kernel
> > > >> > tree, and so 4.3 can be used without modification.
> > > >> >
> > > >> > This qemu series can be found at:
> > > >> > https://github.com/orbitfp7/qemu.git
> > > >> > on the wp3-postcopy-v9 tag
> > > >> >
> > > >> > Testing status:
> > > >> > * Tested heavily on x86
> > > >> > * Smoke tested on aarch64 (so it does work on different page sizes)
> > > >>
> > > >> Tested minimally on ppc64 with back and forth postcopy migration of
> > > >> unloaded pseries guest within the localhost - works as expected.
> > > >>
> > > >> However I am seeing a failure in one case. I am not sure if this is
> > > >> a user error or a real issue in postcopy migration. If I switch to
> > > >> postcopy migration immediately after starting the migration, I see
> > > >> the migration failing with error:
> > > >>
> > > >> qemu-system-ppc64: qemu_savevm_send_packaged: Unreasonably large
> > > >> packaged state: 25905005
> > > >
> > > > I put an arbitrary limit of 16MB (see MAX_VM_CMD_PACKAGED_SIZE in
> > > > include/sysemu/sysemu.h) on the size of the data accepted into the
> > > > packaged blob. How big is the htab data likely to be?
> > >
> > > HTAB size is a variable and depends on maxmem size. It will be 1/128
> > > th of maxmem. So for a 32G guest, HTAB will be 256M in size.
> >
> > OK, that does get a bit big.
> > Two possible fixes;
> >   1 - postcopy htab (I don't know htab to know how hard that is)
> >   2 - do one pass of iterable/non-postcopiable devices before we start the
> >       package;
> > I'm just writing a patch to try that; I'll send it to you to let
> > you try once I get it to not-break normal migration.
> >
>
> Hi Bharata,
>   Can you try the patch below and let me know if it solves the problem;
> if it doesn't, I'd be interested to know when the HTAB routines get
> called in the precopy/postcopy phases.
>
> Dave
>
> From 0f965d4dec7b188aec5324c3350704f993517cc8 Mon Sep 17 00:00:00 2001
> From: "Dr. David Alan Gilbert" <dgilb...@redhat.com>
> Date: Fri, 6 Nov 2015 12:06:16 +0000
> Subject: [PATCH] Finish non-postcopiable iterative devices before package
>
> Where we have iterable, but non-postcopiable devices (e.g. htab
> or block migration), complete them before forming the 'package'
> but with the CPUs stopped. This stops them filling up the package.
That helps, and the migration now succeeds when I switch to postcopy
immediately after starting the migration.

However, after postcopy migration, when I attempt to start an incoming
instance again to migrate the guest back, I see this failure:

qemu-system-ppc64: cannot set up guest memory 'ppc_spapr.ram': Cannot allocate memory

The same doesn't happen with normal migration. I will see if I can debug
this more tomorrow.

Regards,
Bharata.
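
For reference, the sizes discussed in the quoted thread can be checked with a little arithmetic. The following is only a rough sketch, not QEMU code: the function and constant names are hypothetical, and it assumes "16MB", "32G" and "256M" are binary units (MiB/GiB). Only the 16MB cap, the 1/128 ratio and the 25905005-byte figure come from the thread.

```python
# Illustrative sizing check; all names here are hypothetical, not QEMU's.
MIB = 1024 * 1024
GIB = 1024 * MIB

# The 16MB cap on the packaged blob mentioned in the thread
# (assumed here to mean 16 MiB).
PACKAGED_SIZE_CAP = 16 * MIB


def htab_size(maxmem_bytes: int) -> int:
    """HTAB size as 1/128th of maxmem, as stated in the thread."""
    return maxmem_bytes // 128


def fits_in_package(state_bytes: int) -> bool:
    """True if a blob of this size would stay within the cap."""
    return state_bytes <= PACKAGED_SIZE_CAP


if __name__ == "__main__":
    # HTAB size in MiB for a 32G guest.
    print(htab_size(32 * GIB) // MIB)
    # Whether the reported 25905005-byte packaged state fits the cap.
    print(fits_in_package(25905005))
    # Whether a 32G guest's HTAB alone would fit the cap.
    print(fits_in_package(htab_size(32 * GIB)))
```

Under these assumptions, any guest whose maxmem exceeds 2 GiB has an HTAB larger than the cap on its own, which is consistent with the patch completing such devices before the package is formed.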