On 03/27/2016 10:16 PM, Jitendra Kolhe wrote:
> While measuring live migration performance for qemu/kvm guest, it
> was observed that the qemu doesn’t maintain any intelligence for the
> guest ram pages which are released by the guest balloon driver and
> treat such pages as any other normal guest ram pages. This has direct
> impact on overall migration time for the guest which has released
> (ballooned out) memory to the host.
>
> In case of large systems, where we can configure large guests with 1TB
> and with considerable amount of memory release by balloon driver to the,
> host the migration time gets worse.

s/the, host/the host,/

>
> The optimization gets temporarily disabled, if the balloon operation is

s/disabled,/disabled/

> in progress. Since the optimization skips scanning and migrating control
> information for ballooned out pages, we might skip guest ram pages in
> cases where the guest balloon driver has freed the ram page to the guest
> but not yet informed the host/qemu about the ram page
> (VIRTIO_BALLOON_F_MUST_TELL_HOST). In such case with optimization, we
> might skip migrating ram pages which the guest is using. Since this
> problem is specific to balloon leak, we can restrict balloon operation in
> progress check to only balloon leak operation in progress check.
>
> The optimization also get permanently disabled (for all subsequent

s/get/gets/

> migrations) in case any of the migration uses postcopy capability. In case
> of postcopy the balloon bitmap would be required to send after vm_stop,
> which has significant impact on the downtime. Moreover, the applications
> in the guest space won’t be actually faulting on the ram pages which are
> already ballooned out, the proposed optimization will not show any
> improvement in migration time during postcopy.
>
> Signed-off-by: Jitendra Kolhe <jitendra.ko...@hpe.com>
> ---
> Changed in v2:
>  - Resolved compilation issue for qemu-user binaries in exec.c
>  - Localize balloon bitmap test to save_zero_page().
>  - Updated version string for newly added migration capability to 2.7.
>  - Made minor modifications to patch commit text.

I'll leave the technical review to others.

> +++ b/qapi-schema.json
> @@ -544,11 +544,14 @@
>  #          been migrated, pulling the remaining pages along as needed. NOTE: If
>  #          the migration fails during postcopy the VM will fail. (since 2.6)
>  #
> +# @skip-balloon: Skip scanning ram pages released by virtio-balloon driver.
> +#          (since 2.7)
> +#
>  # Since: 1.2
>  ##
>  { 'enum': 'MigrationCapability',
>    'data': ['xbzrle', 'rdma-pin-all', 'auto-converge', 'zero-blocks',
> -           'compress', 'events', 'postcopy-ram'] }
> +           'compress', 'events', 'postcopy-ram', 'skip-balloon'] }

Does this flag make sense to always have enabled (in which case we don't
need it as a flag), or are there cases where we'd explicitly want to
disable it?

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org
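
For readers following the thread, the conditions described in the quoted commit message can be sketched roughly as below. This is only an illustration under stated assumptions: the struct, field, and function names are hypothetical and are not taken from the patch or from QEMU, and the fixed page count exists only to keep the example small. It shows the three gates the commit text describes (capability requested, no postcopy used so far, no balloon leak in progress) in front of a per-page bitmap test.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* All names below are hypothetical; they do not come from the patch. */
#define SKETCH_NR_PAGES 64
#define BITS_PER_WORD   (sizeof(unsigned long) * CHAR_BIT)
#define BITMAP_WORDS    ((SKETCH_NR_PAGES + BITS_PER_WORD - 1) / BITS_PER_WORD)

struct balloon_skip_state {
    bool capability_enabled; /* 'skip-balloon' capability was requested */
    bool postcopy_seen;      /* a migration used postcopy: disabled for good */
    bool leak_in_progress;   /* guest is reclaiming pages (balloon leak) */
    unsigned long bitmap[BITMAP_WORDS]; /* bit set = page is ballooned out */
};

/* Record that a page was ballooned out (true) or given back (false). */
void balloon_mark(struct balloon_skip_state *s, size_t page, bool ballooned)
{
    assert(page < SKETCH_NR_PAGES);
    unsigned long mask = 1UL << (page % BITS_PER_WORD);

    if (ballooned) {
        s->bitmap[page / BITS_PER_WORD] |= mask;
    } else {
        s->bitmap[page / BITS_PER_WORD] &= ~mask;
    }
}

/* Return true only when it is safe to skip migrating this page. */
bool balloon_should_skip(const struct balloon_skip_state *s, size_t page)
{
    assert(page < SKETCH_NR_PAGES);

    /* Any of these conditions turns the optimization off. */
    if (!s->capability_enabled || s->postcopy_seen || s->leak_in_progress) {
        return false;
    }
    return (s->bitmap[page / BITS_PER_WORD] >> (page % BITS_PER_WORD)) & 1UL;
}
```

In this sketch, clearing `leak_in_progress` re-enables skipping, which mirrors the "temporarily disabled" wording, while `postcopy_seen` is never cleared, which mirrors "permanently disabled (for all subsequent migrations)". Whether the first gate should exist at all is the question raised in the review above.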