2015-11-27 2:19 GMT+08:00 Daniel P. Berrange <berra...@redhat.com>:
> On Thu, Nov 26, 2015 at 05:39:04PM +0000, Daniel P. Berrange wrote:
> > On Thu, Nov 26, 2015 at 11:55:31PM +0800, 少合冯 wrote:
> > > 3. dynamically choose when to activate xbzrle compress for live
> > >    migration.
> > >    This is the best.
> > >    xbzrle really wants to be used if the network is not able to keep
> > >    up with the dirtying rate of the guest RAM.
> > >    But how do I check whether the coming migration fits this situation?
> >
> > FWIW, if we decide we want compression support in Nova, I think that
> > having the Nova libvirt driver dynamically decide when to use it is
> > the only viable approach. Unfortunately the way the QEMU support
> > is implemented makes it very hard to use, as QEMU forces you to decide
> > to use it upfront, at a time when you don't have any useful information
> > on which to make the decision :-( To be useful IMHO, we really need
> > the ability to turn on compression on the fly for an existing active
> > migration process. ie, we'd start migration off, let it run, and
> > only enable compression if we encounter problems with completion.
> > Sadly we can't do this with QEMU as it stands today :-(

[Shaohe Feng] Adding more people working on kernel/hypervisor to our loop.
I wonder whether there will be any good solution to improve this in QEMU
in the future.
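In the meantime, one rough way to answer the "how do I check" question
above is to compare the guest's page dirtying rate against the transfer
rate the running migration job actually achieves. A minimal sketch in
Python, assuming libvirt-python is new enough to report
"memory_dirty_rate" in virDomainGetJobStats() and that guest pages are
4 KiB (the domain name is hypothetical):

    import libvirt

    PAGE_SIZE = 4096  # assumed guest page size in bytes

    def network_keeping_up(dom):
        # jobStats() wraps virDomainGetJobStats(); both fields below are
        # only reported by newer libvirt, hence the defensive .get().
        stats = dom.jobStats()
        dirty_bps = stats.get('memory_dirty_rate', 0) * PAGE_SIZE  # pages/s -> bytes/s
        xfer_bps = stats.get('memory_bps', 0)                      # bytes/s on the wire
        return xfer_bps >= dirty_bps

    conn = libvirt.open('qemu:///system')
    dom = conn.lookupByName('instance-00000001')  # hypothetical domain
    if not network_keeping_up(dom):
        print('dirty rate exceeds transfer rate; xbzrle would help here')

If the dirty rate consistently exceeds the transfer rate, the migration
will never converge without compression (or throttling), which is exactly
the case where xbzrle pays off.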
> > Oh and of course we still need to address the issue of RAM usage and
> > communicating that need with the scheduler in order to avoid OOM
> > scenarios due to a large compression cache.
> >
> > I tend to feel that the QEMU compression code is currently broken by
> > design and needs rework in QEMU before it can be practically used in
> > an autonomous fashion :-(
>
> Actually thinking about it, there's not really any significant
> difference between Option 1 and Option 3. In both cases we want
> a nova.conf setting live_migration_compression=on|off to control
> whether we want to *permit* use of compression.
>
> The only real difference between 1 & 3 is whether migration has
> compression enabled always, or whether we turn it on part way
> through migration.
>
> So although option 3 is our desired approach (which we can't
> actually implement due to QEMU limitations), option 1 could
> be made fairly similar if we start off with a very small
> compression cache size, which would have the effect of more or
> less disabling compression initially.
>
> We already have logic in the code for dynamically increasing
> the max downtime value, which we could mirror here,
>
> eg something like
>
>   live_migration_compression=on|off
>
>     - Whether to enable use of compression
>
>   live_migration_compression_cache_ratio=0.8
>
>     - The maximum size of the compression cache relative to
>       the guest RAM size. Must be less than 1.0.
>
>   live_migration_compression_cache_steps=10
>
>     - The number of steps to take to get from the initial cache
>       size to the maximum cache size.
>
>   live_migration_compression_cache_delay=75
>
>     - The time delay in seconds between increases in cache size.
>
> In the same way that we do with migration downtime, instead of
> increasing the cache size linearly, we'd increase it in ever larger
> steps until we hit the maximum. So we'd start off fairly small, at
> a few MB, and, monitoring the cache hit rates, increase it
> periodically. If the number of steps configured and the time delay
> between steps are reasonably large, most migrations would have a
> fairly small cache and would complete without needing much
> compression overhead.
>
> Doing this though, we still need a solution to the host OOM scenario
> problem. We can't simply check free RAM at the start of migration and
> see if there's enough to spare for the compression cache, as the
> scheduler can spawn a new guest on the compute host at any time,
> pushing us into OOM. We really need some way to indicate that there is
> a (potentially very large) extra RAM overhead for the guest during
> migration.
>
> ie if live_migration_compression_cache_ratio is 0.8 and we have a
> 4 GB guest, we need to make sure the scheduler knows that we are
> potentially going to be using 7.2 GB of memory during migration.

[Shaohe Feng] These suggestions sound good. Thank you, Daniel.

One more factor we may need to consider: XBZRLE compression only kicks in
after the bulk stage, so during the bulk stage we can measure the transfer
rate. If that rate is below a certain threshold, we could start with a
bigger cache size.

> Regards,
> Daniel
> --
> |: http://berrange.com       -o- http://www.flickr.com/photos/dberrange/ :|
> |: http://libvirt.org        -o- http://virt-manager.org                 :|
> |: http://autobuild.org      -o- http://search.cpan.org/~danberr/        :|
> |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc           :|

BR,
Shaohe Feng
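P.S. To make the stepped cache growth concrete, here is a minimal sketch
(not actual Nova code: the parameter names mirror the nova.conf options
proposed above, and the doubling curve is borrowed from the existing
downtime-step idea, so it is an assumption):

    # Sketch: grow the xbzrle cache in exponentially increasing steps,
    # from a few MB up to ratio * guest RAM, one step every `delay` seconds.

    def cache_size_steps(guest_ram_bytes, ratio=0.8, steps=10, delay=75):
        """Yield (time_offset_seconds, cache_size_bytes) pairs."""
        max_cache = int(guest_ram_bytes * ratio)
        base = max_cache / float(2 ** steps)  # initial cache, ~3 MiB for a 4 GiB guest
        for i in range(steps + 1):
            yield i * delay, min(max_cache, int(base * (2 ** i)))

    # For a 4 GiB guest the cache tops out at ~3.2 GiB, which is where
    # the 7.2 GB figure for the scheduler comes from.
    for t, size in cache_size_steps(4 * 1024 ** 3):
        print("t=%4ds cache=%4d MiB" % (t, size // 1024 ** 2))

Each step would then presumably be applied to the running migration via
libvirt's virDomainMigrateSetCompressionCache(), the same way the downtime
steps are applied today.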