* Stefan Hajnoczi (stefa...@gmail.com) wrote:
> On Tue, Feb 11, 2014 at 07:30:54PM +0100, Stefan Priebe wrote:
> > On 11.02.2014 16:44, Stefan Hajnoczi wrote:
> > >On Tue, Feb 11, 2014 at 3:54 PM, Stefan Priebe - Profihost AG
> > ><s.pri...@profihost.ag> wrote:
> > >>in the past (Qemu 1.5) a migration failed if there was not enough memory
> > >>on the target host available directly at the beginning.
> > >>
> > >>Now with Qemu 1.7 i've seen succeeded migrations but the kernel OOM
> > >>memory killer killing qemu processes. So the migration seems to take
> > >>place without having enough memory on the target machine?
> > >
> > >How much memory is the guest configured with? How much memory does
> > >the host have?
> >
> > Guest: 48GB
> > Host: 192GB
> >
> > >I wonder if there are zero pages that can be migrated almost "for
> > >free" and the destination host doesn't touch. When they are touched
> > >for the first time after migration handover, they need to be allocated
> > >on the destination host. This can lead to OOM if you overcommitted
> > >memory.
> >
> > In the past the migration failed immediately with exit code 255.
> >
> > >Can you reproduce the OOM reliably? It should be possible to debug it
> > >and figure out whether it's just bad luck or a true regression.
> >
> > So there is no known patch changing this behaviour?
> >
> > What about those?
> > fc1c4a5d32e15a4c40c47945da85ef9c1e0c1b54
> > 211ea74022f51164a7729030b28eec90b6c99a08
> > f1c72795af573b24a7da5eb52375c9aba8a37972
>
> Yes, that's what I was referring to when I mentioned zero pages.
>
> The problem might just be that the destination host didn't have enough
> free memory. Migration succeeded due to memory overcommit on the host,
> but quickly ran out of memory after handover. The quick answer there is
> to reconsider your overcommitting memory and also checking memory
> availability before live migrating.
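
As a rough sketch of that last point (checking memory availability on the
destination before live migrating), something like the following could be run
on the target host. It is not a QEMU interface; the function names and the
2 GiB headroom are assumptions made for illustration. It only reads
/proc/meminfo, and the MemAvailable field is only present on newer kernels, so
the fallback of MemFree + Buffers + Cached is a crude estimate.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of field name -> value.

    Values are in KiB for the fields that carry a 'kB' unit.
    """
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # not a "Name: value" line
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def available_kib(fields):
    """Estimate memory that can be used without pushing the host into OOM."""
    if "MemAvailable" in fields:
        return fields["MemAvailable"]
    # Assumption: on kernels without MemAvailable, approximate it.
    return (fields.get("MemFree", 0)
            + fields.get("Buffers", 0)
            + fields.get("Cached", 0))


def can_accept_guest(fields, guest_gib, headroom_gib=2):
    """True if the host appears to have room for the guest's full RAM.

    headroom_gib is an arbitrary safety margin (hypothetical value) for
    the QEMU process itself and other host activity.
    """
    needed_kib = (guest_gib + headroom_gib) * 1024 * 1024
    return available_kib(fields) >= needed_kib


def read_host_meminfo(path="/proc/meminfo"):
    """Read and parse the meminfo file of the host this runs on."""
    with open(path) as f:
        return parse_meminfo(f.read())
```

For the 48GB guest above, can_accept_guest(read_host_meminfo(), 48) on the
destination would answer whether the whole guest fits even if every zero page
is eventually touched, which is the case that overcommit hides at migration
time.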

When you said 'in the past (Qemu 1.5)', is that the actual 1.5 release?
I ask because a bit of bisecting leads me to 7dda5dc82a776a39a799
'migration: initialize RAM to zero' (16th April 2013, slightly before
1.5.0 time), although I think its effect may be just to make these
previous changes have the effect they were intended to. So if the
behaviour you're seeing changed between 1.5 and 1.7 then it's something
else.

I think one of the ways to think about it is that previously you could
start a guest on a host relying on overcommit (although it might OOM),
but you were unlikely to be able to migrate it in to a host with
overcommit because the incoming migration would write all its zero pages.

However, if you're seeing the difference between a 1.5 release and 1.7
then maybe it's something more subtle.

Dave
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK