* Wen Congyang (we...@cn.fujitsu.com) wrote:
> On 11/04/2015 05:05 PM, Dr. David Alan Gilbert wrote:
> > * Wen Congyang (we...@cn.fujitsu.com) wrote:
> >> On 11/03/2015 09:47 PM, Dr. David Alan Gilbert wrote:
> >>> * Juan Quintela (quint...@redhat.com) wrote:
> >>>> "Dr. David Alan Gilbert" <dgilb...@redhat.com> wrote:
> >>>>> Hi,
> >>>>>   I'm trying to understand why migration_bitmap_extend is correct/safe;
> >>>>> If I understand correctly, you're arguing that:
> >>>>>
> >>>>>   1) the migration_bitmap_mutex around the extend, stops any sync's
> >>>>>      happening and so no new bits will be set during the extend.
> >>>>>
> >>>>>   2) If migration sends a page and clears a bitmap entry, it doesn't
> >>>>>      matter if we lose the 'clear' because we're copying it as
> >>>>>      we extend it, because losing the clear just means the page
> >>>>>      gets resent, and so the data is OK.
> >>>>>
> >>>>> However, doesn't (2) mean that migration_dirty_pages might be wrong?
> >>>>> If a page was sent, the bit cleared, and migration_dirty_pages
> >>>>> decremented, then if we copy over that bitmap and 'set' that bit
> >>>>> again then migration_dirty_pages is too small; that means that
> >>>>> either migration would finish too early, or more likely,
> >>>>> migration_dirty_pages would wrap-around -ve and never finish.
> >>>>>
> >>>>> Is there a reason it's really safe?
> >>>>
> >>>> No.  It is reasonably safe.  Various values of reasonably.
> >>>>
> >>>> migration_dirty_pages should never arrive at values near zero.  Because
> >>>> we move to the completion stage way before it gets a value near zero.
> >>>> (We could have very, very bad luck, as in it is not safe).
> >>>
> >>> That's only true if we hit the qemu_file_rate_limit() in ram_save_iterate;
> >>> if we don't hit the rate limit (e.g. because we're CPU or network limited
> >>> to slower than the set limit) then I think ram_save_iterate will go all
> >>> the way to sending every page; if that happens it'll go once more
> >>> around the main migration loop, and call the pending routine, and now get
> >>> a -ve (very +ve) number of pending pages, so continuously do
> >>> ram_save_iterate again.
> >>>
> >>> We've had that type of bug before when we messed up the dirty-pages
> >>> calculation during hotplug.
> >>
> >> IIUC, migration_bitmap_extend() is called when migration is running, and
> >> we hotplug a device.
> >>
> >> In this case, I think we hold the iothread mutex when
> >> migration_bitmap_extend() is called.
> >>
> >> ram_save_complete() is also protected by the iothread mutex.
> >>
> >> So if migration_bitmap_extend() is called, the migration thread may be
> >> blocked in migration_completion() and wait it.
> >> qemu_savevm_state_complete() will be called after
> >> migration_completion() returns.
> >
> > But I don't think ram_save_iterate is protected by that lock, and my concern
> > is that the dirty-pages calculation is wrong during the iteration phase,
> > and then the iteration phase will never exit and never try and get to
> > ram_save_complete.
>
> Yes, the dirty-pages may be wrong. But it is smaller, not larger than the
> exact value.
> Why will the iteration phase never exit?

Imagine that migration_dirty_pages is slightly too small and we enter
ram_save_iterate; ram_save_iterate now sends *all* its pages, and it
decrements migration_dirty_pages for every page sent.  At the end of
ram_save_iterate, migration_dirty_pages would be negative.  But
migration_dirty_pages is *u*int64_t; so we exit ram_save_iterate,
go around the main migration_thread loop again and call
qemu_savevm_state_pending, and it returns a very large number (because
it's actually a negative number), so we keep going around the loop,
because it never gets smaller.

Dave

>
> Thanks
> Wen Congyang
>
> >
> > Dave
> >
> >>
> >> Thanks
> >> Wen Congyang
> >>
> >>>
> >>>> Now, do we really care if migration_dirty_pages is exact?  Not really,
> >>>> we just use it to calculate if we should start the throttle or not.
> >>>> That is only tested once every second, so if we have written a couple of
> >>>> pages that we are not accounting for, things should be reasonably safe.
> >>>>
> >>>> That said, I don't know why we didn't catch that problem during
> >>>> review (yes, I am guilty here).  Not sure how to really fix it,
> >>>> though.  I think that the problem is more theoretical than real, but
> >>>
> >>> Dave
> >>>
> >>>> ....
> >>>>
> >>>> Thanks, Juan.
> >>>>
> >>>>>
> >>>>> Dave
> >>>>>
> >>>>> --
> >>>>> Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
>
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK