"Dr. David Alan Gilbert" <dgilb...@redhat.com> wrote: > Hi, > I'm trying to understand why migration_bitmap_extend is correct/safe; > If I understand correctly, you're arguing that: > > 1) the migration_bitmap_mutex around the extend, stops any sync's happening > and so no new bits will be set during the extend. > > 2) If migration sends a page and clears a bitmap entry, it doesn't > matter if we lose the 'clear' because we're copying it as > we extend it, because losing the clear just means the page > gets resent, and so the data is OK. > > However, doesn't (2) mean that migration_dirty_pages might be wrong? > If a page was sent, the bit cleared, and migration_dirty_pages decremented, > then if we copy over that bitmap and 'set' that bit again then > migration_dirty_pages > is too small; that means that either migration would finish too early, > or more likely, migration_dirty_pages would wrap-around -ve and > never finish. > > Is there a reason it's really safe?

No.  It is reasonably safe, for various values of "reasonably".

migration_dirty_pages should never get near zero, because we move to
the completion stage well before it does.  (With very, very bad luck it
could still happen, so strictly speaking it is not safe.)

Now, do we really care whether migration_dirty_pages is exact?  Not
really; we just use it to decide whether we should start the throttle
or not.  That test only runs once per second, so if we have written a
couple of pages that we are not accounting for, things should be
reasonably safe.

That said, I don't know why we didn't catch this problem during review
(yes, I am guilty here).  I am not sure how to really fix it, though.
I think the problem is more theoretical than real, but ....

Thanks, Juan.

>
> Dave
>
> --
> Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
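
To make the scenario in point (2) concrete, here is a minimal single-threaded sketch of the lost clear. It is a toy model, not the QEMU code: every toy_* name is hypothetical, the bitmap uses one byte per page instead of one bit, and the counter is assumed to be an unsigned 64-bit value standing in for migration_dirty_pages. The race is replayed as three explicit steps rather than with real threads.

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>

#define TOY_OLD_PAGES 8
#define TOY_NEW_PAGES 16

/* Toy stand-ins (hypothetical, not QEMU structures). */
struct toy_state {
    uint8_t *bitmap;       /* 1 = page is dirty */
    uint64_t dirty_pages;  /* assumed unsigned, like a dirty-page counter */
};

struct toy_result {
    uint64_t bits_set;            /* dirty bits visible after the extend */
    uint64_t counter_after_race;  /* counter right after the extend */
    uint64_t counter_after_drain; /* counter after sending every dirty page */
};

/* Sender path: "send" a page, clear its bit, decrement the counter. */
static void toy_send_page(struct toy_state *s, size_t page)
{
    if (s->bitmap[page]) {
        s->bitmap[page] = 0;
        s->dirty_pages--;  /* unsigned: wraps around below zero */
    }
}

static uint64_t toy_count_set(const uint8_t *bmap, size_t n)
{
    uint64_t count = 0;
    for (size_t i = 0; i < n; i++) {
        count += bmap[i];
    }
    return count;
}

static struct toy_result toy_lost_clear(void)
{
    uint8_t old_bmap[TOY_OLD_PAGES] = {0};
    uint8_t new_bmap[TOY_NEW_PAGES] = {0};
    struct toy_state s = { .bitmap = old_bmap, .dirty_pages = 2 };
    struct toy_result r;

    /* Pages 1 and 5 are dirty before the extend starts. */
    old_bmap[1] = 1;
    old_bmap[5] = 1;

    /* Step 1: the extend copies the old bitmap into the larger one. */
    memcpy(new_bmap, old_bmap, TOY_OLD_PAGES);

    /* Step 2: the sender, still on the old bitmap, sends page 5. */
    toy_send_page(&s, 5);

    /* Step 3: the extend publishes the copy; the clear of page 5 is lost. */
    s.bitmap = new_bmap;

    r.bits_set = toy_count_set(s.bitmap, TOY_NEW_PAGES);
    r.counter_after_race = s.dirty_pages;

    /* Draining every dirty page now decrements once more than was counted. */
    for (size_t i = 0; i < TOY_NEW_PAGES; i++) {
        toy_send_page(&s, i);
    }
    r.counter_after_drain = s.dirty_pages;
    return r;
}
```

In this model toy_lost_clear() leaves two bits set while the counter reads one, and draining both pages takes the counter past zero to UINT64_MAX. The data stays correct (page 5 is simply sent again); only the accounting drifts, and it only wraps if the counter is actually driven all the way down.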