On Thu, Mar 24, 2016 at 03:53:25PM +0000, Li, Liang Z wrote:
> > > > > Not very complex, we can implement like this:
> > > > >
> > > > > 1. Set all the bits in the migration_bitmap_rcu->bmap to 1
> > > > > 2. Clear all the bits in
> > > > >    ram_list.dirty_memory[DIRTY_MEMORY_MIGRATION]
> > > > > 3. Send the get_free_page_bitmap request
> > > > > 4. Start to send pages to destination and check if the
> > > > >    free_page_bitmap is ready
> > > > >     if (is_ready) {
> > > > >           filter out the free pages from  migration_bitmap_rcu->bmap;
> > > > >           migration_bitmap_sync();
> > > > >     }
> > > > >      continue until live migration complete.
> > > > >
> > > > >
> > > > > Is that right?
> > > >
> > > > The order I'm trying to understand is something like:
> > > >
> > > >     a) Send the get_free_page_bitmap request
> > > >     b) Start sending pages
> > > >     c) Reach the end of memory
> > > >       [ is_ready is false - guest hasn't made free map yet ]
> > > >     d) normal migration_bitmap_sync() at end of first pass
> > > >     e) Carry on sending dirty pages
> > > >     f) is_ready is true
> > > >       f.1) filter out free pages?
> > > >       f.2) migration_bitmap_sync()
> > > >
> > > > It's f.1 I'm worried about.  If the guest started generating the
> > > > free bitmap before (d), then a page marked as 'free' in f.1 might
> > > > have become dirty before (d) and so (f.2) doesn't set the dirty
> > > > bit again, and so we can't filter out pages in f.1.
> > > >
> > >
> > > As you described, the order is incorrect.
> > >
> > > Liang
> > 
> > 
> > So to make it safe, what is required is to make sure no free list is
> > outstanding before calling migration_bitmap_sync.
> > 
> > If one is outstanding, filter out pages before calling
> > migration_bitmap_sync.
> > 
> > Of course, if we just do it like we normally do with migration, then by the
> > time we call migration_bitmap_sync dirty bitmap is completely empty, so
> > there won't be anything to filter out.
> > 
> > One way to address this is call migration_bitmap_sync in the IO handler,
> > while VCPU is stopped, then make sure to filter out pages before the next
> > migration_bitmap_sync.
> > 
> > Another is to start filtering out pages upon IO handler, but make sure
> > to flush the queue before calling migration_bitmap_sync.
> > 
> 
> It's really complex, maybe we should switch to a simple start, just skip
> the free page in the ram bulk stage and make it asynchronous?
> 
> Liang

You mean like your patches do? No, blocking bulk migration until guest
response is basically a non-starter.

-- 
MST
