On Thu, Jul 29, 2021 at 06:19:31PM +0200, David Hildenbrand wrote:
> On 29.07.21 18:12, Peter Xu wrote:
> > On Thu, Jul 29, 2021 at 10:14:47AM +0200, David Hildenbrand wrote:
> > > > > > > > The thing is I still think this extra operation during
> > > > > > > > sync() can be ignored by simply clear dirty log during
> > > > > > > > bitmap init, then.. why not? :)
> > > > > > >
> > > > > > > I guess clearing the dirty log (especially in KVM) might be
> > > > > > > more expensive.
> > > > > >
> > > > > > If we send one ioctl per cb that'll be expensive for sure. I
> > > > > > think it'll be fine if we send one clear ioctl to kvm,
> > > > > > summarizing the whole bitmap to clear.
> > > > > >
> > > > > > The other thing is imho having overhead during bitmap init is
> > > > > > always better than having that during sync(). :)
> > > > >
> > > > > Oh, right, so you're saying, after we set the dirty bmap to all
> > > > > ones and excluded the discarded parts, setting the respective
> > > > > bits to 0, we simply issue clearing of the whole area?
> > > > >
> > > > > For now I assumed we would have to clear per cb.
> > > >
> > > > Hmm when I replied I thought we can pass in a bitmap to
> > > > ->log_clear() but I just remembered memory API actually hides the
> > > > bitmap interface..
> > > >
> > > > Reset the whole region works, but it'll slow down migration starts,
> > > > more importantly that'll be with mmu write lock so we will lose
> > > > most clear-log benefit for the initial round of migration and stuck
> > > > the guest #pf at the meantime...
> > > >
> > > > Let's try do that in cb()s as you mentioned; I think that'll still
> > > > be okay, because so far the clear log block size is much larger
> > > > (1gb), 1tb is worst case 1000 ioctls during bitmap init, slightly
> > > > better than 250k calls during sync(), maybe? :)
> > >
> > > Just to get it right, what you propose is calling
> > > migration_clear_memory_region_dirty_bitmap_range() from each cb().
> >
> > Right. We can provide a more complicated memory api for passing in
> > bitmap but I think that can be an overkill and tricky.
> >
> > > Due to the clear_bmap, we will end up clearing each chunk (e.g.,
> > > 1GB) at most once.
> > >
> > > But if our layout is fragmented, we can actually end up clearing all
> > > chunks (1024 ioctls for 1TB), resulting in a slower migration start.
> > >
> > > Any gut feeling how much slower migration start could be with
> > > largish (e.g., 1 TiB) regions?
> >
> > I had a vague memory of KVM_GET_DIRTY_LOG that I used to measure which
> > took ~10ms for 1g guest mem, supposing that's mostly used to protect
> > the pages or clearing dirties in the EPT pgtables. Then the worst case
> > is ~1 second for 1tb.
> >
> > But note that it's still during setup phase, so we should expect to
> > see a somehow large setup time and longer period that migration stays
> > in SETUP state, but I think it's fine. Reasons:
> >
> >   - We don't care too much about guest dirtying pages during the setup
> >     process because we haven't migrated anything yet, meanwhile we
> >     should not block any other thread either (e.g., we don't hold BQL).
> >
> >   - We don't block guest execution too. Unlike KVM_GET_DIRTY_LOG
> >     without CLEAR we won't hold the mmu lock for a huge long time but
> >     do it only in 1g chunk, so guest page faults can still be serviced.
> >     It'll be affected somehow since we'll still run with the mmu write
> >     lock critical sections for each single ioctl(), but we do that for
> >     1gb each time so we frequently yield it.
>
> Please note that we are holding the iothread lock while setting up the
> bitmaps + syncing the dirty log. I'll have to make sure that that code
> runs outside of the BQL, otherwise we'll block guest execution.
Oh right.

> In the meantime I adjusted the code but it does the clearing under the
> iothread lock, which should not be what we want ... I'll have a look.

Thanks; if it takes more changes than expected we can still start from
something simple, IMHO, by taking the bql and yielding it frequently.

In the meantime, I found two things in ram_init_bitmaps() that I'm not
sure we still need:

  1. Do we need WITH_RCU_READ_LOCK_GUARD() if we already hold both the
     bql and the ramlist lock?  (small question)

  2. Do we need migration_bitmap_sync_precopy() even if the dirty bmap
     is all 1's?  (bigger question)

I feel like those can be dropped.  Dave/Juan?

-- 
Peter Xu
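PS: for reference, roughly the shape the per-cb clearing discussed above
could take. This is only a sketch meant to live in migration/ram.c: the
callback name, the use of rb->bmap, and the exact signature of
migration_clear_memory_region_dirty_bitmap_range() are assumptions here,
not the actual patch.

/*
 * Hypothetical per-section callback, run once for each discarded range
 * while the dirty bitmap is being initialized (all bits start as 1):
 * clear the KVM dirty log for the range and drop the bits locally.
 */
static int dirty_bitmap_clear_section(MemoryRegionSection *section,
                                      void *opaque)
{
    const hwaddr offset = section->offset_within_region;
    const hwaddr size = int128_get64(section->size);
    const unsigned long start = offset >> TARGET_PAGE_BITS;
    const unsigned long npages = size >> TARGET_PAGE_BITS;
    RAMBlock *rb = section->mr->ram_block;

    /*
     * Thanks to rb->clear_bmap, each clear-log chunk (e.g. 1GB) is
     * cleared via ioctl at most once, no matter how many sections
     * touch it, so fragmentation stays bounded by the chunk count.
     */
    migration_clear_memory_region_dirty_bitmap_range(rb, start, npages);

    /* Exclude the discarded range from the RAMBlock dirty bitmap too. */
    bitmap_clear(rb->bmap, start, npages);

    return 0;
}

The cb would presumably be driven by a replay-discarded style helper of
the RamDiscardManager during bitmap init; naming of that helper is also
just a guess here.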
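And the ram_init_bitmaps() shape the two questions refer to, reproduced
roughly from memory; the exact code in the tree may differ:

static void ram_init_bitmaps(RAMState *rs)
{
    /*
     * The BQL is held across the whole setup, including the initial
     * dirty log sync below -- the two calls in question.
     */
    qemu_mutex_lock_iothread();
    qemu_mutex_lock_ramlist();

    WITH_RCU_READ_LOCK_GUARD() {
        ram_list_init_bitmaps();
        memory_global_dirty_log_start();
        migration_bitmap_sync_precopy(rs);
    }
    qemu_mutex_unlock_ramlist();
    qemu_mutex_unlock_iothread();
}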