Re: [Qemu-devel] Restoring bitmaps after failed/cancelled migration

Fam Zheng Wed, 16 May 2018 19:16:04 -0700

On Wed, 05/16 18:52, Vladimir Sementsov-Ogievskiy wrote:
> 16.05.2018 18:32, Kevin Wolf wrote:
> > Am 16.05.2018 um 17:10 hat Vladimir Sementsov-Ogievskiy geschrieben:
> > > 16.05.2018 15:47, Kevin Wolf wrote:
> > > > Am 14.05.2018 um 12:09 hat Vladimir Sementsov-Ogievskiy geschrieben:
> > > > > 14.05.2018 09:41, Fam Zheng wrote:
> > > > > > On Wed, 04/18 17:00, Vladimir Sementsov-Ogievskiy wrote:
> > > > > > > Is it possible, that target will change the disk, and then we
> > > > > > > return control to the source? In this case bitmaps will be
> > > > > > > invalid. So, should not we drop all the bitmaps on inactivate?
> > > > > > Yes, dropping all live bitmaps upon inactivate sounds reasonable.
> > > > > > If the dst fails to start, and we want to resume VM at src, we
> > > > > > could (optionally?) reload the persistent bitmaps, I guess.
> > > > > Reload from where? We didn't store them.
> > > > Maybe this just means that it turns out that not storing them was a bad
> > > > idea?
> > > > 
> > > > What was the motivation for not storing the bitmap? The additional
> > > > downtime? Is it really that bad, though? Bitmaps should be fairly small
> > > > for the usual image sizes and writing them out should be quick.
> > > What are usual ones? A bitmap of standard granularity of 64k for a 16 TB
> > > disk is ~30 MB. If we have several such bitmaps it may be significant
> > > downtime.
> > We could have an in-memory bitmap that tracks which parts of the
> > persistent bitmap are dirty so that you don't have to write out the
> > whole 30 MB during the migration downtime, but can already flush most of
> > the persistent bitmap before the VM is stopped.
> > 
> > Kevin
> 
> Yes, it looks possible. But how would we control that downtime? Introduce a
> migration state with a specific _pending function? However, it may not be
> necessary.
> 
> Anyway, I think we don't need to store it.
> 
> If we decide to resume the source, the bitmap is already in memory, so why
> reload it? If someone has already killed the source (which was paused), it is
> inconsistent anyway, and the loss of a dirty bitmap is not the worst possible
> problem.
> 
> So, finally, it looks safe enough just to make the bitmaps on the source
> persistent again (or better, to introduce another way to skip storing them,
> maybe with an additional flag so everybody will be happy, without dropping
> the persistent flag).


This makes some sense to me. We'll then use the current persistent flag to
indicate the bitmap "is" a persistent one, instead of "should it be persisted".
They are apparently two different properties in the case discussed in this
thread.
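
To make the distinction concrete, here is a minimal sketch of the two
properties kept as separate fields. This is an illustration only, not QEMU
code: the Bitmap class, the skip_store field and the three functions are all
hypothetical names invented for this example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Bitmap:
    name: str
    persistent: bool          # the bitmap "is" a persistent one
    skip_store: bool = False  # hypothetical flag: do not write it out now


def begin_migration(bitmaps: List[Bitmap]) -> None:
    # Hypothetical: instead of clearing 'persistent', only suppress storing.
    for b in bitmaps:
        b.skip_store = True


def names_to_store(bitmaps: List[Bitmap]) -> List[str]:
    # Bitmaps that would be written to the image on inactivate.
    return [b.name for b in bitmaps if b.persistent and not b.skip_store]


def resume_source(bitmaps: List[Bitmap]) -> None:
    # Migration failed or was cancelled: storing is allowed again.
    for b in bitmaps:
        b.skip_store = False


bitmaps = [Bitmap("backup0", persistent=True), Bitmap("tmp0", persistent=False)]
begin_migration(bitmaps)
during = names_to_store(bitmaps)   # nothing is stored while migrating
resume_source(bitmaps)
after = names_to_store(bitmaps)    # the persistent bitmap is stored again
```

Because 'persistent' is never cleared in this sketch, nothing has to be
restored on the source after a failed or cancelled migration.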

> And, after source resume, we have one of the following situations:
> 
> 1. The disk was not changed during migration, so all is OK and we have the
>    bitmaps.
> 2. The disk was changed. The bitmaps are inconsistent. But not only the
>    bitmaps: the whole VM state is inconsistent with its disk. This case is a
>    bug in the management layer and it should never happen. And possibly we
>    need some separate way to catch such cases.
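
As a side note on the size estimate quoted earlier in the thread (64k
granularity, 16 TB disk, roughly 30 MB per bitmap), the arithmetic can be
checked with a short sketch. It assumes one bit per granularity-sized chunk
and binary units (TiB, KiB), and it ignores any on-disk metadata overhead;
the function name is made up for this example.

```python
def bitmap_size_bytes(disk_bytes: int, granularity_bytes: int) -> int:
    # One bit per chunk of 'granularity_bytes', rounded up to whole bytes.
    bits = -(-disk_bytes // granularity_bytes)   # ceiling division
    return -(-bits // 8)


KIB = 1 << 10
MIB = 1 << 20
TIB = 1 << 40

size = bitmap_size_bytes(16 * TIB, 64 * KIB)
size_mib = size / MIB   # 2**44 / 2**16 = 2**28 bits = 2**25 bytes
```

Under these assumptions the result is 32 MiB per bitmap, which agrees with
the ~30 MB figure, so several such bitmaps add up proportionally.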

Fam
