Re: [Qemu-devel] [PATCH v5 00/12] Dirty bitmaps migration

John Snow Tue, 02 Jun 2015 15:19:01 -0700


On 05/28/2015 04:56 PM, Denis V. Lunev wrote:
> On 28/05/15 23:09, John Snow wrote:
>>
>> On 05/26/2015 10:51 AM, Denis V. Lunev wrote:
>>> On 26/05/15 17:48, Denis V. Lunev wrote:
>>>> On 21/05/15 19:44, John Snow wrote:
>>>>> On 05/21/2015 09:57 AM, Denis V. Lunev wrote:
>>>>>> On 21/05/15 16:51, Vladimir Sementsov-Ogievskiy wrote:
>>>>>>> Hi all.
>>>>>>>
>>>>>>> Hmm. There is an interesting suggestion from Denis Lunev (in CC)
>>>>>>> about
>>>>>>> how to drop meta bitmaps and make things easier.
>>>>>>>
>>>>>>> method:
>>>>>>>
>>>>>>>> start migration
>>>>>>> disk and memory are migrated, but not dirty bitmaps.
>>>>>>>> stop vm
>>>>>>> create all necessary bitmaps in destination vm (empty, but with same
>>>>>>> names and granularities and enabled flag)
>>>>>>>> start destination vm
>>>>>>> empty bitmaps are tracking now
>>>>>>>> start migrating dirty bitmaps. merge them to corresponding bitmaps
>>>>>>> in destination
>>>>>>> while bitmaps are migrating, they should be in some kind of
>>>>>>> 'inconsistent' state.
>>>>>>> so, we can't start backup or other migration while bitmaps are
>>>>>>> migrating, but vm is already _running_ on destination.
>>>>>>>
>>>>>>> what do you think about it?
>>>>>>>
>>>>>> the description is a bit incorrect
>>>>>>
>>>>>> - start migration process, perform memory and disk migration
>>>>>>      as usual. VM is still executed at source
>>>>>> - start VM on target. VM on source should be on pause as usual,
>>>>>>      do not finish migration process. Running VM on target "writes"
>>>>>>      normally setting dirty bits as usual
>>>>>> - copy active dirty bitmaps from source to target. This is safe
>>>>>>      as VM on source is not running
>>>>>> - "OR" copied bitmaps with ones running on target
>>>>>> - finish migration process (stop source VM).
>>>>>>
>>>>>> Downtime will not be increased due to dirty bitmaps with this
>>>>>> approach, migration process is very simple - plain data copy.
>>>>>>
>>>>>> Regards,
>>>>>>       Den
>>>>>>
>>>>> I was actually just discussing the live migration approach a little
>>>>> bit
>>>>> ago with Stefan, trying to decide on the "right" packet format (The
>>>>> only
>>>>> two patches I haven't ACKed yet are ones in which we need to choose a
>>>>> send size) and we decided that 1KiB chunk sends would be
>>>>> appropriate for
>>>>> live migration.
>>>>>
>>>>> I think I'm okay with that method, but obviously this approach
>>>>> outlined
>>>>> here would also work very well and would avoid meta bitmaps, chunk
>>>>> sizes, migration tuning, convergence questions, etc etc etc.
>>>>>
>>>>> You'd need to add a new status to the bitmap on the target (maybe
>>>>> "INCOMPLETE" or "MIGRATING") that prevents it from being used for a
>>>>> backup operation without preventing it from recording new writes.
>>>>>
>>>>> My only concern is how easy it will be to work this into the migration
>>>>> workflow.
>>>>>
>>>>> It would require some sort of "post-migration" tertiary phase, I
>>>>> suppose,
>>>>> for devices/data that can be transferred after the VM starts -- and I
>>>>> suspect we'll be the only use of that phase for now.
>>>>>
>>>>> David, what are your thoughts, here? Would you prefer Vladimir and I
>>>>> push forward on the live migration approach, or add a new post-hoc
>>>>> phase? This approach might be simpler on the block layer, but I
>>>>> would be
>>>>> rather upset if he scrapped his entire series for the second time for
>>>>> another approach that also didn't get accepted.
>>>>>
>>>>> --js
>>>> hmmm.... It looks like we should proceed with this to fit 2.4 dates.
>>>> There is not much interest at the moment. I think that we could
>>>> implement this later in 2.5 etc...
>>>>
>>>> Regards,
>>>>      Den
>>> oops. I have written something strange. Anyway, I think that for
>>> now we should proceed with this patchset to fit QEMU 2.4 dates.
>>> The implementation with additional stage (my proposal) could be
>>> added later, f.e. in 2.5 as I do not see much interest from migration
>>> gurus.
>>>
>>> In this case the review will take a ... lot of time.
>>>
>>> Regards,
>>>      Den
>>>
>> That sounds good to me. I think this solution is workable for 2.4, and
>> we can begin working on a post-migration phase for the future to help
>> simplify our cases a lot.
>>
>> I have been out sick much of this week, so apologies for my lack of
>> fervor getting this series upstream recently.
>>
>> --js
> no prob :)


Had a chat with Stefan about this approach and apparently that's what
the postcopy migration patches on-list are all about.

Stefan brought up the point of post-hoc reliability: It's possible to
transfer control to the new VM and then lose your link, making migration
completion impossible. Adding a post-copy phase to our existing live
migration is a non-starter, because it would unfairly introduce this
unreliability into the existing system.

However, we can make this idea work for migrations started via the
post-copy mechanism, because the entire migration already carries that
known risk of completion failure.
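
To make the post-copy variant concrete, here is a minimal sketch of the
target-side behaviour discussed in this thread: the bitmap keeps recording
guest writes while bits copied from the paused source are OR-ed in, and a
backup is refused until the copy completes. It is an illustration only, not
code from the series. TargetBitmap, BitmapState and every method name are
hypothetical, and the state names just echo the "MIGRATING" suggestion
quoted above.

```python
from enum import Enum


class BitmapState(Enum):
    MIGRATING = "migrating"  # still receiving bits from the source
    READY = "ready"          # complete; safe to use for a backup


class TargetBitmap:
    """Hypothetical stand-in for a dirty bitmap on the destination VM."""

    def __init__(self, nbits):
        self.bits = bytearray((nbits + 7) // 8)
        self.state = BitmapState.MIGRATING

    def set_dirty(self, bit):
        # Guest writes are recorded in every state, including MIGRATING.
        self.bits[bit // 8] |= 1 << (bit % 8)

    def merge_chunk(self, offset, chunk):
        # OR a chunk of the paused source's bitmap into the live bitmap,
        # so bits set by the running target are never lost.
        for i, byte in enumerate(chunk):
            self.bits[offset + i] |= byte

    def finish_migration(self):
        self.state = BitmapState.READY

    def start_backup(self):
        # Assumed policy: an incomplete bitmap must not drive a backup.
        if self.state is not BitmapState.READY:
            raise RuntimeError("bitmap is still migrating")
        return bytes(self.bits)
```

If the link drops before finish_migration() is reached, the bitmap stays in
MIGRATING indefinitely, which is the completion risk Stefan raised.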

The likely outcome, though, is that future migrations can be completed
with either mechanism: up-front migration or post-copy migration. In
that light, we won't be able to fully rid ourselves of the meta_bitmap
idea, so the post-copy idea here does little to cut our complexity: we'll
have to support the current standard live migration anyway.
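
For anyone skimming the thread, a rough sketch of what I mean by the
meta_bitmap idea, as I understand it: a second bitmap records which chunks
of the dirty bitmap have changed since they were last sent, so the live
phase can resend only those chunks. This is not the code from the series.
TrackedBitmap and its methods are hypothetical names, and the 1KiB chunk
size is simply the send size mentioned earlier in the thread.

```python
CHUNK_BYTES = 1024  # 1KiB send size discussed earlier in the thread


class TrackedBitmap:
    """Hypothetical dirty bitmap with a per-chunk meta bitmap."""

    def __init__(self, nbytes):
        self.bits = bytearray(nbytes)
        nchunks = (nbytes + CHUNK_BYTES - 1) // CHUNK_BYTES
        # Every chunk must be sent at least once, so start all-changed.
        self.meta = [True] * nchunks

    def set_dirty(self, bit):
        # Record the guest write and flag the chunk that now differs
        # from what the destination last received.
        self.bits[bit // 8] |= 1 << (bit % 8)
        self.meta[(bit // 8) // CHUNK_BYTES] = True

    def pending_chunks(self):
        """Yield (offset, data) for each changed chunk and mark it sent."""
        for idx, changed in enumerate(self.meta):
            if changed:
                self.meta[idx] = False
                start = idx * CHUNK_BYTES
                yield start, bytes(self.bits[start:start + CHUNK_BYTES])
```

A live migration loop would call pending_chunks() repeatedly while the
guest keeps running, which is where the chunk size, tuning and convergence
questions come from.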

So I have reviewed the current set of patches under the assumption that
it is the right way to go for 2.4 and beyond.

Thank you!
--js
