Re: [Qemu-devel] [POC] colo-proxy in qemu

Gonglei Thu, 30 Jul 2015 05:12:41 -0700

On 2015/7/30 19:56, Dr. David Alan Gilbert wrote:
> * Jason Wang (jasow...@redhat.com) wrote:
>>
>>
>> On 07/30/2015 04:03 PM, Dr. David Alan Gilbert wrote:
>>> * Dong, Eddie (eddie.d...@intel.com) wrote:
>>>>>> A question here: packet comparison may be very tricky. For example,
>>>>>> some protocols use random data to generate an unpredictable id or
>>>>>> something else. One example is ipv6_select_ident() in Linux. So COLO
>>>>>> needs a mechanism to make sure PVM and SVM can generate the same
>>>>>> random data?
>>>>> Good question, the random data connection is a big problem for COLO. At
>>>>> present, it will trigger checkpoint processing because of the different
>>>>> random data.
>>>>> I don't think any mechanism can ensure that two different machines
>>>>> generate the same random data. If you have any ideas, please tell us :)
>>>>>
>>>>> Frequent checkpoints can handle this scenario, but may cause poor
>>>>> performance. :(
>>>>>
>>>> The assumption is that, after a VM checkpoint, SVM and PVM have identical
>>>> internal state, so the pattern used to generate random data has a high
>>>> possibility of generating identical data for a short time, at least...
>>> They do diverge pretty quickly though; I have simple examples which
>>> reliably cause a checkpoint because of simple randomness in applications.
>>>
>>> Dave
>>>
>>
>> And it will become even worse if hwrng is used in guest.
> 
> Yes; it seems quite application dependent;  (on IPv4) an ssh connection,
> once established, tends to work well without triggering checkpoints;
> and static web pages also work well.  Examples of things that do cause
> more checkpoints are displaying guest statistics (e.g. running top
> in that ssh), which is timing dependent, and dynamically generated
> web pages that include a unique ID (bugzilla's password reset link in
> its front page was a fun one); I think establishing new encrypted
> connections also causes the same randomness.
> 
> However, it's worth remembering that COLO is trying to reduce the
> number of checkpoints compared to a simple checkpointing world
> which would be aiming to do a checkpoint ~100 times a second,
> and for compute bound workloads, or ones that don't expose
> the randomness that much, it can get checkpoints of a few seconds
> in length which greatly reduces the overhead.
>


Yes. That's the truth.
We can set two different modes for different scenarios, maybe named
1) frequent checkpoint mode for multi-connection and randomness scenarios
and 2) non-frequent checkpoint mode for other scenarios.

But that's the next step in the plan; we are still thinking about it.
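
To make the two modes concrete, here is a minimal sketch in Python (not
QEMU code) of how the checkpoint decision might differ between them. All
names here (Mode, CheckpointPolicy, period_ms, max_interval_ms) are
hypothetical, and it assumes that frequent mode checkpoints on a fixed
period while non-frequent mode checkpoints on a packet mismatch or after
a maximum interval:

```python
from dataclasses import dataclass
from enum import Enum


class Mode(Enum):
    # Hypothetical names for the two modes discussed above.
    FREQUENT = "frequent"
    NON_FREQUENT = "non-frequent"


@dataclass
class CheckpointPolicy:
    mode: Mode
    period_ms: int        # assumed fixed period used in FREQUENT mode
    max_interval_ms: int  # assumed upper bound used in NON_FREQUENT mode

    def should_checkpoint(self, pvm_packet: bytes, svm_packet: bytes,
                          ms_since_last: int) -> bool:
        """Decide whether to checkpoint, given one output packet from each VM."""
        if self.mode is Mode.FREQUENT:
            # Assumption: packets are not compared, only the timer matters.
            return ms_since_last >= self.period_ms
        # NON_FREQUENT: diverging output forces a checkpoint immediately.
        if pvm_packet != svm_packet:
            return True
        # Otherwise checkpoint only when the maximum interval has elapsed.
        return ms_since_last >= self.max_interval_ms
```

With this sketch, a workload with a lot of randomness would mismatch on
almost every packet in non-frequent mode, which is the case where
switching to the frequent mode might be cheaper than comparing.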

Regards,
-Gonglei
