On 17.07.2019 20:40, Andrew Cooper wrote:
> On 17/07/2019 14:02, Jan Beulich wrote:
>> On 17.07.2019 13:26, Andrew Cooper wrote:
>>> We do not want to be grovelling around in the old Xen's data
>>> structures, because that adds a binary A=>B translation which is
>>> specific to each old version of Xen. That means you either need a
>>> custom build of each target Xen, dependent on the currently-running
>>> Xen, or have to maintain a matrix of old versions which would depend
>>> on local changes, and therefore not be suitable for upstream.
>> Now the question is what alternative you would suggest. Since you
>> say "the pinned state lives in the migration stream", I assume you
>> mean to imply that Dom0 state should be handed from the old to the
>> new Xen via such a stream (minus raw data page contents)?
> 
> Yes, and this is explicitly identified in the bullet point saying "We
> do only rely on domain state and no internal xen state".
> 
> In practice, it is going to be far more efficient to have Xen
> serialise/deserialise the domain register state etc. than to bounce it
> via hypercalls.  By the time you're doing that in Xen, adding Dom0 as
> well is trivial.
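
For concreteness, a per-vCPU record in such an in-Xen handover stream
might look roughly like the sketch below. This is a minimal sketch only;
all names are made up, and it is not the actual migration-stream layout:

    #include <stdint.h>

    /* Hypothetical per-vCPU record in an old->new Xen handover stream.
     * Illustrative only; not the real libxc/migration record format. */
    struct lu_vcpu_record {
        uint32_t vcpu_id;
        uint32_t flags;           /* e.g. online/blocked state */
        uint64_t rip, rsp, rflags;
        uint64_t gprs[15];        /* remaining general purpose registers */
        uint64_t cr0, cr3, cr4;   /* control registers */
        /* ... segment state, FPU/XSAVE area, MSRs, ... */
    };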

So I must be missing some context here: How could hypercalls come into
the picture at all when it comes to "migrating" Dom0?

>>> The in-guest evtchn data structure will accumulate events just like a
>>> posted interrupt descriptor.  Real interrupts will queue in the LAPIC
>>> during the transition period.
>> Yes, that'll work as long as interrupts remain active from Xen's POV.
>> But if there's concern about a blackout period for HVM/PVH, then
>> surely there would also be such a concern for PV.
> 
> The only fix for that is to reduce the length of the blackout period.
> We can't magically inject interrupts halfway through the Xen-to-Xen
> transition, because we can't run vCPUs at that point in time.

Hence David's proposal to "re-inject": we'd have to record interrupts
during the blackout period, and inject them once Dom0 is all set up
again.
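
To illustrate, a minimal record-and-replay sketch (names entirely made
up; real code would also have to cope with level-triggered lines and
with ordering):

    #include <stdint.h>

    /* Hypothetical blackout-period interrupt log. */
    #define BLACKOUT_LOG_SIZE 256

    static uint8_t pending_vectors[BLACKOUT_LOG_SIZE];
    static unsigned int nr_pending;

    /* While vCPUs cannot run: just remember which vectors fired. */
    static void blackout_record_irq(uint8_t vector)
    {
        if (nr_pending < BLACKOUT_LOG_SIZE)
            pending_vectors[nr_pending++] = vector;
    }

    /* Once Dom0 is set up again: replay everything recorded. */
    static void blackout_replay_irqs(void (*inject)(uint8_t vector))
    {
        for (unsigned int i = 0; i < nr_pending; i++)
            inject(pending_vectors[i]);
        nr_pending = 0;
    }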

>>>> Re-using large data structures (or arrays thereof) may also turn
>>>> out to be useful in terms of latency until the new Xen actually
>>>> becomes ready to resume.
>>> When it comes to optimising the latency, there is a fair amount we
>>> might be able to do ahead of the critical region, but I still think
>>> this would be better done as a "clean start" in the new Xen, to
>>> reduce binary dependencies.
>> Latency is actually only one aspect (albeit the larger the host, the
>> more relevant it is). Having sufficient memory for both old and new
>> copies of the data structures, plus the migration stream, is another.
>> This would become especially relevant if even DomUs were to remain in
>> memory, rather than being saved/restored.
> 
> But we're still talking about something on a multi-MB scale, rather
> than a multi-GB scale.

On multi-TB systems, frame_table[] is a multi-GB table. And since boot
times often scale (roughly) with system size, live updating is (I
guess) all the more interesting on bigger systems.
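
As a rough illustration, assuming x86's 32-byte struct page_info (the
exact size is version-dependent):

    2 TiB of RAM / 4 KiB per page   = 2^29 page_info entries
    2^29 entries * 32 bytes each    = 16 GiB of frame_table[]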

Jan