On 16/05/2019 17:17, Ian Jackson wrote:
> Andrew Cooper writes ("Re: [Xen-devel] preparations for 4.11.2"):
>> In addition,
> Thanks.
>
> ==== wanting discussion: ====
>
>> 365aabb6e502 "tools/libxendevicemodel: add
>> xendevicemodel_modified_memory_bulk to map" is possibly a candidate, but
>> is also complicated by the stable SONAME.  It is perhaps easiest to
>> ignore, seeing as the issue has already gone unnoticed for 2 years.
> We would be bumping the minor version.  I think it is ABI compatible.
> So I am inclined to backport this one but I haven't done so yet.
>
>> 129025fe3093 "oxenstored: Don't re-open a xenctrl handle for every
>> domain introduction"
> Can you justify how this is a bugfix ?  It doesn't seem like backport
> material to me.

It was found from strace (while investigating an unrelated issue), but
given how many issues we've had in the past with {o,}xenstored exceeding
its FD limit, I'd still put it in the category of bugfix.

It balloons the worst-case FD requirements by as many concurrent domain
starts as the rest of dom0 can manage.

>
>> 7b20a865bc10 "tools/ocaml: Release the global lock before invoking block
>> syscalls"
> This *really* doesn't look like a bugfix, let alone a backport
> candidate !  Removing a lock for performance reasons !

Of course it's a backport candidate, and it is a bugfix even if most of
the time it is only observed as a perf improvement.

The OCaml FFI says "thou shalt not make a syscall holding this lock",
because while that lock is held, everything is single threaded.

IIRC, this particular issue led to a partial outage of one of our HTTP
API endpoints.

>
>> c393b64dcee6 "tools/libxc: Fix issues with libxc and Xen having
>> different featureset lengths"
> The compatibility implications here are not clearly spelled out in the
> commit message.  AFAICT, after this commit, the effect is:
>   - new tools will work with old hypervisor
>   - old tools will not necessarily work with old hypervisor
> I assume that we are talking here about old and new code with the same
> Xen version, eg as a result of a security fix.
>
> The previous behaviour, ie, what happens without this patch, is not
> entirely clear to me.

This was an unintended consequence of XSA-253 (Spectre/Meltdown) where
the length of the featureset did increase in a security fix.

In the period of time between installing updated dom0 userspace
packages, and rebooting into the new hypervisor, attempting to start a
guest results in libc heap corruption and an abort().
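
As a rough illustration of this class of bug (a sketch only, not the
libxc code): when two sides disagree on the featureset length, any copy
has to be bounded by the destination's size rather than the source's.
The helper name copy_featureset() and its signature are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper, not part of libxc.  Copies a featureset of
 * src_len 32-bit words into a buffer of dst_len words.  Never writes
 * past dst_len, and zero-fills any words the source did not supply.
 * Returns the number of words actually copied from src.
 */
static size_t copy_featureset(uint32_t *dst, size_t dst_len,
                              const uint32_t *src, size_t src_len)
{
    size_t n = src_len < dst_len ? src_len : dst_len;

    memcpy(dst, src, n * sizeof(*dst));

    /* Shorter source: treat the missing words as "no features". */
    if (n < dst_len)
        memset(dst + n, 0, (dst_len - n) * sizeof(*dst));

    return n;
}
```

Copying src_len words unconditionally into a dst_len-sized heap
allocation is the kind of mistake that ends in heap corruption when the
two lengths drift apart.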

Because libxl doesn't use the partially-improved CPUID functionality
yet, it doesn't hit the second bug of incoming migrates getting
intermittently rejected due to 4/8 bytes of heap metadata being included
in the CPUID safety check.

>> 82855aba5bf9 "tools/libxc: Fix error handling in get_cpuid_domain_info()"
> This might break some callers, mightn't it ?  What callers ?  Or is
> there an argument that there aren't callers which will be broken ?

This was from the same bit of debugging as above, and ISTR caused some
error messages in higher callers to print junk instead of the real error.
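
One common way for this symptom to arise in C, given purely as an
assumption about the general pattern and not as a description of the
actual commit, is a function that mixes two return conventions: -errno
on some paths and -1 with errno set on others.  All names below are
hypothetical stand-ins.

```c
#include <errno.h>

/* Hypothetical stand-in for a hypercall wrapper: on failure it
 * returns -1 and reports the reason through errno. */
static int fake_hypercall(int fail_with)
{
    if (fail_with) {
        errno = fail_with;
        return -1;
    }
    return 0;
}

/* Mixed convention: local failures return -errno values, but a
 * hypercall failure leaks the raw -1 to the caller. */
static int get_info_mixed(int bad_arg, int hcall_err)
{
    if (bad_arg)
        return -EINVAL;
    return fake_hypercall(hcall_err);
}

/* Consistent convention: every failure path returns -errno. */
static int get_info_consistent(int bad_arg, int hcall_err)
{
    if (bad_arg)
        return -EINVAL;
    if (fake_hypercall(hcall_err) < 0)
        return -errno;
    return 0;
}
```

A caller that decodes the result of get_info_mixed() with
strerror(-rc) reports whatever error number 1 maps to (EPERM on Linux)
instead of the real failure.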

~Andrew
