On 02.01.25 14:40, David Woodhouse wrote:
On Thu, 2025-01-02 at 14:38 +0100, Jürgen Groß wrote:
On 02.01.25 13:53, David Woodhouse wrote:
On Thu, 2025-01-02 at 13:07 +0100, Jürgen Groß wrote:
On 23.12.24 15:24, David Woodhouse wrote:
On Tue, 2024-12-17 at 12:18 +0000, Xen.org security team wrote:
                Xen Security Advisory CVE-2024-53241 / XSA-466
                                   version 3

            Xen hypercall page unsafe against speculative attacks

UPDATES IN VERSION 3
====================

Update of patch 5, public release.

Can't we even use the hypercall page early in boot? Surely we have to
know whether we're running on an Intel or AMD CPU before we get to the
point where we can enable any of the new control-flow integrity
support? Do we need to jump through those hoops to do that early
detection and setup?

The downside of this approach would be having yet another hypercall
variant: you'd have to replace the variant that can use the AMD- or
Intel-specific instruction with a function doing the hypercall via the
hypercall page.

You'd probably start with the hypercall function just jumping directly
into the temporary hypercall page during early boot, and then update it
to use the natively prepared vmcall/vmmcall version later.
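
(A rough sketch of that shape, with every name invented for
illustration and the argument handling cut down to one parameter; the
real series may structure this quite differently:)

extern char hypercall_page[];   /* filled in by Xen via the setup MSR */

static long hcall_via_page(unsigned int call, unsigned long a1)
{
        long ret;

        /* Each hypercall owns a 32-byte slot in the page; the slot
         * loads the call number itself. Real code needs a fuller
         * clobber list than this sketch. */
        asm volatile("call *%[slot]"
                     : "=a" (ret)
                     : [slot] "r" (hypercall_page + call * 32),
                       "D" (a1)
                     : "memory");
        return ret;
}

static long hcall_vmcall(unsigned int call, unsigned long a1)
{
        long ret;

        /* Direct variant: call number in %rax, first arg in %rdi.
         * (hcall_vmmcall would be identical, using "vmmcall".) */
        asm volatile("vmcall"
                     : "=a" (ret)
                     : "a" (call), "D" (a1)
                     : "memory");
        return ret;
}

/* Early-boot default, switched over after CPU identification. */
static long (*do_hypercall)(unsigned int, unsigned long) = hcall_via_page;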

All the complexity of patching and CPU detection in early boot seems to
be somewhat gratuitous and even counter-productive given the change it
introduces to 64-bit latching.

And even if the 64-bit latch does happen when HVM_PARAM_CALLBACK_IRQ is
set, isn't that potentially a lot later in boot? Xen will be treating
this guest as 32-bit until then, so won't all the vcpu_info and
runstate structures be wrong even as the secondary CPUs are already up
and running?
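
(For reference on why the layout matters: even a simple structure like
vcpu_runstate_info differs between the two ABIs, because the i386 ABI
aligns uint64_t to 4 bytes. Offsets below are mine, derived from the
public header:)

/* From xen/include/public/vcpu.h (abridged). */
struct vcpu_runstate_info {
        int      state;             /* offset 0 in both ABIs             */
        uint64_t state_entry_time;  /* offset 4 (32-bit) vs 8 (64-bit)   */
        uint64_t time[4];           /* offset 12 (32-bit) vs 16 (64-bit) */
};

Until the latch happens, Xen writes at the 32-bit offsets, so a 64-bit
guest reading its native layout sees skewed values.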

What I don't get is why this latching isn't done when the shared info
page is mapped into the guest via the XENMAPSPACE_shared_info hypercall,
or additionally when VCPUOP_register_runstate_memory_area is used by
the guest.

These are the earliest possible points where the guest is able to access
this data.
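
(For context, the mapping in question is the XENMEM_add_to_physmap call
a PVHVM guest makes, much like Linux's xen_hvm_init_shared_info():)

#include <xen/interface/memory.h>

static void map_shared_info(unsigned long pfn)
{
        struct xen_add_to_physmap xatp = {
                .domid = DOMID_SELF,
                .space = XENMAPSPACE_shared_info,
                .idx   = 0,
                .gpfn  = pfn,   /* guest frame to back shared_info */
        };

        /* First point at which the guest can see shared_info data,
         * hence the suggestion to latch the guest's mode here. */
        if (HYPERVISOR_memory_op(XENMEM_add_to_physmap, &xatp))
                BUG();
}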

Well, that's a great idea. Got a time machine? If you do, I have some
comments on the MSI→PIRQ mapping nonsense too... :)


I'm planning to send patches for Xen and the kernel to add CPUID feature
bits indicating which instruction to use. This will make life much easier.
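
(Something along these lines; the leaf and bit assignments below are
pure placeholders of mine, since the patches don't exist yet:)

/* Placeholder leaf/bit values, NOT the yet-to-be-posted ABI. */
#define XEN_CPUID_BASE       0x40000000  /* "XenVMMXenVMM" signature leaf */
#define XEN_FEAT_HC_VMCALL   (1u << 0)   /* hypothetical: use VMCALL      */
#define XEN_FEAT_HC_VMMCALL  (1u << 1)   /* hypothetical: use VMMCALL     */

static bool xen_use_vmmcall;

static void xen_pick_hypercall_insn(void)
{
        uint32_t feat = cpuid_ebx(XEN_CPUID_BASE + 2);

        if (feat & XEN_FEAT_HC_VMMCALL)
                xen_use_vmmcall = true;
        else if (!(feat & XEN_FEAT_HC_VMCALL))
                /* Old hypervisor: fall back to vendor detection. */
                xen_use_vmmcall =
                        boot_cpu_data.x86_vendor == X86_VENDOR_AMD ||
                        boot_cpu_data.x86_vendor == X86_VENDOR_HYGON;
}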

Enabling the hypercall page is also one of the two points where Xen
will 'latch' that the guest is 64-bit, which affects the layout of the
shared_info, vcpu_info and runstate structures.

The other such latching point is when the guest sets
HVM_PARAM_CALLBACK_IRQ, and I *think* that should work in all
implementations of the Xen ABI (including QEMU/KVM and EC2). But would
want to test.
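
(That second latching point is the guest's HVMOP_set_param call,
roughly:)

#include <xen/interface/hvm/hvm_op.h>
#include <xen/interface/hvm/params.h>

static int set_callback_via(uint64_t via)
{
        struct xen_hvm_param a = {
                .domid = DOMID_SELF,
                .index = HVM_PARAM_CALLBACK_IRQ,
                .value = via,
        };

        /* Xen latches the guest's bitness here as well as when the
         * hypercall page MSR is written. */
        return HYPERVISOR_hvm_op(HVMOP_set_param, &a);
}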

But perhaps it wouldn't hurt for maximal compatibility for Linux to set
the hypercall page *anyway*, even if Linux doesn't then use it — or
only uses it during early boot?
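
(Concretely, "setting it anyway" is just one MSR write; the MSR index
comes from CPUID leaf base+2. Sketch, with an invented page name:)

static char latch_page[PAGE_SIZE] __aligned(PAGE_SIZE);

static void xen_latch_64bit(void)
{
        /* Leaf base+2: EAX = number of hypercall pages,
         *              EBX = MSR index for setting them up. */
        uint32_t msr = cpuid_ebx(0x40000000 + 2);

        /* Writing the page's physical address makes Xen fill it in
         * and, as a side effect, latch the guest as 64-bit. The page
         * need never be executed and can be discarded afterwards. */
        wrmsrl(msr, __pa_symbol(latch_page));
}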

I see potential problems with that approach when someone is using an
out-of-tree module doing hypercalls.

With the hypercall page present, such a module would add a way to do
speculative attacks, while deleting the hypercall page would make
loading such a module fail.

Is that a response to the original patch series, or to my suggestion?

If we temporarily ask Xen to populate a hypercall page which is used
during early boot (or even if it's *not* used at all, and exists only to
make sure Xen latches 64-bit mode early)... I don't see why that makes
any difference to modules. I wasn't suggesting we keep it around and
*export* it.

Ah, I didn't read your suggestion that way.

Still, I believe using the hypercall page is not a good idea, especially
as it would tie the ability to enable CFI in the kernel to the switch
from the hypercall page to the new direct hypercall functions.

Are you suggesting that you're able to enable the CPU-specific CFI
protections before you even know whether it's an Intel or AMD CPU?

Not before that, but maybe rather soon afterwards. And the hypercall
page needs to be decommissioned before the next hypercall happens. The
question is whether we have a hook in place to do that switch between
CPU identification and CFI enabling.
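
(In other words, reusing the invented names from the sketches above,
the required ordering would be something like the following; the hook
name is made up, and whether a suitable spot exists is exactly the open
question:)

void __init xen_after_cpu_identify(void)
{
        /* Vendor now known: stop using the Xen-written page... */
        do_hypercall = xen_use_vmmcall ? hcall_vmmcall : hcall_vmcall;

        /* ...and decommission the page before the next hypercall, so
         * no caller (including out-of-tree modules) can reach it. */
}

/* Only after this may CFI enforcement be enabled, since the code Xen
 * wrote into the page carries no CFI annotations. */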


Juergen
