On 2025-04-11 07:35, Ard Biesheuvel wrote:
On Thu, 10 Apr 2025 at 23:49, Andrew Cooper <andrew.coop...@citrix.com> wrote:

On 10/04/2025 8:50 pm, Jason Andryuk wrote:
A Xen PVH dom0 on an AMD processor triple faults early in boot on
6.6.86.  CPU detection appears to fail, as the faulting instruction is
vmcall in xen_hypercall_intel() and not vmmcall in xen_hypercall_amd().

Detection fails because __xen_hypercall_setfunc() returns the full
kernel mapped address of xen_hypercall_amd() or xen_hypercall_intel() -
e.g. 0xffffffff815b93f0.  But this is compared against the rip-relative
xen_hypercall_amd(%rip), which, when running from the identity mapping, is
only 0x015b93f0.

Replace the rip-relative address with just loading the actual address to
restore the proper comparison.

This only seems to affect PVH dom0 boot.  This is probably because the
XENMEM_memory_map hypercall is issued early on from the identity
mappings.  With a domU, the memory map is provided via hvm_start_info
and the hypercall is skipped.  The domU is probably running from the
kernel high mapping when it issues hypercalls.

Signed-off-by: Jason Andryuk <jason.andr...@amd.com>
---
I think this sort of address mismatch would be addressed by
e8fbc0d9cab6 ("x86/pvh: Call C code via the kernel virtual mapping")

That could be backported instead, but it depends on a fair number of
patches.

I've just spoken to Ard, and he thinks that it's standalone.  Should be
ok to backport as a fix.


I've tried building and booting 6.6.y with the patch applied - GS will
still be set to the 1:1 mapped address but that shouldn't matter,
given that it is only used for the stack canary, and we don't do
address comparisons on that afaik.

Yes, it seems to work - I tested with dom0 and it booted. I removed the use of phys_base - the diff is included below. Does that match what you did?

I'm not sure how getting a patch into 6.6 alone would work.  This patch
could go into upstream Linux, though it's not strictly necessary when the
rip-relative address is a high address.

Do we know which other trees are broken?  I only found 6.6 because I was
messing around with other bits of CI that happen to use 6.6.


I'd assume all trees that had the hypercall page removal patch
backported to them will be broken in the same way.

Yes, I think so.  Looks like it went back to 5.10 but not to 5.4.

Ard, I can submit the stable request unless you want to.

Regards,
Jason

diff --git a/arch/x86/platform/pvh/head.S b/arch/x86/platform/pvh/head.S
index c4365a05ab83..9bf4cc04f079 100644
--- a/arch/x86/platform/pvh/head.S
+++ b/arch/x86/platform/pvh/head.S
@@ -100,7 +100,11 @@ SYM_CODE_START_LOCAL(pvh_start_xen)
        xor %edx, %edx
        wrmsr

-       call xen_prepare_pvh
+       /* Call xen_prepare_pvh() via the kernel virtual mapping */
+       leaq xen_prepare_pvh(%rip), %rax
+       addq $__START_KERNEL_map, %rax
+       ANNOTATE_RETPOLINE_SAFE
+       call *%rax

        /* startup_64 expects boot_params in %rsi. */
        mov $_pa(pvh_bootparams), %rsi
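
For reference, the address arithmetic behind the leaq/addq pair in the diff
can be sketched as below. This is only an illustration, not kernel code: the
helper name to_kernel_virtual() is made up, the two addresses are the example
values from the patch description, and it assumes the kernel sits at its
link-time physical address (so no phys_base adjustment is needed).

```python
# Illustrative sketch only; to_kernel_virtual() is a hypothetical helper.
START_KERNEL_MAP = 0xFFFFFFFF80000000  # x86-64 __START_KERNEL_map
MASK64 = (1 << 64) - 1                 # emulate 64-bit register wraparound


def to_kernel_virtual(identity_addr):
    """Mimic 'leaq sym(%rip), %rax; addq $__START_KERNEL_map, %rax'.

    Assumes the kernel is loaded at its link-time physical address.
    """
    return (identity_addr + START_KERNEL_MAP) & MASK64


# Example values quoted in the patch description.
identity = 0x015B93F0          # xen_hypercall_amd(%rip) from the 1:1 mapping
linked = 0xFFFFFFFF815B93F0    # value returned by __xen_hypercall_setfunc()

# The raw comparison fails, which is why the AMD path is never selected.
print(identity == linked)
# After rebasing onto the kernel virtual mapping the two values agree.
print(to_kernel_virtual(identity) == linked)
```

Once xen_prepare_pvh() is called through the high mapping, rip-relative
references inside it should resolve to the same high addresses that
__xen_hypercall_setfunc() returns.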

