On Fri, 11 Apr 2025 at 16:28, Jason Andryuk <jason.andr...@amd.com> wrote:
>
>
>
> On 2025-04-11 07:35, Ard Biesheuvel wrote:
> > On Thu, 10 Apr 2025 at 23:49, Andrew Cooper <andrew.coop...@citrix.com> 
> > wrote:
> >>
> >> On 10/04/2025 8:50 pm, Jason Andryuk wrote:
> >>> A Xen PVH dom0 on an AMD processor triple faults early in boot on
> >>> 6.6.86.  CPU detection appears to fail, as the faulting instruction is
> >>> vmcall in xen_hypercall_intel() and not vmmcall in xen_hypercall_amd().
> >>>
> >>> Detection fails because __xen_hypercall_setfunc() returns the full
> >>> kernel mapped address of xen_hypercall_amd() or xen_hypercall_intel() -
> >>> e.g. 0xffffffff815b93f0.  But this is compared against the rip-relative
> >>> xen_hypercall_amd(%rip), which when running from identity mapping, is
> >>> only 0x015b93f0.
> >>>
> >>> Replace the rip-relative address with just loading the actual address to
> >>> restore the proper comparision.
> >>>
> >>> This only seems to affect PVH dom0 boot.  This is probably because the
> >>> XENMEM_memory_map hypercall is issued early on from the identity
> >>> mappings.  With a domU, the memory map is provided via hvm_start_info
> >>> and the hypercall is skipped.  The domU is probably running from the
> >>> kernel high mapping when it issues hypercalls.
> >>>
> >>> Signed-off-by: Jason Andryuk <jason.andr...@amd.com>
> >>> ---
> >>> I think this sort of address mismatch would be addresed by
> >>> e8fbc0d9cab6 ("x86/pvh: Call C code via the kernel virtual mapping")
> >>>
> >>> That could be backported instead, but it depends on a fair number of
> >>> patches.
> >>
> >> I've just spoken to Ard, and he thinks that it's standalone.  Should be
> >> ok to backport as a fix.
> >>
> >
> > I've tried building and booting 6.6.y with the patch applied - GS will
> > still be set to the 1:1 mapped address but that shouldn't matter,
> > given that it is only used for the stack canary, and we don't do
> > address comparisons on that afaik.
>
> Yes, it seems to work - I tested with dom0 and it booted.  I removed the
> use of phys_base - the diff is included below.  Does that match what you
> did?
>

The stable tree maintainers generally prefer the backports to be as
close to the originals as possible, and given that phys_base is
guaranteed to be 0x0, you might as well keep the subtraction.

> >>> Not sure on how getting a patch just into 6.6 would work.  This patch
> >>> could go into upstream Linux though it's not strictly necessary when the
> >>> rip-relative address is a high address.
> >>
> >> Do we know which other trees are broken?  I only found 6.6 because I was
> >> messing around with other bits of CI that happen to use 6.6.
> >>
> >
> > I'd assume all trees that had the hypercall page removal patch
> > backported to them will be broken in the same way.
>
> Yes, I think so.  Looks like it went back to 5.10 but not to 5.4.
>
> Ard, I can submit the stable request unless you want to.
>

Please go ahead.

Reply via email to