On Fri, 11 Apr 2025 at 16:28, Jason Andryuk <jason.andr...@amd.com> wrote: > > > > On 2025-04-11 07:35, Ard Biesheuvel wrote: > > On Thu, 10 Apr 2025 at 23:49, Andrew Cooper <andrew.coop...@citrix.com> > > wrote: > >> > >> On 10/04/2025 8:50 pm, Jason Andryuk wrote: > >>> A Xen PVH dom0 on an AMD processor triple faults early in boot on > >>> 6.6.86. CPU detection appears to fail, as the faulting instruction is > >>> vmcall in xen_hypercall_intel() and not vmmcall in xen_hypercall_amd(). > >>> > >>> Detection fails because __xen_hypercall_setfunc() returns the full > >>> kernel mapped address of xen_hypercall_amd() or xen_hypercall_intel() - > >>> e.g. 0xffffffff815b93f0. But this is compared against the rip-relative > >>> xen_hypercall_amd(%rip), which when running from identity mapping, is > >>> only 0x015b93f0. > >>> > >>> Replace the rip-relative address with just loading the actual address to > >>> restore the proper comparision. > >>> > >>> This only seems to affect PVH dom0 boot. This is probably because the > >>> XENMEM_memory_map hypercall is issued early on from the identity > >>> mappings. With a domU, the memory map is provided via hvm_start_info > >>> and the hypercall is skipped. The domU is probably running from the > >>> kernel high mapping when it issues hypercalls. > >>> > >>> Signed-off-by: Jason Andryuk <jason.andr...@amd.com> > >>> --- > >>> I think this sort of address mismatch would be addresed by > >>> e8fbc0d9cab6 ("x86/pvh: Call C code via the kernel virtual mapping") > >>> > >>> That could be backported instead, but it depends on a fair number of > >>> patches. > >> > >> I've just spoken to Ard, and he thinks that it's standalone. Should be > >> ok to backport as a fix. > >> > > > > I've tried building and booting 6.6.y with the patch applied - GS will > > still be set to the 1:1 mapped address but that shouldn't matter, > > given that it is only used for the stack canary, and we don't do > > address comparisons on that afaik. > > Yes, it seems to work - I tested with dom0 and it booted. I removed the > use of phys_base - the diff is included below. Does that match what you > did? >
The stable tree maintainers generally prefer the backports to be as close to the originals as possible, and given that phys_base is guaranteed to be 0x0, you might as well keep the subtraction. > >>> Not sure on how getting a patch just into 6.6 would work. This patch > >>> could go into upstream Linux though it's not strictly necessary when the > >>> rip-relative address is a high address. > >> > >> Do we know which other trees are broken? I only found 6.6 because I was > >> messing around with other bits of CI that happen to use 6.6. > >> > > > > I'd assume all trees that had the hypercall page removal patch > > backported to them will be broken in the same way. > > Yes, I think so. Looks like it went back to 5.10 but not to 5.4. > > Ard, I can submit the stable request unless you want to. > Please go ahead.