On Thu Apr 10, 2025 at 10:17 AM BST, Andrew Cooper wrote:
> On 10/04/2025 1:09 am, Jason Andryuk wrote:
>> On 2025-04-09 13:01, Andrew Cooper wrote:
>>> On 09/04/2025 5:36 pm, Andrew Cooper wrote:
>>>> Various bits of cleanup, and support for arm64 Linux builds.
>>>>
>>>> Run using the new Linux 6.6.86 on (most) x86, and ARM64:
>>>>
>>>> https://gitlab.com/xen-project/hardware/xen-staging/-/pipelines/1760667411
>>>
>>> Lovely, Linux 6.6.86 is broken for x86 PVH. It triple faults very
>>> early on.
>>>
>>> Sample log:
>>> https://gitlab.com/xen-project/hardware/xen-staging/-/jobs/9673797450
>>>
>>> I guess we'll have to stay on 6.6.56 for now. (Only affects the final
>>> patch.)
>>
>> This is an AMD system:
>>
>> (XEN) [ 2.577549] d0v0 Triple fault - invoking HVM shutdown action 1
>> (XEN) [ 2.577557] RIP: 0008:[<0000000001f851d4>]
>>
>> The instruction:
>> ffffffff81f851d4: 0f 01 c1 vmcall
>>
>> vmcall is the Intel instruction, and vmmcall is the AMD one, so CPU
>> detection is malfunctioning.
>>
>> (Early PVH is running identity mapped, so it's offset from
>> ffffffff80000000)
>>
>> There are no debug symbols in the vmlinux I extracted from the bzImage
>> from gitlab, but I can repro locally with on 6.6.86. It's unclear to
>> me why it's failing.
>>
>> Trying:
>> diff --git i/arch/x86/xen/enlighten.c w/arch/x86/xen/enlighten.c
>> index 0219f1c90202..fb4ad7fe3e34 100644
>> --- i/arch/x86/xen/enlighten.c
>> +++ w/arch/x86/xen/enlighten.c
>> @@ -123,11 +123,10 @@ noinstr void *__xen_hypercall_setfunc(void)
>>         if (!boot_cpu_has(X86_FEATURE_CPUID))
>>                 xen_get_vendor();
>>
>> -       if ((boot_cpu_data.x86_vendor == X86_VENDOR_AMD ||
>> -            boot_cpu_data.x86_vendor == X86_VENDOR_HYGON))
>> -               func = xen_hypercall_amd;
>> -       else
>> +       if (boot_cpu_data.x86_vendor == X86_VENDOR_INTEL )
>>                 func = xen_hypercall_intel;
>> +       else
>> +               func = xen_hypercall_amd;
>>
>>         static_call_update_early(xen_hypercall, func);
>>
>> But it still calls xen_hypercall_intel(). So maybe x86_vendor isn't
>> getting set and ends up as 0 (X86_VENDOR_INTEL)?
>>
>> That's as far as I got here.
>>
>> Different but related, on mainline master, I also get a fail in
>> vmcall. There, I see in the disassembly that
>> __xen_hypercall_setfunc()'s calls to xen_get_vendor() is gone.
>> xen_get_vendor() seems to have been DCE-ed. There is some new code
>> that hardcodes features - "x86/cpufeatures: Add {REQUIRED,DISABLED}
>> feature configs" - which may be responsible.
>
> 6.6.74 is broken too. (That's the revision that the ARM tests want).
> So it broke somewhere between .56 and .74 which narrows the bisect a little.
>
> https://gitlab.com/xen-project/hardware/xen-staging/-/pipelines/1761323774
>
> In Gitlab, both AMD and Intel are failing in roughly the same way.
>
> ~Andrew

I've bisected the tags, and the breakage was introduced somewhere between
the v6.6.66 and v6.6.67 tags. The hypercall page was removed very shortly
before v6.6.67 was tagged, so I have a nagging suspicion...

Cheers,
Alejandro
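
For anyone matching the Xen log against a disassembly, here is a rough
sketch of the address arithmetic and opcode check from Jason's analysis
above. It is only an illustration: the helper names link_to_identity()
and classify_hypercall_insn() are made up for this mail and are not
kernel or Xen functions, and the link base is simply the ffffffff80000000
value quoted in the thread. The opcode bytes are the architectural
encodings of vmcall (0f 01 c1) and vmmcall (0f 01 d9).

```c
#include <stdint.h>
#include <string.h>

/* Assumption: link-time base of the kernel text, as quoted in the thread. */
#define KERNEL_LINK_BASE 0xffffffff80000000ULL

/*
 * Hypothetical helper: early PVH runs identity mapped, so the RIP that
 * Xen reports is the link-time address minus the link base.
 */
static uint64_t link_to_identity(uint64_t link_addr)
{
	return link_addr - KERNEL_LINK_BASE;
}

/*
 * Hypothetical helper: classify a 3-byte hypercall instruction.
 * vmcall is the Intel (VMX) encoding, vmmcall is the AMD (SVM) one.
 */
static const char *classify_hypercall_insn(const uint8_t insn[3])
{
	static const uint8_t vmcall[3]  = { 0x0f, 0x01, 0xc1 };
	static const uint8_t vmmcall[3] = { 0x0f, 0x01, 0xd9 };

	if (memcmp(insn, vmcall, sizeof(vmcall)) == 0)
		return "vmcall";
	if (memcmp(insn, vmmcall, sizeof(vmmcall)) == 0)
		return "vmmcall";
	return "unknown";
}
```

Feeding it the disassembly address ffffffff81f851d4 gives 1f851d4, which
is the RIP in the triple-fault log, and the bytes 0f 01 c1 classify as
vmcall, i.e. the Intel path taken on an AMD box.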