On Thu Apr 10, 2025 at 10:17 AM BST, Andrew Cooper wrote:
> On 10/04/2025 1:09 am, Jason Andryuk wrote:
>> On 2025-04-09 13:01, Andrew Cooper wrote:
>>> On 09/04/2025 5:36 pm, Andrew Cooper wrote:
>>>> Various bits of cleanup, and support for arm64 Linux builds.
>>>>
>>>> Run using the new Linux 6.6.86 on (most) x86, and ARM64:
>>>>   
>>>> https://gitlab.com/xen-project/hardware/xen-staging/-/pipelines/1760667411
>>>
>>> Lovely, Linux 6.6.86 is broken for x86 PVH.  It triple faults very
>>> early on.
>>>
>>> Sample log:
>>> https://gitlab.com/xen-project/hardware/xen-staging/-/jobs/9673797450
>>>
>>> I guess we'll have to stay on 6.6.56 for now.  (Only affects the final
>>> patch.)
>>
>> This is an AMD system:
>>
>> (XEN) [    2.577549] d0v0 Triple fault - invoking HVM shutdown action 1
>> (XEN) [    2.577557] RIP:    0008:[<0000000001f851d4>]
>>
>> The instruction:
>> ffffffff81f851d4:       0f 01 c1                vmcall
>>
>> vmcall is the Intel instruction, and vmmcall is the AMD one, so CPU
>> detection is malfunctioning.
>>
>> (Early PVH is running identity mapped, so it's offset from
>> ffffffff80000000)
>>
>> There are no debug symbols in the vmlinux I extracted from the bzImage
>> from gitlab, but I can repro locally on 6.6.86.  It's unclear to
>> me why it's failing.
>>
>> Trying:
>> diff --git i/arch/x86/xen/enlighten.c w/arch/x86/xen/enlighten.c
>> index 0219f1c90202..fb4ad7fe3e34 100644
>> --- i/arch/x86/xen/enlighten.c
>> +++ w/arch/x86/xen/enlighten.c
>> @@ -123,11 +123,10 @@ noinstr void *__xen_hypercall_setfunc(void)
>>         if (!boot_cpu_has(X86_FEATURE_CPUID))
>>                 xen_get_vendor();
>>
>> -       if ((boot_cpu_data.x86_vendor == X86_VENDOR_AMD ||
>> -            boot_cpu_data.x86_vendor == X86_VENDOR_HYGON))
>> -               func = xen_hypercall_amd;
>> -       else
>> +       if (boot_cpu_data.x86_vendor == X86_VENDOR_INTEL )
>>                 func = xen_hypercall_intel;
>> +       else
>> +               func = xen_hypercall_amd;
>>
>>         static_call_update_early(xen_hypercall, func);
>>
>> But it still calls xen_hypercall_intel().  So maybe x86_vendor isn't
>> getting set and ends up as 0 (X86_VENDOR_INTEL)?
>>
>> That's as far as I got here.
>>
>> Different but related, on mainline master, I also get a fail in
>> vmcall. There, I see in the disassembly that
>> __xen_hypercall_setfunc()'s call to xen_get_vendor() is gone. 
>> xen_get_vendor() seems to have been DCE-ed.  There is some new code
>> that hardcodes features - "x86/cpufeatures: Add {REQUIRED,DISABLED}
>> feature configs" - which may be responsible.
>
> 6.6.74 is broken too.  (That's the revision that the ARM tests want). 
> So it broke somewhere between .56 and .74 which narrows the bisect a little.
>
> https://gitlab.com/xen-project/hardware/xen-staging/-/pipelines/1761323774
>
> In Gitlab, both AMD and Intel are failing in roughly the same way.
>
> ~Andrew

I've bisected the tags and it was introduced somewhere between the
v6.6.66 and v6.6.67 tags.

The hypercall page was removed very shortly before v6.6.67 was tagged,
so I have a nagging suspicion...
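
For anyone decoding similar logs, here is a rough userspace sketch (not
kernel code) of the two checks used earlier in the thread: mapping the
faulting identity-mapped RIP back to its vmlinux link address, and
working out which hypercall instruction a given CPUID vendor string
should get. The helper names are made up for illustration, and the
sketch assumes the kernel is linked at ffffffff80000000 and was not
relocated.

```c
#include <stdint.h>
#include <string.h>

/* Assumption: link base of the kernel text mapping, as quoted in the
 * thread; early PVH runs identity mapped below it, unrelocated. */
#define START_KERNEL_MAP 0xffffffff80000000ULL

/* Hypothetical helper: turn a faulting early-boot RIP from the Xen log
 * into the address to look up in a vmlinux disassembly. */
static uint64_t early_rip_to_link_addr(uint64_t rip)
{
    return rip + START_KERNEL_MAP;
}

/* Hypothetical helper: pick the hypercall instruction for a CPUID
 * vendor string. Mirrors the original check in the quoted diff:
 * AMD and Hygon use vmmcall, everything else falls back to vmcall. */
static const char *hypercall_insn_for_vendor(const char *vendor)
{
    if (strcmp(vendor, "AuthenticAMD") == 0 ||
        strcmp(vendor, "HygonGenuine") == 0)
        return "vmmcall";   /* encoded as 0f 01 d9 */
    return "vmcall";        /* encoded as 0f 01 c1 */
}
```

Feeding in the RIP from the log, 0x1f851d4, gives ffffffff81f851d4,
which is where the vmcall shows up in the disassembly, while
"AuthenticAMD" maps to vmmcall, so the instruction found there is the
wrong one for that box.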

Cheers,
Alejandro
