Oh cool, thanks a lot for the explanation.
I added the "vzeroupper" line and Xen crashes, so it looks like the CPUID
emulation is buggy. I was also able to try it in a VM (same Debian
testing) running under virt-manager+KVM, and it works fine there (Xen in
debug mode). I will have a look by printing the xstate when running under
virt-manager+KVM, and I will also run the xen-cpuid command to see the
difference, just out of curiosity, since your test has already pinpointed
the issue.
Thanks again for the enlightenment. I will continue my testing later today,
and if you need me to test something else, just ask and I will do my best.

Guillaume

On Sun, Feb 2, 2025 at 6:32 PM Andrew Cooper <andrew.coop...@citrix.com>
wrote:

> On 02/02/2025 4:58 pm, Guillaume wrote:
> > I attached the output of the `xl dmesg`. This is the 4.19.1 kernel I
> > rebuilt but I have the same issue with master (just for info).
>
> Thanks.  This is a TigerLake CPU, and:
>
> > (XEN) Mitigating GDS by disabling AVX while virtualised - protections
> > are best-effort
>
> is why Xen is ignoring AVX.
>
> Now, as to the bug.  From the panic line, you're seeing:
>
> > XSTATE 0x0000000000000003, uncompressed hw size 0x340 != xen size 0x240
>
> xstate is XCR0_SSE | XCR0_X87, and the correct size for this
> configuration is 0x240.
>
> The reason why it matters is because this is the amount of data the
> processor will write out/read in for the XSAVE/XRSTOR instructions,
> which are used for context switching.  These instructions are also
> available in userspace.
>
> Here, VirtualBox is claiming that with AVX disabled, it will still write
> out the AVX registers.  This is buggy, but we're going to have to narrow
> it down further.
>
> Can you try building Xen with this additional line
>
> diff --git a/xen/arch/x86/xstate.c b/xen/arch/x86/xstate.c
> index af9e345a7ace..5a5011ba8b10 100644
> --- a/xen/arch/x86/xstate.c
> +++ b/xen/arch/x86/xstate.c
> @@ -789,6 +789,8 @@ static void __init noinline xstate_check_sizes(void)
>       */
>      check_new_xstate(&s, X86_XCR0_SSE | X86_XCR0_X87);
>
> +    asm volatile ("vzeroupper");
> +
>      if ( cpu_has_avx )
>          check_new_xstate(&s, X86_XCR0_YMM);
>
>
> and see if the result crashes or boots?
>
> One possible bug is that VirtualBox is shadowing XCR0 and the real
> setting in hardware is 0x7 (including XCR0_AVX) rather than 0x3.  In
> this case, the reported size is correct, and VirtualBox is failing to
> honour the XSETBV setting.
>
> Alternatively, another bug is that XCR0 is really 0x3, but the CPUID
> emulation for max size is wrong, in which case the XSAVE/etc
> instructions won't actually access beyond 0x240, and "all" that's wrong
> is that we'll allocate a larger buffer than necessary.
>
> The VZEROUPPER (an AVX instruction) should distinguish these two cases.
> If Xen crashes with it in place, then the XCR0 register is correct and
> it's CPUID which is buggy.  If Xen boots with that in place, then
> VirtualBox is shadowing XCR0 with a different value behind Xen's back.
>
> ~Andrew
>
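
For reference, the two sizes in the panic line can be reproduced by hand from
the architectural XSAVE layout. This is only a rough sketch, assuming the
standard (uncompressed) format: a 512-byte legacy x87/SSE region, a 64-byte
XSAVE header, and the YMM component at offset 576 with a size of 256 bytes.
The names xsave_uncompressed_size() and reported_size_matches() are made up
for illustration and are not Xen functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* XCR0 feature bits (architectural values). */
#define XCR0_X87 (1ULL << 0)
#define XCR0_SSE (1ULL << 1)
#define XCR0_YMM (1ULL << 2)

/*
 * Hypothetical helper, not Xen code: uncompressed XSAVE area size for an
 * XCR0 value restricted to the x87, SSE and YMM components.
 */
static unsigned int xsave_uncompressed_size(uint64_t xcr0)
{
    /* This sketch only knows about the first three components. */
    assert((xcr0 & ~(XCR0_X87 | XCR0_SSE | XCR0_YMM)) == 0);

    /* Legacy region (512 bytes) + XSAVE header (64 bytes) = 0x240. */
    unsigned int size = 512 + 64;

    /* YMM state sits at offset 576 and is 256 bytes long: 0x340 in total. */
    if (xcr0 & XCR0_YMM)
        size = 576 + 256;

    return size;
}

/*
 * Hypothetical helper: does a size reported by CPUID agree with what the
 * layout above predicts for this XCR0 value?
 */
static bool reported_size_matches(uint64_t xcr0, unsigned int reported)
{
    return reported == xsave_uncompressed_size(xcr0);
}
```

With xcr0 = 0x3 (XCR0_X87 | XCR0_SSE) this gives 0x240, and with YMM added
(0x7) it gives 0x340. So the 0x340 reported under VirtualBox is the size that
would be expected for 0x7, not for the 0x3 that Xen wrote to XCR0.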
