On 30.12.2025 14:54, Andrew Cooper wrote:
> --- a/xen/arch/x86/xstate.c
> +++ b/xen/arch/x86/xstate.c
> @@ -310,21 +310,21 @@ void xsave(struct vcpu *v, uint64_t mask)
>      uint32_t hmask = mask >> 32;
>      uint32_t lmask = mask;
>      unsigned int fip_width = v->domain->arch.x87_fip_width;
> -#define XSAVE(pfx) \
> -        if ( v->arch.xcr0_accum & XSTATE_XSAVES_ONLY ) \
> -            asm volatile ( ".byte " pfx "0x0f,0xc7,0x2f\n" /* xsaves */ \
> -                           : "=m" (*ptr) \
> -                           : "a" (lmask), "d" (hmask), "D" (ptr) ); \
> -        else \
> -            alternative_io(".byte " pfx "0x0f,0xae,0x27\n", /* xsave */ \
> -                           ".byte " pfx "0x0f,0xae,0x37\n", /* xsaveopt */ \
> -                           X86_FEATURE_XSAVEOPT, \
> -                           "=m" (*ptr), \
> -                           "a" (lmask), "d" (hmask), "D" (ptr))
> +
> +#define XSAVE(pfx)                                                      \
> +    if ( v->arch.xcr0_accum & XSTATE_XSAVES_ONLY )                      \
> +        asm volatile ( "xsaves %0"                                      \
> +                       : "=m" (*ptr)                                    \
> +                       : "a" (lmask), "d" (hmask) );                    \
> +    else                                                                \
> +        alternative_io("xsave %0",                                      \
> +                       "xsaveopt %0", X86_FEATURE_XSAVEOPT,             \
> +                       "=m" (*ptr),                                     \
> +                       "a" (lmask), "d" (hmask))

While no doubt neater to read this way, there's a subtle latent issue here:
"m" doesn't exclude RIP-relative addressing, yet that addressing form can't
be used in replacement code (up and until we leverage your decode-lite to
actually be able to fix up the displacement). Sadly "o" as a constraint
doesn't look to be any different in this regard (I think it should be, as
adding a "small integer" may already bring the displacement beyond the
permitted range, but their definition of "offsettable" allows this).

This issue is latent until such time that (a) a caller appears passing in
the address of a Xen-internal variable and (b) we make LTO work again.
Since the breakage would be impossible to notice at build time, I think we
would be better off if we avoided it from the beginning. Which may mean
sacrificing on code gen, by using "r" and then "(%0)" as the insn operand.

> @@ -489,17 +484,17 @@ void xrstor(struct vcpu *v, uint64_t mask)
>              ptr->xsave_hdr.xcomp_bv = 0;
>          }
>          memset(ptr->xsave_hdr.reserved, 0, sizeof(ptr->xsave_hdr.reserved));
> -        continue;
> +        goto retry;
>  
>      case 2: /* Stage 2: Reset all state. */
>          ptr->fpu_sse.mxcsr = MXCSR_DEFAULT;
>          ptr->xsave_hdr.xstate_bv = 0;
>          ptr->xsave_hdr.xcomp_bv = v->arch.xcr0_accum & XSTATE_XSAVES_ONLY
>              ? XSTATE_COMPACTION_ENABLED : 0;
> -        continue;
> -    }
> +        goto retry;
>  
> -        domain_crash(current->domain);
> +    default: /* Stage 3: Nothing else to do. */
> +        domain_crash(v->domain, "Uncorrectable XRSTOR fault\n");
>          return;

There's an unexplained change here as to which domain is being crashed.
You switch to crashing the subject domain, yet if that's not also the
requester, it isn't "guilty" in causing the observed fault.

Jan

Reply via email to