On 23/08/2022 07:42, Jan Beulich wrote:
> While the SDM isn't very clear about this, our present behavior make
> Linux 5.19 unhappy. As of commit 8ad7e8f69695 ("x86/fpu/xsave: Support
> XSAVEC in the kernel") they're using this CPUID output also to size
> the compacted area used by XSAVEC. Getting back zero there isn't really
> liked, yet fpr PV that's the default on capable hardware: XSAVES isn't

for.

> exposed to PV domains.
>
> Considering that the size reported is that of the compacted save area,
> I view Linux'es assumption as appropriate (short of the SDM properly
> considering the case). Therefore we need to populate the field also when
> only XSAVEC is supported for a guest.

This is a mess.  The SDM is fairly clear (but only in Vol1) that this
leaf is specific to XSAVES.

The APM has only an equation, which shows it as the compacted size
without reference to instructions.

Ideally I'd like the opinion from some architects and a clarification to
the SDM...

> Fixes: 460b9a4b3630 ("x86/xsaves: enable xsaves/xrstors for hvm guest")
> Fixes: 8d050ed1097c ("x86: don't expose XSAVES capability to PV guests")
> Signed-off-by: Jan Beulich <jbeul...@suse.com>

CC Marek.  Looks like Jan has found the issue you reported on IRC.

Jan: Be aware that I submitted
https://lore.kernel.org/lkml/20220810221909.12768-1-andrew.coop...@citrix.com/
to Linux to correct some of the diagnostics.

> ---
> I actually wonder why we surface the XSAVES feature bit to HVM domains,
> when we don't support any of the features.

Because that's what was originally accepted into Xen, and I couldn't
retract it when fixing CPUID handling at first because it would regress
across migrate to a newer Xen.

With CPUID data now in the migration stream, we could in principle fix
it, but at this point it's definitely not worth the complexity or risk
to adjust.

> It's solely because of this
> that by default only PV domains are affected by the issue (HVM would be
> affected only when XSAVES was hidden via guest config settings).
> Wouldn't we better mask the bit (e.g. in recalculate_xstate()) when we
> find that no features requiring XSAVES are visible to the domain? That
> would likely come closer to real hardware, which pretty certainly won't
> offer XSAVES without also offering at least one dependent feature.
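
For reference, a minimal sketch of the calculation behind the field
under discussion.  The compacted size is the 512-byte legacy region plus
the 64-byte XSAVE header, followed by each enabled component (index 2
and up) in index order, rounded up to a 64-byte boundary first when the
component asks for it.  The struct, the function name and the component
table are made up for illustration and are not Xen's code; only the
256-byte size of component 2 (AVX) reflects real hardware.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-component data in CPUID.(0xD, i). */
struct xstate_component {
    uint32_t size;   /* EAX: size of the component in bytes */
    int align64;     /* ECX bit 1: 64-byte align when compacted */
};

#define XSAVE_LEGACY_SIZE 512u /* x87/SSE legacy region */
#define XSAVE_HDR_SIZE     64u /* XSAVE header */

/*
 * Illustrative table.  Index 2 matches AVX (256 bytes); indices 3 and 4
 * are invented to exercise the alignment rule.
 */
static const struct xstate_component example_comps[] = {
    [2] = { .size = 256, .align64 = 0 },
    [3] = { .size = 8,   .align64 = 0 },
    [4] = { .size = 64,  .align64 = 1 },
};

/* Compacted area size for the components selected by 'mask'. */
static uint32_t compacted_size(const struct xstate_component *c,
                               unsigned int nr, uint64_t mask)
{
    uint32_t size = XSAVE_LEGACY_SIZE + XSAVE_HDR_SIZE;

    assert(nr <= 64);

    /* Components 0 and 1 (x87, SSE) live in the legacy region. */
    for ( unsigned int i = 2; i < nr; i++ )
    {
        if ( !(mask & (1ull << i)) )
            continue;

        if ( c[i].align64 )
            size = (size + 63) & ~63u;

        size += c[i].size;
    }

    return size;
}
```

With this table, a mask of x87|SSE|AVX (0x7) gives 512 + 64 + 256 = 832
bytes.  Which mask feeds the calculation when only XSAVEC is visible to
the guest is exactly the open question above.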
>
> --- a/xen/arch/x86/cpuid.c
> +++ b/xen/arch/x86/cpuid.c
> @@ -1142,7 +1142,7 @@ void guest_cpuid(const struct vcpu *v, u
>          switch ( subleaf )
>          {
>          case 1:
> -            if ( p->xstate.xsaves )
> +            if ( p->xstate.xsavec || p->xstate.xsaves )

If we're doing this, then it wants to be xsavec only, with the comment
being extended to explain why.

But this is going to further complicate my several-year-old series
trying to get Xen's XSTATE handling into a position where we can start
to offer supervisor states.

>              {
>                  /*
>                   * TODO: Figure out what to do for XSS state.  VT-x manages