On 11/07/2024 12:09 pm, Fonyuy-Asheri Caleb wrote:
> ----- Original Message -----
>> From: "Andrew Cooper" <andrew.coop...@citrix.com>
>> To: "Fonyuy-Asheri Caleb" <fonyuy-asheri.ca...@inria.fr>, "xen-devel"
>> <xen-devel@lists.xenproject.org>
>> Sent: Thursday, July 11, 2024 12:45:18 PM
>> Subject: Re: Help with Understanding vcpu xstate restore error during vm
>> migration
>> On 11/07/2024 11:38 am, Fonyuy-Asheri Caleb wrote:
>>> Hello,
>>>
>>> I am trying to understand the causes of the vcpu xstate restore error
>>> during live migration.
>>> I get the following error during live migration:
>>>
>>> xc: error: Failed to set vcpu0's xsave info (22 = Invalid argument):
>>> Internal error
>>>
>>> I was able to locate the failure point to the file
>>> xen/arch/x86/domctl.c with the following check.
>>>
>>>     if ( evc->size < PV_XSAVE_HDR_SIZE ||
>>>          evc->size > PV_XSAVE_SIZE(xfeature_mask) )
>>>         goto vcpuextstate_out;
>>>
>>> I know this is related to the number of xstates handled by the source
>>> server. Please can
>>> someone explain to me how these states are computed?
>>>
>>> I earlier thought it was simply the number of xsave-dependent features on
>>> the CPU but it seems
>>> to be more than that.
>>>
>>> Thanks in advance.
>> It is certainly more complicated than that.
>>
>> What that's saying is that Xen doesn't think that the size of the blob
>> matches expectations. That said - I'm in the middle of rewriting this
>> logic because lots of it is subtly wrong.
> Please do you mind giving me more insight on the logic currently implemented
> and maybe what is wrong with it? It will be important for me since what I'm
> doing is research work.

See 9e6dbbe8bf40^..267122a24c49

> How do the values evc->size and xfeature_mask relate to the source and target
> processor xstates (or xstate management)?

The lower bounds check is for normal reasons, while the upper bounds
check is a sanity "does this image appear to have more states active
than the current system".

The upper bound is bogus, because "what this VM has" has no true
relationship to "what Xen decided to turn on by default at boot".

>> To start with, which version (or versions?) of Xen, and what hardware?
> Xen version 4.18.3-pre

As you're not on a specific tag, exact changeset?  Not that it likely
matters - there shouldn't be anything relevant in staging-4.18 since
RELEASE-4.18.2 as far as this goes.  There are backports of two
bugfixes, but in a way that should be no practical change on 4.18.

> My CPU is : Intel(R) Xeon(R) CPU E5-2630 v3 @ 2.40GHz

Ok, so Haswell.

Let me stare at the CPUID dumps and see if anything stands out.

~Andrew
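
For illustration only, the shape of the bounds check discussed above can
be sketched in standalone C.  This is not the code in
xen/arch/x86/domctl.c.  Every sketch_* name is hypothetical, the header
is assumed to be two 64-bit fields ahead of the XSAVE image, and the
component offsets and sizes are typical Intel values hard-coded for the
example, where real code would enumerate them from CPUID leaf 0xD.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: two 64-bit fields precede the XSAVE image in the blob. */
#define SKETCH_HDR_SIZE (2 * sizeof(uint64_t))

/* One XSAVE state component in the standard (uncompacted) layout. */
struct sketch_component {
    unsigned int bit;   /* bit position in XCR0 */
    size_t offset;      /* byte offset within the XSAVE area */
    size_t size;        /* size in bytes */
};

/*
 * Illustrative values only (typical on Intel parts).  Real code must
 * read these from CPUID leaf 0xD rather than hard-coding them.
 */
static const struct sketch_component sketch_components[] = {
    { 2,  576,  256 },  /* AVX: upper halves of YMM registers */
    { 5, 1088,   64 },  /* AVX-512 opmask registers */
    { 6, 1152,  512 },  /* AVX-512 ZMM_Hi256 */
    { 7, 1664, 1024 },  /* AVX-512 Hi16_ZMM */
};

/* Size of an uncompacted XSAVE area covering the features in xcr0. */
static size_t sketch_xstate_size(uint64_t xcr0)
{
    /* 512-byte legacy x87/SSE region plus the 64-byte XSAVE header. */
    size_t size = 512 + 64;
    size_t i;

    for ( i = 0; i < sizeof(sketch_components) / sizeof(sketch_components[0]); i++ )
    {
        const struct sketch_component *c = &sketch_components[i];

        if ( (xcr0 & (1ULL << c->bit)) && c->offset + c->size > size )
            size = c->offset + c->size;
    }

    return size;
}

/*
 * Same shape as the check quoted in the thread: reject a blob smaller
 * than the header, or larger than the host's feature mask allows.
 */
static bool sketch_size_ok(size_t blob_size, uint64_t host_mask)
{
    return blob_size >= SKETCH_HDR_SIZE &&
           blob_size <= SKETCH_HDR_SIZE + sketch_xstate_size(host_mask);
}
```

Under these assumptions, a host mask of x87|SSE|AVX (0x7, which is what
a Haswell part would have) accepts images up to the AVX-sized area and
rejects an image sized for AVX-512 state.  That is the failure mode the
upper bound produces: the comparison is against the host's mask, not
against what the VM was actually given.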