On 11.01.2021 09:45, Paul Durrant wrote:
>> -----Original Message-----
>> From: Igor Druzhinin <igor.druzhi...@citrix.com>
>> Sent: 09 January 2021 00:56
>> To: p...@xen.org; xen-devel@lists.xenproject.org
>> Cc: w...@xen.org; i...@xenproject.org; anthony.per...@citrix.com; 
>> andrew.coop...@citrix.com;
>> george.dun...@citrix.com; jbeul...@suse.com; jul...@xen.org; 
>> sstabell...@kernel.org;
>> roger....@citrix.com
>> Subject: Re: [PATCH 1/2] viridian: remove implicit limit of 64 VPs per 
>> partition
>>
>> On 08/01/2021 08:32, Paul Durrant wrote:
>>>> -----Original Message-----
>>>> From: Igor Druzhinin <igor.druzhi...@citrix.com>
>>>> Sent: 08 January 2021 00:47
>>>> To: xen-devel@lists.xenproject.org
>>>> Cc: p...@xen.org; w...@xen.org; i...@xenproject.org; 
>>>> anthony.per...@citrix.com;
>>>> andrew.coop...@citrix.com; george.dun...@citrix.com; jbeul...@suse.com; 
>>>> jul...@xen.org;
>>>> sstabell...@kernel.org; roger....@citrix.com; Igor Druzhinin 
>>>> <igor.druzhi...@citrix.com>
>>>> Subject: [PATCH 1/2] viridian: remove implicit limit of 64 VPs per 
>>>> partition
>>>>
>>>> TLFS 7.8.1 stipulates that "a virtual processor index must be less
>>>> than the maximum number of virtual processors per partition" that
>>>> "can be obtained through CPUID leaf 0x40000005". Furthermore,
>>>> "Requirements for Implementing the Microsoft Hypervisor Interface"
>>>> defines that starting from Windows Server 2012, which allowed more
>>>> than 64 CPUs to be brought up, this leaf may now contain the value
>>>> -1, meaning the hypervisor has no restriction, while 0 (which we
>>>> currently expose) means the default restriction is still present.
>>>>
>>>> Along with the previous changes exposing ExProcessorMasks, this
>>>> allows a recent Windows VM with the Viridian extension enabled to
>>>> have more than 64 vCPUs without an immediate BSOD.
>>>>
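As an aside, a minimal sketch of how a hypervisor might populate CPUID
leaf 0x40000005 under the 0 vs. -1 semantics described in the commit
message above; all names here are illustrative, not Xen's actual code:

/* Sketch only: populate the Hyper-V "implementation limits" leaf. */
#include <stdint.h>
#include <stdio.h>

#define CPUID_HV_IMPL_LIMITS 0x40000005u

struct cpuid_leaf {
    uint32_t a, b, c, d;
};

/* EAX of leaf 0x40000005 reports the maximum number of virtual
 * processors per partition:
 *   0  -> nothing exposed; Windows falls back to the historical
 *         64-VP default limit,
 *   -1 -> no restriction, required for guests bringing up more
 *         than 64 CPUs. */
static void fill_hv_impl_limits(struct cpuid_leaf *res, int no_vp_limit)
{
    res->a = no_vp_limit ? UINT32_MAX : 0;
    res->b = res->c = res->d = 0; /* other limits: not exposed */
}

int main(void)
{
    struct cpuid_leaf leaf;

    fill_hv_impl_limits(&leaf, 1);
    printf("leaf %#x: eax=%#x\n", CPUID_HV_IMPL_LIMITS, (unsigned)leaf.a);
    return 0;
}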
>>>
>>> This is very odd as I was happily testing with a 128 vCPU VM once
>>> ExProcessorMasks was implemented... no need for the extra leaf.
>>> The documentation for 0x40000005 states "Describes the scale limits
>>> supported in the current hypervisor implementation. If any value is
>>> zero, the hypervisor does not expose the corresponding information".
>>> It does not say (in section 7.8.1 or elsewhere AFAICT) what
>>> implications that has for VP_INDEX.
>>>
>>> In " Requirements for Implementing the Microsoft Hypervisor Interface" I 
>>> don't see anything saying
>> what the semantics of not
>>> implementing leaf 0x40000005 are, only that if implementing it then -1 must 
>>> be used to break the 64
>> VP limit. It also says that
>>> reporting -1 in 0x40000005 means that 40000004.EAX bits 1 and 2 *must* be 
>>> clear, which is clearly
>> not true if ExProcessorMasks is
>>> enabled. That document is dated June 13th 2012 so I think it should be 
>>> taken with a pinch of salt.
>>>
>>> Have you actually observed a BSOD with >64 vCPUs when
>>> ExProcessorMasks is enabled? If so, which version of Windows? I'd
>>> like to get a repro myself.
>>
>> I return with more testing that confirms both my and your results.
>>
>> 1) with ExProcessorMasks and 66 vCPUs assigned, both the current WS19
>> build and pre-release build 20270 of vNext boot correctly with no changes
>>
>> and that would be fine, but the existing documentation for
>> ex_processor_masks states:
>> "Hence this enlightenment must be specified for guests with more
>> than 64 vCPUs if B<hcall_remote_tlb_flush> and/or B<hcall_ipi> are also
>> specified."
>>
>> You would then expect a 64+ vCPU VM to boot without ex_processor_masks,
>> hcall_remote_tlb_flush and hcall_ipi.
> 
> Indeed.
> 
>>
>> So,
>> 2) without ExProcessorMasks and 66 vCPUs assigned, and with
>> hcall_remote_tlb_flush and hcall_ipi disabled: WS19 BSODs and vNext
>> fails to initialize secondary CPUs
>>
> 
> This is not what I recall from testing, but I can confirm I see the same now.
> 
>> After applying my change,
>> 3) without ExProcessorMasks and 66 vCPUs assigned, and with
>> hcall_remote_tlb_flush and hcall_ipi disabled: WS19 and vNext boot
>> correctly
>>
>> So we either need to change the documentation and require
>> ExProcessorMasks for all VMs with 64+ vCPUs, or go with my change.
>>
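For reference, the configurations tested above correspond to xl
guest-config fragments along these lines (a sketch using the viridian
group names from the xl.cfg documentation quoted earlier; the rest of
the config is omitted):

# (1) 66 vCPUs with ExProcessorMasks and the hypercall
#     enlightenments: boots on both WS19 and vNext
vcpus    = 66
viridian = [ "base", "hcall_remote_tlb_flush", "hcall_ipi",
             "ex_processor_masks" ]

# (2)/(3) 66 vCPUs without ExProcessorMasks and with the two
#     hypercall enlightenments left out: BSODs before the patch,
#     boots once leaf 0x40000005 reports -1
vcpus    = 66
viridian = [ "base" ]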
> 
> I think we go with your change. I guess we can conclude then that the 
> separate viridian flag as part of the base set is the right way to go (to 
> stop the leaf magically appearing on migrate of existing guests) and leave 
> ex_processor_masks (and the documentation) as-is.
> 
> You can add my R-b to the patch.
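If the leaf is gated behind its own enlightenment as suggested, enabling
it for a new guest might look like the fragment below; the option name
no_vp_limit is hypothetical here, standing in for whatever flag the
final patch defines:

# Hypothetical new viridian group gating leaf 0x40000005, so the
# leaf does not magically appear on migrate of existing guests
vcpus    = 128
viridian = [ "defaults", "ex_processor_masks", "no_vp_limit" ]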

That's the unchanged patch then, including the libxl change that
I had asked about, and for which I have to admit I don't fully
follow Igor's responses? I'm hesitant to give an ack for that
aspect of the change, yet I suppose the libxl maintainers will
defer to the x86 ones there. Alternatively Andrew or Roger could
of course ack this ...

Jan
