On 28/10/2019 09:21, Jan Beulich wrote:
> On 25.10.2019 19:01, Andrew Cooper wrote:
>> On 24/10/2019 12:57, Steven Haigh wrote:
>>> Hi all,
>>>
>>> I've managed to get the git master version of Xen on this affected
>>> system and tried to boot a Windows Server 2016 system. It crashes as
>>> per normal.
>>>
>>> I managed to get these logs, but I'm not quite sure what else to do to
>>> debug this issue further.
>> After a collaborative debugging session on IRC, we've identified the
>> problem.  Here is a summary.
>>
>> https://www.reddit.com/r/Amd/comments/ckr5f4/amd_ryzen_3000_series_linux_support_and/
>> is concerning KVM, but it identified that the TOPOEXT feature was
>> important to getting windows to boot.
>>
>> Xen doesn't currently offer TOPOEXT to guests at all.  Fixing this is on
>> the TODO list along with the rest of the topology representation swamp.
>>
>> On a hunch, I offered up a XenServer patch which we are still using, in
>> lieu of fixing topology properly.  It is logically a revert of
>> ca2eee92df44 as that change wasn't migration safe.
> Would you mind helping me understand how this revert matches up with
> you saying above that TOPOEXT is needed for Windows to boot here? I
> don't think I can conclude anything in this direction from the article
> you've provided the link of.

TOPOEXT gave a clear hint that the problem was topology related, but beyond
that, it's not specifically connected.

The revert clears HTT which is a key factor in AMD's algorithm of "how
to calculate topology".
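
For reference, a minimal sketch of why HTT matters to a topology
calculation. HTT is CPUID leaf 1, EDX bit 28, and the logical processor
count in EBX[23:16] is only meaningful when HTT is set. The helper names
below are made up for illustration and are not taken from Xen or from any
guest OS; the functions take raw register values rather than executing
CPUID.

```c
#include <stdbool.h>
#include <stdint.h>

/* CPUID leaf 1, EDX bit 28: HTT. */
#define CPUID1_EDX_HTT (1u << 28)

/* Hypothetical helper: is HTT advertised in the leaf 1 EDX value? */
static bool leaf1_has_htt(uint32_t edx)
{
    return (edx & CPUID1_EDX_HTT) != 0;
}

/*
 * Hypothetical helper: logical processors per package as leaf 1 reports it.
 * EBX[23:16] is only valid when HTT is set; with HTT clear, the package is
 * treated as having a single logical processor.
 */
static unsigned int leaf1_logical_count(uint32_t ebx, uint32_t edx)
{
    if (!leaf1_has_htt(edx))
        return 1;

    return (ebx >> 16) & 0xff;
}
```

With HTT clear, the count field is ignored entirely, so a guest following
this kind of rule takes a different path from one that sees HTT set.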

>
>> With this patch in place, windows works fine.  However, I don't think
>> the patch is appropriate to take into 4.13.
>>
>> Furthermore, there is no chance of getting the topology work sorted in
>> the remaining 4.13 timeframe.
>>
>> I'm at a loss for ideas, other than release note it as broken and make
>> fixing it a blocker for 4.14.
> Would making conditional the currently unconditional setting of HT in
> the guest CPUID view together with the doubling of certain other fields'
> values perhaps similarly help?

Making that entire block be conditional is probably ok, but I can't
think of a way of doing this safely.  We definitely don't want to
put something like this into the libxl api, seeing as it is expected to
be obsoleted in the near future.

One option which was experimented with was clearing HTT using the cpuid=
control, but that didn't work.  I think a user HTT setting gets
clobbered by the later CPUID logic.  Perhaps that is something we could
bodge in a less bad way.
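
To illustrate the suspected ordering problem, here is a rough sketch.  It
is not the actual Xen or libxc code: the struct and every function name are
hypothetical, and the "late fixup" is only a stand-in for the unconditional
HTT setting and count doubling discussed above.

```c
#include <stdint.h>

#define HTT_BIT (1u << 28)   /* CPUID leaf 1, EDX bit 28 */

/* Hypothetical container for the guest's view of CPUID leaf 1. */
struct leaf1 {
    uint32_t ebx;
    uint32_t edx;
};

/* Hypothetical: what a user request to clear HTT would do. */
static void user_clear_htt(struct leaf1 *l)
{
    l->edx &= ~HTT_BIT;
}

/* Hypothetical later logic: set HTT and double EBX[23:16] unconditionally. */
static void late_fixup(struct leaf1 *l)
{
    uint32_t count = (l->ebx >> 16) & 0xff;

    l->edx |= HTT_BIT;
    l->ebx = (l->ebx & ~0x00ff0000u) | (((count * 2) & 0xff) << 16);
}

/* Hypothetical conditional variant: skip the fixup if the user cleared HTT. */
static void late_fixup_conditional(struct leaf1 *l, int user_cleared_htt)
{
    if (!user_cleared_htt)
        late_fixup(l);
}
```

If the unconditional fixup runs after the user's setting is applied, HTT
ends up set again regardless of what was asked for, which would match the
observed behaviour.  The conditional variant leaves the leaf alone in that
case.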

~Andrew
