On 07/04/2022 07:26, Jan Beulich wrote:
> On 07.04.2022 03:01, Andrew Cooper wrote:
>> c/s 1a914256dca5 increased the AMD max leaf from 0x8000001c to 0x80000021, 
>> but
>> did not adjust anything in the calculate_*_policy() chain.  As a result, on
>> hardware supporting these leaves, we read the real hardware values into the
>> raw policy, then copy into host, and all the way into the PV/HVM default
>> policies.
>>
>> All 4 of these leaves have enable bits (first two by TopoExt, next by SEV,
>> next by PQOS), so any software following the rules is fine and will leave 
>> them
>> alone.  However, leaf 0x8000001d takes a subleaf input and at least two
>> userspace utilities have been observed to loop indefinitely under Xen 
>> (clearly
>> waiting for eax to report "no more cache levels").
>>
>> Such userspace is buggy, but Xen's behaviour isn't great either.
>>
>> In the short term, clobber all information in these leaves.  This is a giant
>> bodge, but there are complexities with implementing all of these leaves
>> properly.
>>
>> Fixes: 1a914256dca5 ("x86/cpuid: support LFENCE always serialising CPUID 
>> bit")
>> Link: https://github.com/QubesOS/qubes-issues/issues/7392
>> Reported-by: fosslinux <fosslinux@aussies.space>
>> Reported-by: Marek Marczykowski-Górecki <marma...@invisiblethingslab.com>
>> Signed-off-by: Andrew Cooper <andrew.coop...@citrix.com>
> Reviewed-by: Jan Beulich <jbeul...@suse.com>

Thanks.

>
>> It turns out that Intel leaf 4 and AMD leaf 0x8000001d are *almost* 
>> identical.
>> They differ by the "complex" bit in edx, and the $X-per-cache fields in the
>> top of eax (Intel is threads-per-cache, AMD is cores-per-cache and lacks the
>> cores-per-package field).
>>
>> As neither vendor implements the other's version, I'm incredibly tempted to
>> reuse p->cache for both, rather than doubling the storage space.  Reading the
>> data out is easy to key on p->extd.topoext.  Writing the data can be done
>> without any further complexity if we simply trust the sending side to have 
>> its
>> indices the proper way around.  Particularly, this avoids needing to ensure
>> that p->extd.topoext is out of order and at the head of the stream.  
>> Thoughts?
> Sounds quite reasonable to me. I guess the main risk is for new things
> to appear on either vendor's side in a way breaking the overlaying
> approach. But I guess that's not a significant risk.

Neither vendor is going to change it in ways incompatible with how they
currently expose it, and it's data that Xen doesn't particularly care
about - we never interpret it on behalf of the guest.

When we fix the toolstack side of things to calculate topology properly,
the $foo-per-cache fields will need adjusting, but that logic will be
fine to switch ( vendor ) on.  Since writing this, I've found AMD's
cores-per-package count; it's in the adjacent leaf, with a wider field.
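As a sketch of what that switch ( vendor ) would look like: both vendors put
a sharing count, minus one, in EAX[25:14] of the cache-properties leaf, and
per the above they just disagree on what unit is being counted.  Names and
the threads_per_core scaling below are illustrative, not Xen's real code.

```c
#include <assert.h>
#include <stdint.h>

enum vendor { VENDOR_INTEL, VENDOR_AMD };

/* EAX[25:14] of leaf 4 / leaf 0x8000001d: units sharing this cache, minus 1. */
static unsigned int cache_sharing_count(uint32_t eax)
{
    return ((eax >> 14) & 0xfff) + 1;
}

/* Hypothetical toolstack helper: normalise to threads-per-cache,
 * treating Intel's field as threads and AMD's as cores, as above. */
static unsigned int threads_sharing_cache(enum vendor v, uint32_t eax,
                                          unsigned int threads_per_core)
{
    switch ( v )
    {
    case VENDOR_INTEL: /* Field already counts threads. */
        return cache_sharing_count(eax);
    case VENDOR_AMD:   /* Field counts cores; scale by SMT width. */
        return cache_sharing_count(eax) * threads_per_core;
    }
    return 0;
}
```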

> As to ordering dependencies: Are there any in reality? Neither vendor
> implements the other vendor's leaf, so there's only going to be one in
> the stream anyway, and which one it is can be disambiguated by having
> seen leaf 0 alone.

The complexity is what (if anything) we do in
x86_cpuid_copy_from_buffer().  I've done some prototyping, and the
easiest option is to accept both 4 and e1Dd, in a latest-takes-precedence
manner, and not to create any interlinkage with the topoext bit.
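Roughly, the deserialise side would look like the sketch below: both leaves
write into the same shared storage, so whichever appears later in the buffer
wins, and no ordering against topoext is needed.  Types and names here are
illustrative, not the real x86_cpuid_copy_from_buffer() internals.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

struct leaf { uint32_t a, b, c, d; };

#define NR_CACHE_SUBLEAVES 4

/* Illustrative policy with one cache[] array shared by 4 and e1Dd. */
struct policy {
    struct leaf cache[NR_CACHE_SUBLEAVES];
};

static int copy_cache_subleaf(struct policy *p, uint32_t leaf,
                              uint32_t subleaf, const struct leaf *val)
{
    if ( leaf != 4 && leaf != 0x8000001d )
        return -1;
    if ( subleaf >= NR_CACHE_SUBLEAVES )
        return -1;

    /* Latest entry in the buffer simply overwrites; trust the sender
     * to have its indices the right way around. */
    p->cache[subleaf] = *val;
    return 0;
}
```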

I've also got a pile of fixes to the unit tests so we hopefully can't
make mistakes like this again, although that will depend on getting
test-cpuid-policy running in OSSTest, which is a todo-list item that
really needs to get done.

~Andrew
