On 23.08.2024 17:08, Andrew Cooper wrote:
> On 14/08/2024 9:51 am, Jan Beulich wrote:
>> In preparation of introducing a const struct cpu_policy * local in
>> x86_emulate(), rename that global variable to something more suitable:
>> "cp" is our commonly used name for function parameters or local
>> variables of type struct cpu_policy *, and the present name of the
>> global could hence have interfered already.
>>
>> Signed-off-by: Jan Beulich <jbeul...@suse.com>
>> ---
>> v2: Re-base over dropping of Xeon Phi support.
> 
> Bah - still need to reinstate part of that patch.  The model numbers
> need to stay.  /me adds it to the todo list.
> 
> For this patch, Acked-by: Andrew Cooper <andrew.coop...@citrix.com>,
> with one request for this patch a couple of thoughts for separate patches.

Thanks; see below for a clarifying question as to which one is which.

>> --- a/tools/tests/x86_emulator/x86-emulate.h
>> +++ b/tools/tests/x86_emulator/x86-emulate.h
>> @@ -123,7 +123,7 @@ static inline uint64_t xgetbv(uint32_t x
>>  }
>>  
>>  /* Intentionally checking OSXSAVE here. */
>> -#define cpu_has_xsave     (cp.basic.raw[1].c & (1u << 27))
>> +#define cpu_has_xsave     (cpu_policy.basic.raw[1].c & (1u << 27))
> 
> Right now we deliberately don't emit names for APIC, OSXSAVE and OSPKE,
> because the values won't be correct in a guest's policy.  But this is a
> legitimate corner case.
> 
> What about this?
> 
> diff --git a/xen/tools/gen-cpuid.py b/xen/tools/gen-cpuid.py
> index 601eec608983..1b56958f07e7 100755
> --- a/xen/tools/gen-cpuid.py
> +++ b/xen/tools/gen-cpuid.py
> @@ -382,10 +382,11 @@ def crunch_numbers(state):
>              if name and name[0] in "0123456789":
>                  name = "_" + name
>  
> -            # Don't generate names for features fast-forwarded from other
> -            # state
> +            # For dynamic features (fast forwarded from other state), also
> +            # prefix with an underscore to highlight that extra care
> needs to
> +            # be taken.
>              if name in ("APIC", "OSXSAVE", "OSPKE"):
> -                name = ""
> +                name = "_" + name
>  
>              names.append(name.lower())
>  
> which would make the expression become cpu_policy.basic._osxsave and
> avoid the opencoded number.

I'm not overly happy with that, not the least because of the pre-existing
case where an underscore is being prepended. If at all, I'd like these to
stand out even more. And I would have less reservations if we could limit
this to the header(s) that are generated for the tool stack, keeping the
hypervisor "immune" to mistakes in this area.

If it's just the open-coded number, I could probably introduce
cpufeat_mask() locally here. Not sure in how far the open-coding of the
field is also of concern to you.

>> @@ -133,63 +133,63 @@ static inline bool xcr0_mask(uint64_t ma
>>  unsigned int rdpkru(void);
>>  void wrpkru(unsigned int val);
>>  
>> -#define cache_line_size() (cp.basic.clflush_size * 8)
>> -#define cpu_has_fpu        cp.basic.fpu
>> -#define cpu_has_mmx        cp.basic.mmx
>> -#define cpu_has_fxsr       cp.basic.fxsr
>> -#define cpu_has_sse        cp.basic.sse
>> -#define cpu_has_sse2       cp.basic.sse2
>> -#define cpu_has_sse3       cp.basic.sse3
>> -#define cpu_has_pclmulqdq  cp.basic.pclmulqdq
>> -#define cpu_has_ssse3      cp.basic.ssse3
>> -#define cpu_has_fma       (cp.basic.fma && xcr0_mask(6))
>> -#define cpu_has_sse4_1     cp.basic.sse4_1
>> -#define cpu_has_sse4_2     cp.basic.sse4_2
>> -#define cpu_has_popcnt     cp.basic.popcnt
>> -#define cpu_has_aesni      cp.basic.aesni
>> -#define cpu_has_avx       (cp.basic.avx  && xcr0_mask(6))
>> -#define cpu_has_f16c      (cp.basic.f16c && xcr0_mask(6))
>> -
>> -#define cpu_has_avx2      (cp.feat.avx2 && xcr0_mask(6))
>> -#define cpu_has_bmi1       cp.feat.bmi1
>> -#define cpu_has_bmi2       cp.feat.bmi2
>> -#define cpu_has_avx512f   (cp.feat.avx512f  && xcr0_mask(0xe6))
>> -#define cpu_has_avx512dq  (cp.feat.avx512dq && xcr0_mask(0xe6))
>> -#define cpu_has_avx512_ifma (cp.feat.avx512_ifma && xcr0_mask(0xe6))
>> -#define cpu_has_avx512cd  (cp.feat.avx512cd && xcr0_mask(0xe6))
>> -#define cpu_has_sha        cp.feat.sha
>> -#define cpu_has_avx512bw  (cp.feat.avx512bw && xcr0_mask(0xe6))
>> -#define cpu_has_avx512vl  (cp.feat.avx512vl && xcr0_mask(0xe6))
>> -#define cpu_has_avx512_vbmi (cp.feat.avx512_vbmi && xcr0_mask(0xe6))
>> -#define cpu_has_avx512_vbmi2 (cp.feat.avx512_vbmi2 && xcr0_mask(0xe6))
>> -#define cpu_has_gfni       cp.feat.gfni
>> -#define cpu_has_vaes      (cp.feat.vaes && xcr0_mask(6))
>> -#define cpu_has_vpclmulqdq (cp.feat.vpclmulqdq && xcr0_mask(6))
>> -#define cpu_has_avx512_vnni (cp.feat.avx512_vnni && xcr0_mask(0xe6))
>> -#define cpu_has_avx512_bitalg (cp.feat.avx512_bitalg && xcr0_mask(0xe6))
>> -#define cpu_has_avx512_vpopcntdq (cp.feat.avx512_vpopcntdq && 
>> xcr0_mask(0xe6))
>> -#define cpu_has_movdiri    cp.feat.movdiri
>> -#define cpu_has_movdir64b  cp.feat.movdir64b
>> -#define cpu_has_avx512_vp2intersect (cp.feat.avx512_vp2intersect && 
>> xcr0_mask(0xe6))
>> -#define cpu_has_serialize  cp.feat.serialize
>> -#define cpu_has_avx512_fp16 (cp.feat.avx512_fp16 && xcr0_mask(0xe6))
>> -#define cpu_has_sha512     (cp.feat.sha512 && xcr0_mask(6))
>> -#define cpu_has_sm3        (cp.feat.sm3 && xcr0_mask(6))
>> -#define cpu_has_sm4        (cp.feat.sm4 && xcr0_mask(6))
>> -#define cpu_has_avx_vnni   (cp.feat.avx_vnni && xcr0_mask(6))
>> -#define cpu_has_avx512_bf16 (cp.feat.avx512_bf16 && xcr0_mask(0xe6))
>> -#define cpu_has_avx_ifma   (cp.feat.avx_ifma && xcr0_mask(6))
>> -#define cpu_has_avx_vnni_int8 (cp.feat.avx_vnni_int8 && xcr0_mask(6))
>> -#define cpu_has_avx_ne_convert (cp.feat.avx_ne_convert && xcr0_mask(6))
>> -#define cpu_has_avx_vnni_int16 (cp.feat.avx_vnni_int16 && xcr0_mask(6))
> 
> Can we take the opportunity to realign these vertically?

I've done this, and I assume that this was the "for this patch" request. I'm
not overly happy with the result, because of the now definitely necessary
line wrapping, but it's perhaps still an improvement. (It's likely going to
bite me when re-basing this ahead of other patches I have it sit behind of
right now.)

> Also, we have X86_XCR0_* constants in a helpful place now.  Could we do
> something like:
> 
> #define XCR0_AVX xcr0_mask(X86_XCR0_YMM | X86_XCR0_SSE)
> #define XCR0_AVX512 ...
> 
> So that these expressions now read
> 
> #define cpu_has_$X (... && XCR0_AVX)
> 
> rather than forcing the reader to know %xcr0 by raw hex?

I can do that, yet I probably wouldn't want to fold the xcr0_mask()
invocations into the macro expansions. Would that be okay with you?

Jan

Reply via email to