On Fri, Jul 02, 2021 at 01:14:56PM +0800, zhenwei pi wrote: > On 7/2/21 4:35 AM, Michael Roth wrote: > > Quoting Igor Mammedov (2021-07-01 03:43:13) > > > On Wed, 30 Jun 2021 14:18:09 -0500 > > > Michael Roth <michael.r...@amd.com> wrote: > > > > > > > Quoting Dr. David Alan Gilbert (2021-06-29 09:06:02) > > > > > * zhenwei pi (pizhen...@bytedance.com) wrote: > > > > > > A AMD server typically has cpuid level 0x10(test on Rome/Milan), it > > > > > > should not be changed to 0x1f in multi-dies case. > > > > > > > > > > > > Fixes: a94e1428991 (target/i386: Add CPUID.1F generation support > > > > > > for multi-dies PCMachine) > > > > > > Signed-off-by: zhenwei pi <pizhen...@bytedance.com> > > > > > > > > > > (Copying in Babu) > > > > > > > > > > Hmm I think you're right. I've cc'd in Babu and Wei. > > > > > > > > > > Eduardo: What do we need to do about compatibility, do we need to wire > > > > > this to machine type or CPU version? > > > > > > > > FWIW, there are some other CPUID entries like leaves 2 and 4 that are > > > > also Intel-specific. With SEV-SNP CPUID enforcement, advertising them to > > > > guests will result in failures when host SNP firmware checks the > > > > hypervisor-provided CPUID values against the host-supported ones. > > > > > > > > To address this we've been planning to add an 'amd-cpuid-only' property > > > > to suppress them: > > > > > > > > > > > > https://github.com/mdroth/qemu/commit/28d0553fe748d30a8af09e5e58a7da3eff03e21b > > > > > > > > My thinking is this property should be off by default, and only defined > > > > either via explicit command-line option, or via new CPU types. We're > > > > also > > > > planning to add new CPU versions for EPYC* CPU types that set this > > > > 'amd-cpuid-only' property by default: > > > > > > > > https://github.com/mdroth/qemu/commits/new-cpu-types-upstream > > > It look like having new cpu versions is enough to change behavior, > > > maybe keep 'amd-cpuid-only' as internal field and not expose it to users > > > as a property. > > > > Hmm, I defined it as a property mainly to make use of > > X86CPUVersionDefinition.props to create new versions of the CPU types > > with those properties set. > > > > There's a patch there that adds X86CPUVersionDefinition.cache_info so > > that new cache definitions can be provided for new CPU versions. So > > would you suggest a similar approach here, e.g. adding an > > X86CPUVersionDefinition.amd_cpuid_only field that could be used directly > > rather than going through X86CPUVersionDefinition.props? > > > > There's also another new "amd-xsave" prop in that series that does something > > similar to "amd-cpuid-only", so a little worried about tacking to much extra > > into X86CPUVersionDefinition. But maybe that one could just be rolled into > > "amd-cpuid-only" since it is basically fixing up xsave-related cpuid > > entries for AMD... > > > Hi, this patch wants to fix the issue: > AMD CPU (Rome/Milan) should get the cpuid level 0x10, not 0x1F in the guest. > If QEMU reports a 0x1F to guest, guest(Linux) would use leaf 0x1F instead of > leaf 0xB to get extended topology: > > https://github.com/torvalds/linux/blob/master/arch/x86/kernel/cpu/topology.c#L49 > > static int detect_extended_topology_leaf(struct cpuinfo_x86 *c) > { > if (c->cpuid_level >= 0x1f) { > if (check_extended_topology_leaf(0x1f) == 0) > return 0x1f; > } > > if (c->cpuid_level >= 0xb) { > if (check_extended_topology_leaf(0xb) == 0) > return 0xb; > } > > return -1; > } > > Because of the wrong cpuid level, the guest gets unexpected topology from > leaf 0x1F. > > I tested https://github.com/mdroth/qemu/commits/new-cpu-types-upstream, and > it seems that these patches could not fix this issue.
Yes, I think your patch would still be needed. The question is whether it's okay to change it for existing CPU types, e.g. EPYC-Milan, or only for new ones when they set a certain flag/property, like the proposed "amd-cpuid-only" (which the proposed EPYC-Milan-v2 would set). > > > > > > > > So in general I think maybe this change should be similarly controlled > > > > by > > > > this proposed 'amd-cpuid-only' property. Maybe for this particular case > > > > it's > > > > okay to do it unconditionally, but it sounds bad to switch up the valid > > > > CPUID > > > > range after a guest has already booted (which might happen with old->new > > > > migration for instance), since it might continue treating values in the > > > > range > > > > as valid afterward (but again, not sure that's the case here or not). > > > > > > > > There's some other changes with the new CPU types that we're still > > > > considering/testing internally, but should be able to post them in some > > > > form > > > > next week. > > > > > > > > -Mike > > > > > > > > > > > > > > Dave > > > > > > --- > > > > > > target/i386/cpu.c | 8 ++++++-- > > > > > > 1 file changed, 6 insertions(+), 2 deletions(-) > > > > > > > > > > > > diff --git a/target/i386/cpu.c b/target/i386/cpu.c > > > > > > index a9fe1662d3..3934c559e4 100644 > > > > > > --- a/target/i386/cpu.c > > > > > > +++ b/target/i386/cpu.c > > > > > > @@ -5961,8 +5961,12 @@ void x86_cpu_expand_features(X86CPU *cpu, > > > > > > Error **errp) > > > > > > } > > > > > > } > > > > > > - /* CPU topology with multi-dies support requires > > > > > > CPUID[0x1F] */ > > > > > > - if (env->nr_dies > 1) { > > > > > > + /* > > > > > > + * Intel CPU topology with multi-dies support requires > > > > > > CPUID[0x1F]. > > > > > > + * For AMD Rome/Milan, cpuid level is 0x10, and guest OS > > > > > > should detect > > > > > > + * extended toplogy by leaf 0xB. Only adjust it for Intel > > > > > > CPU. > > > > > > + */ > > > > > > + if ((env->nr_dies > 1) && IS_INTEL_CPU(env)) { > > > > > > x86_cpu_adjust_level(cpu, &env->cpuid_min_level, > > > > > > 0x1F); > > > > > > } > > > > > > -- > > > > > > 2.25.1 > > > > > > > > > > > -- > > > > > Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK > > > > > > > > > > > > > > > > > -- > zhenwei pi >