Hi, Xiaoyao
On 10/12/24 3:13 PM, Xiaoyao Li wrote:
On 10/9/2024 11:56 AM, Chuang Xu wrote:
When QEMU is started with:
-cpu host,migratable=on,host-cache-info=on,l3-cache=off
-smp 180,sockets=2,dies=1,cores=45,threads=2
On Intel platform:
CPUID.01H.EBX[23:16] is defined as "max number of addressable IDs for
logical processors in the physical package".
When executing "cpuid -1 -l 1 -r" in the guest, we obtain a value of 90 for
CPUID.01H.EBX[23:16], whereas the expected value is 128. Additionally,
executing "cpuid -1 -l 4 -r" in the guest yields a value of 63 for
CPUID.04H.EAX[31:26], which matches the expected result.
As (1+CPUID.04H.EAX[31:26]) rounds up to the nearest power-of-2 integer,
we'd better round up CPUID.01H.EBX[23:16] to the nearest power-of-2
integer too. Otherwise we may encounter unexpected results in the guest.
For example, when QEMU is started with CLI above and xtopology is disabled,
guest kernel 5.15.120 uses CPUID.01H.EBX[23:16]/(1+CPUID.04H.EAX[31:26]) to
calculate threads-per-core in detect_ht(). Then guest will get "90/(1+63)=1"
as the result, even though threads-per-core should actually be 2.
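To make the arithmetic above concrete, here is a minimal sketch of the calculation as the commit message describes it. It is not the kernel's detect_ht() code; the helper names are made up for illustration, and only the field positions (EBX[23:16] and EAX[31:26]) come from the text above.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not kernel code): threads-per-core derived as
 * CPUID.01H.EBX[23:16] / (1 + CPUID.04H.EAX[31:26]).
 */
uint32_t legacy_threads_per_core(uint32_t cpuid1_ebx, uint32_t cpuid4_eax)
{
    uint32_t logical_ids = (cpuid1_ebx >> 16) & 0xff;      /* EBX[23:16] */
    uint32_t max_cores = ((cpuid4_eax >> 26) & 0x3f) + 1;  /* 1 + EAX[31:26] */

    return logical_ids / max_cores;  /* integer division truncates */
}

/* Replays the two cases from the commit message. */
void legacy_threads_per_core_demo(void)
{
    uint32_t eax4 = 63u << 26;  /* CPUID.04H.EAX[31:26] = 63 */

    /* Value currently seen in the guest: 90 / 64 truncates to 1. */
    assert(legacy_threads_per_core(90u << 16, eax4) == 1);
    /* Expected value: 128 / 64 gives 2. */
    assert(legacy_threads_per_core(128u << 16, eax4) == 2);
}
```

The point is only that integer division by 64 needs at least 128 in the numerator to report two threads per core.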
And on AMD platform:
CPUID.01H.EBX[23:16] is defined as "Logical processor count". Current
result meets our expectation.
So for the AMD platform, what is the result in the same situation with
xtopology disabled? Does AMD use another algorithm, other than
CPUID.01H.EBX[23:16]/(1+CPUID.04H.EAX[31:26]), to calculate it?
For the AMD platform, CPUID.04H is reserved, so it uses
CPUID.8000001E.EBX[15:8] (field ThreadsPerComputeUnit) to obtain the result.
So let us round up CPUID.01H.EBX[23:16] to the nearest power-of-2 integer
only for Intel platform to solve the unexpected result.
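As a rough illustration of where 128 comes from for the CLI above: the patch derives the value from the package offset in the APIC ID, which is the sum of the bit widths needed for each lower topology level. The sketch below is standalone and only mimics that idea; the function names are hypothetical and are not QEMU's helpers.

```c
#include <assert.h>

/* Number of bits needed to address `count` distinct IDs (0 for count <= 1). */
unsigned bitwidth_for_count(unsigned count)
{
    unsigned width = 0;

    assert(count <= (1u << 16));  /* keep the shift below well defined */
    while ((1u << width) < count) {
        width++;
    }
    return width;
}

/* Assumed layout: thread bits, then core bits, then die bits, then package. */
unsigned sketch_pkg_offset(unsigned threads, unsigned cores, unsigned dies)
{
    return bitwidth_for_count(threads) +
           bitwidth_for_count(cores) +
           bitwidth_for_count(dies);
}

/* Addressable logical processor IDs per package: a power of two. */
unsigned sketch_addressable_ids(unsigned threads, unsigned cores, unsigned dies)
{
    return 1u << sketch_pkg_offset(threads, cores, dies);
}
```

With threads=2, cores=45, dies=1 this gives 1 + 6 + 0 = 7 bits, so 1 << 7 = 128, while the plain logical processor count is 2 * 45 = 90.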
Reviewed-by: Zhao Liu <zhao1....@intel.com>
Acked-by: Igor Mammedov <imamm...@redhat.com>
Signed-off-by: Guixiong Wei <weiguixi...@bytedance.com>
Signed-off-by: Yipeng Yin <yinyip...@bytedance.com>
Signed-off-by: Chuang Xu <xuchuangxc...@bytedance.com>
---
target/i386/cpu.c | 10 +++++++++-
1 file changed, 9 insertions(+), 1 deletion(-)
diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index ff227a8c5c..641d4577b0 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -6462,7 +6462,15 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
}
*edx = env->features[FEAT_1_EDX];
if (threads_per_pkg > 1) {
- *ebx |= threads_per_pkg << 16;
+ /*
+ * AMD requires logical processor count, but Intel needs maximum
+ * number of addressable IDs for logical processors per package.
+ */
+ if (cpu->vendor_cpuid_only && IS_AMD_CPU(env)) {
+ *ebx |= threads_per_pkg << 16;
+ } else {
+ *ebx |= 1 << apicid_pkg_offset(&topo_info) << 16;
+ }
You need to handle the overflow case when the number of logical
processors > 255.
It seems the other CPUID cases of bit shifting don't consider the overflow
case either.
Since Intel only reserves 8 bits for this field, do you have any
suggestions to make sure this field is emulated correctly?
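To illustrate the concern: a value above 255 shifted left by 16 spills into EBX[31:24]. One possible direction, shown purely as a sketch, is to saturate the value at 255 before shifting. Whether saturation is the right behaviour here is an assumption, and the helper name below is hypothetical, not part of the patch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: encode a count into EBX[23:16], saturating at 255
 * (assumed policy) so that nothing leaks into EBX[31:24].
 */
uint32_t sketch_encode_ebx_23_16(uint32_t num_ids)
{
    uint32_t field = num_ids > 255 ? 255 : num_ids;

    return field << 16;
}

void sketch_encode_ebx_23_16_demo(void)
{
    /* 128 fits in 8 bits and is encoded unchanged. */
    assert(sketch_encode_ebx_23_16(128) == 0x00800000u);
    /* 512 would overflow the field; it is clamped to 255 instead. */
    assert(sketch_encode_ebx_23_16(512) == 0x00ff0000u);
    assert((sketch_encode_ebx_23_16(512) & 0xff000000u) == 0);
}
```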
*edx |= CPUID_HT;
}
if (!cpu->enable_pmu) {