Hi, Xiaoyao

On 10/12/24 下午3:13, Xiaoyao Li wrote:
On 10/9/2024 11:56 AM, Chuang Xu wrote:
When QEMU is started with:
-cpu host,migratable=on,host-cache-info=on,l3-cache=off
-smp 180,sockets=2,dies=1,cores=45,threads=2

On Intel platform:
CPUID.01H.EBX[23:16] is defined as "max number of addressable IDs for
logical processors in the physical package".

When executing "cpuid -1 -l 1 -r" in the guest, we obtain a value of 90 for
CPUID.01H.EBX[23:16], whereas the expected value is 128. Additionally,
executing "cpuid -1 -l 4 -r" in the guest yields a value of 63 for
CPUID.04H.EAX[31:26], which matches the expected result.

As (1+CPUID.04H.EAX[31:26]) rounds up to the nearest power-of-2 integer,
we'd beter round up CPUID.01H.EBX[23:16] to the nearest power-of-2
integer too. Otherwise we may encounter unexpected results in guest.

For example, when QEMU is started with CLI above and xtopology is disabled, guest kernel 5.15.120 uses CPUID.01H.EBX[23:16]/(1+CPUID.04H.EAX[31:26]) to calculate threads-per-core in detect_ht(). Then guest will get "90/(1+63)=1"
as the result, even though threads-per-core should actually be 2.

And on AMD platform:
CPUID.01H.EBX[23:16] is defined as "Logical processor count". Current
result meets our expectation.

So for AMD platform, what's result for the same situation with xtopology disabled? Does AMD uses another algorithm to calculate other than CPUID.01H.EBX[23:16]/(1+CPUID.04H.EAX[31:26]) ?

For amd platform, CPUID.04H is reserved, so it uses CPUID.8000001E.EAX[15:8] (fied ThreadsPerComputeUnit) to obtain the result.
So let us round up CPUID.01H.EBX[23:16] to the nearest power-of-2 integer
only for Intel platform to solve the unexpected result.

Reviewed-by: Zhao Liu <zhao1....@intel.com>
Acked-by: Igor Mammedov <imamm...@redhat.com>
Signed-off-by: Guixiong Wei <weiguixi...@bytedance.com>
Signed-off-by: Yipeng Yin <yinyip...@bytedance.com>
Signed-off-by: Chuang Xu <xuchuangxc...@bytedance.com>
---
  target/i386/cpu.c | 10 +++++++++-
  1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index ff227a8c5c..641d4577b0 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -6462,7 +6462,15 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
          }
          *edx = env->features[FEAT_1_EDX];
          if (threads_per_pkg > 1) {
-            *ebx |= threads_per_pkg << 16;
+            /*
+             * AMD requires logical processor count, but Intel needs maximum +             * number of addressable IDs for logical processors per package.
+             */
+            if (cpu->vendor_cpuid_only && IS_AMD_CPU(env)) {
+                *ebx |= threads_per_pkg << 16;
+            } else {
+                *ebx |= 1 << apicid_pkg_offset(&topo_info) << 16;
+            }

you need to handle the overflow case when the number of logical processors > 255.

It seems other cpuid cases of bit shifting don't condiser the overflow case too..

Since intel only reserves 8bits for this field, do you have any suggestions to make sure this field emulated

correctly?

              *edx |= CPUID_HT;
          }
          if (!cpu->enable_pmu) {


Reply via email to