On Wed, 9 Aug 2023 21:20:48 +0800 "Wen, Qian" <qian....@intel.com> wrote:
> On 8/9/2023 7:14 PM, Igor Mammedov wrote:
> > On Wed, 9 Aug 2023 18:27:32 +0800
> > Qian Wen <qian....@intel.com> wrote:
> >
> >> The legacy topology enumerated by CPUID.1.EBX[23:16] is defined in SDM
> >> Vol2:
> >>
> >> Bits 23-16: Maximum number of addressable IDs for logical processors in
> >> this physical package.
> >>
> >> When launching the VM with -smp 256, the value written to EBX[23:16] is
> >> 0 because of data overflow. If the guest only supports legacy topology,
> >> without V2 Extended Topology enumerated by CPUID.0x1f or Extended
> >> Topology enumerated by CPUID.0x0b to support over 255 CPUs, the return
> >> of the kernel invoking cpu_smt_allowed() is false and AP's bring-up will
> >> fail. Then only CPU 0 is online, and others are offline.
> >>
> >> To avoid this issue caused by overflow, limit the max value written to
> >> EBX[23:16] to 255.
> > what happens on real hw or in lack of thereof what SDM says about this
> > value when there is more than 255 threads?.
> >
>
> Current SDM doesn't specify what the value should be when APIC IDs per
> package exceeds 255. So we asked the internal HW architect, the response is
> that EBX[23:16] will report 255 instead of being truncated to a smaller value.

then mention it in commit log so one wouldn't wonder where the value came from.

>
> Thanks,
> Qian
>
> >> Signed-off-by: Qian Wen <qian....@intel.com>
> >> ---
> >> Changes v1 -> v2:
> >> - Revise the commit message and comment to more clearer.
> >> - Rebased to v8.1.0-rc2.
> >> ---
> >> target/i386/cpu.c | 16 ++++++++++++++--
> >> 1 file changed, 14 insertions(+), 2 deletions(-)
> >>
> >> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> >> index 97ad229d8b..6e1d88fbd7 100644
> >> --- a/target/i386/cpu.c
> >> +++ b/target/i386/cpu.c
> >> @@ -6008,6 +6008,7 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
> >>      uint32_t die_offset;
> >>      uint32_t limit;
> >>      uint32_t signature[3];
> >> +    uint32_t threads_per_socket;
> >>      X86CPUTopoInfo topo_info;
> >>
> >>      topo_info.dies_per_pkg = env->nr_dies;
> >> @@ -6049,8 +6050,19 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
> >>              *ecx |= CPUID_EXT_OSXSAVE;
> >>          }
> >>          *edx = env->features[FEAT_1_EDX];
> >> -        if (cs->nr_cores * cs->nr_threads > 1) {
> >> -            *ebx |= (cs->nr_cores * cs->nr_threads) << 16;
> >> +        /*
> >> +         * Only bits [23:16] represent the maximum number of addressable
> >> +         * IDs for logical processors in this physical package.
> >> +         * When thread_per_socket > 255, it will 1) overwrite bits[31:24]
> >> +         * which is apic_id, 2) bits [23:16] get truncated.
> >> +         */
> >> +        threads_per_socket = cs->nr_cores * cs->nr_threads;
> >> +        if (threads_per_socket > 255) {
> >> +            threads_per_socket = 255;
> >> +        }
> >> +
> >> +        if (threads_per_socket > 1) {
> >> +            *ebx |= threads_per_socket << 16;
                         ^^^^^^^^^^^^^^^^^^^^^^^^^
it would be more robust to mask out non-relevant fields at the rhs.

also perhaps double check if we could induce a similar overflow by
tweaking other -smp properties (todo for another patch[es] if there
are such places).

> >>              *edx |= CPUID_HT;
> >>          }
> >>          if (!cpu->enable_pmu) {
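
To illustrate the masking suggestion, here is a minimal standalone sketch of clamping the value and masking the rhs. The helper and macro names (set_logical_ids, EBX_LOGICAL_IDS_SHIFT, EBX_LOGICAL_IDS_MASK) are hypothetical and do not exist in target/i386/cpu.c; only the field position [23:16] and the 255 cap come from the discussion above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names for illustration; not taken from target/i386/cpu.c. */
#define EBX_LOGICAL_IDS_SHIFT 16
#define EBX_LOGICAL_IDS_MASK  (0xffu << EBX_LOGICAL_IDS_SHIFT)

/*
 * Return ebx with CPUID.1.EBX[23:16] set from threads_per_socket,
 * leaving every other bit (e.g. the APIC ID in [31:24]) unchanged.
 */
static uint32_t set_logical_ids(uint32_t ebx, uint32_t threads_per_socket)
{
    /* Saturate at 255, the largest value the 8-bit field can hold. */
    if (threads_per_socket > 255) {
        threads_per_socket = 255;
    }
    /* Invariant after the clamp: the value fits in 8 bits. */
    assert(threads_per_socket <= 0xff);

    if (threads_per_socket > 1) {
        /* Clear the field, then mask the rhs so it cannot spill outside it. */
        ebx &= ~EBX_LOGICAL_IDS_MASK;
        ebx |= (threads_per_socket << EBX_LOGICAL_IDS_SHIFT) & EBX_LOGICAL_IDS_MASK;
    }
    return ebx;
}
```

With the clamp in place the mask is redundant for this one field, but it keeps the assignment safe if the clamp is ever changed or removed. The sketch leaves out the CPUID_HT update in edx that the real code performs in the same branch.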