On 10/09/19 12:23, Igor Mammedov wrote:
> On Tue, 8 Oct 2019 23:12:10 +0200
> Laszlo Ersek <ler...@redhat.com> wrote:
>
>> On 10/08/19 17:06, Igor Mammedov wrote:
>>> On Tue, 8 Oct 2019 13:27:14 +0200
>>> Laszlo Ersek <ler...@redhat.com> wrote:
>>>
>>>> MaxCpuCountInitialization() currently handles the following cases:
>>>>
>>>> (1) QEMU does not report the boot CPU count.
>>>>
>>>> In this case, PlatformPei makes MpInitLib enumerate APs up to the
>>>> default PcdCpuMaxLogicalProcessorNumber value (64) minus 1, or until
>>>> the default PcdCpuApInitTimeOutInMicroSeconds (50,000) elapses.
>>>> (Whichever is reached first.)
>>>>
>>>> Time-limited AP enumeration has never been reliable on QEMU/KVM,
>>>> which is why commit 45a70db3c3a5 started handling case (2) below, in
>>>> OVMF.
>>>>
>>>> (2) QEMU reports the boot CPU count.
>>>>
>>>> In this case, PlatformPei sets PcdCpuMaxLogicalProcessorNumber to the
>>>> reported boot CPU count, and PcdCpuApInitTimeOutInMicroSeconds to
>>>> practically "infinity" (MAX_UINT32, ~71 minutes). That causes
>>>> MpInitLib to enumerate exactly the available (boot) APs.
>>>>
>>>> With CPU hotplug in mind, this method is not good enough. While
>>>> UefiCpuPkg expects PcdCpuMaxLogicalProcessorNumber to include both
>>>> boot (i.e., cold-plugged) and all *potentially* hot-plugged logical
>>>> processors, the boot CPU count reported by QEMU does not cover the
>>>> second category.
>>>
>>> Can you elaborate why it doesn't cover the second category?
>>
>> (Elaborate just in this thread, or elaborate in the commit message?)
>>
>> The boot CPU count reported by QEMU does not cover all potentially
>> hot-plugged logical processors ... because it is not supposed to.
>>
>> Namely, FW_CFG_NB_CPUS exposes "pcms->boot_cpus", and that is not
>> supposed to cover all potentially hot-plugged CPUs, only the boot-time
>> CPUs.
> "Potentially hotplugged" confuses me greatly in this context (I'm not
> sure if you're talking about hotplugged CPUs, or CPUs that don't exist
> yet).
>
> If it's about the possible CPUs limit, I'd use "possible CPUs" instead
> (which includes present and not present CPUs).

Good idea. I was struggling to find the correct term. I realize the
"possible CPUs" term has been used in QEMU forever. I'll update the
commit message.
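
(For context: the pre-patch case (2) handling quoted above boils down,
conceptually, to something like the sketch below. This is a minimal,
simplified sketch -- the function name is made up, error handling is
reduced to ASSERTs, and it is not the literal PlatformPei code; it just
assumes the usual QemuFwCfgLib / PcdLib / DebugLib interfaces.)

  #include <PiPei.h>
  #include <Library/DebugLib.h>
  #include <Library/PcdLib.h>
  #include <Library/QemuFwCfgLib.h>

  STATIC
  VOID
  SketchMaxCpuCountInitialization (
    VOID
    )
  {
    UINT16         BootCpuCount;
    RETURN_STATUS  PcdStatus;

    //
    // FW_CFG_NB_CPUS ("pcms->boot_cpus" on the QEMU side) is exposed to
    // the firmware as the QemuFwCfgItemSmpCpuCount fw_cfg item.
    //
    QemuFwCfgSelectItem (QemuFwCfgItemSmpCpuCount);
    BootCpuCount = QemuFwCfgRead16 ();

    if (BootCpuCount == 0) {
      //
      // Case (1): no boot CPU count reported; keep the PCD defaults
      // (64 logical processors, 50,000 microsecond AP timeout).
      //
      return;
    }

    //
    // Case (2): enumerate exactly the boot CPUs, with a practically
    // infinite AP enumeration timeout.
    //
    PcdStatus = PcdSet32S (PcdCpuMaxLogicalProcessorNumber, BootCpuCount);
    ASSERT_RETURN_ERROR (PcdStatus);
    PcdStatus = PcdSet32S (PcdCpuApInitTimeOutInMicroSeconds, MAX_UINT32);
    ASSERT_RETURN_ERROR (PcdStatus);
  }

(And that is exactly what breaks with CPU hotplug: BootCpuCount only
reflects the present CPUs, not the possible ones.)
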
>
>
>> Whereas PcdCpuMaxLogicalProcessorNumber needs to include all
>> potentially hot-plugged CPUs too. For that, we need to expose
>> "ms->smp.max_cpus" from QEMU.
>>
>> Does this answer your question?
> You answered the question better below
>
> [...]
>
>> When a CPU is hotplugged at OS runtime, "pcms->boot_cpus" increases in
>> QEMU. Before the patch, the "pcms->boot_cpus" increase causes
>> PcdCpuMaxLogicalProcessorNumber to increase as well. That breaks S3
>> resume. PcdCpuMaxLogicalProcessorNumber must not change from initial
>> boot to S3 resume.
> Probably something along these lines should be put in the commit
> message, clearly stating the requirement for max_cpus.

OK, I'll work that into the commit message too.

Thanks!
Laszlo

>
>>
>> After the patch, the "pcms->boot_cpus" increase does not increase
>> PcdCpuMaxLogicalProcessorNumber. PcdCpuMaxLogicalProcessorNumber
>> remains the same (namely, "ms->smp.max_cpus"). Therefore, the CPU
>> structures allocated during normal boot, for "ms->smp.max_cpus", will
>> accommodate the CPUs that have been hotplugged prior to S3 resume.
>>
>> I tested the S3 behavior; in addition, the values mentioned above are
>> logged to the OVMF debug log, via this patch:
>>
>>> +  DEBUG ((DEBUG_INFO, "%a: BootCpuCount=%d mMaxCpuCount=%u\n",
>>> +    __FUNCTION__, BootCpuCount, mMaxCpuCount));
>>
>> This is one libvirt domain XML snippet I've used for testing:
>>
>>   <vcpu placement='static' current='3'>4</vcpu>
>>   <vcpus>
>>     <vcpu id='0' enabled='yes' hotpluggable='no'/>
>>     <vcpu id='1' enabled='yes' hotpluggable='no'/>
>>     <vcpu id='2' enabled='yes' hotpluggable='no'/>
>>     <vcpu id='3' enabled='no' hotpluggable='yes'/>
>>   </vcpus>
>>   <os>
>>     <type arch='x86_64' machine='pc-i440fx-4.2'>hvm</type>
>>   ...
>>   <cpu mode='custom' match='exact' check='partial'>
>>     <model fallback='allow'>Haswell-noTSX</model>
>>     <topology sockets='1' cores='2' threads='2'/>
>>     <feature policy='require' name='vmx'/>
>>     <feature policy='require' name='pcid'/>
>>     <feature policy='require' name='spec-ctrl'/>
>>     <feature policy='require' name='pdpe1gb'/>
>>   </cpu>
>>
>> And the command used for testing was:
>>
>>   $ virsh setvcpu GUEST 3 --enable --live
>>
>> When this is done, the guest kernel dmesg confirms the CPU hotplug,
>> but it is not onlined automatically. I onlined it manually via /sys in
>> the guest.
>>
>> After that point, I exercised S3 suspend / resume. During S3 resume,
>> the "BootCpuCount" log (from above) increases from 3 to 4 (and the
>> related edk2 infrastructure messages are in sync). All the while,
>> "mMaxCpuCount" preserves its initial value, 4.
>>
>> Finally, after S3 resume, I ran "rdmsr -a 0x3a" in the guest, which
>> printed "5" for all *four* CPUs. This confirms that OVMF has set the
>> Feature Control MSR on the new VCPU too, during S3 resume.
>>
>> (The MSR is zeroed on platform reset, and has to be re-configured by
>> firmware, on every CPU, during S3. It is set to 5 due to the "vmx" CPU
>> feature in the above domain XML snippet. References:
>>
>> - https://bugzilla.tianocore.org/show_bug.cgi?id=86
>> - https://github.com/tianocore/tianocore.github.io/wiki/Testing-SMM-with-QEMU,-KVM-and-libvirt#confirm-simple-multiprocessing-during-boot
>>
>> The specific meaning of the MSR is totally irrelevant here; it's just
>> a good use case for testing multi-processing during S3 resume. For MSR
>> 0x3A (Feature Control), the firmware must execute the wrmsr
>> instruction on each CPU, and the "rdmsr -a" Linux userspace command
>> reads back the MSR on each CPU separately.)
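
(To illustrate that last point with some code: conceptually, the
firmware-side work is the sketch below -- run a small function on the BSP
and on every AP that writes the Feature Control MSR. The function names
are made up, the value 5 is hard-coded only for the sketch, and this is
not the literal OVMF code; it just assumes the PEI MP Services PPI and
BaseLib's MSR intrinsics.)

  #include <PiPei.h>
  #include <Library/BaseLib.h>
  #include <Ppi/MpServices.h>

  //
  // Runs on one processor at a time: (re-)program MSR 0x3A (Feature
  // Control), which is zeroed on platform reset, hence has to be written
  // again during S3 resume.
  //
  STATIC
  VOID
  EFIAPI
  SketchWriteFeatureControl (
    IN OUT VOID  *Buffer
    )
  {
    AsmWriteMsr64 (0x3A, *(UINT64 *)Buffer);
  }

  STATIC
  VOID
  SketchProgramAllCpus (
    IN CONST EFI_PEI_SERVICES   **PeiServices,
    IN EFI_PEI_MP_SERVICES_PPI  *MpServices
    )
  {
    UINT64  FeatureControl;

    //
    // 5 = lock bit | "VMX outside SMX" -- matching the "vmx" CPU feature
    // in the domain XML above. (In reality the value is not hard-coded;
    // error handling is also omitted in this sketch.)
    //
    FeatureControl = 5;

    SketchWriteFeatureControl (&FeatureControl);  // the BSP itself
    MpServices->StartupAllAPs (                   // then every AP
                  PeiServices,
                  MpServices,
                  SketchWriteFeatureControl,
                  FALSE,             // SingleThread: APs may run concurrently
                  0,                 // TimeoutInMicroSeconds: no limit
                  &FeatureControl
                  );
  }

(During S3 resume the same per-CPU programming has to be repeated, which
is why it matters that the hot-added VCPU is covered as well.)
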
>>
>> Thanks!
>> Laszlo