Hi Salil,

> On 13 Jun 2024, at 23:36, Salil Mehta <salil.me...@huawei.com> wrote:
>
> PROLOGUE
> ========
>
> To assist in review and to set the right expectations for this RFC, please
> first read the sections *APPENDED AT THE END* of this cover letter:
>
> 1. Important *DISCLAIMER* [Section (X)]
> 2. Work presented at KVM Forum conferences (slides available) [Section (V)F]
> 3. Organization of patches [Section (XI)]
> 4. References [Section (XII)]
> 5. Detailed TODO list of leftover or in-progress work [Section (IX)]
>
> Other organizations have shown interest in adapting this series for their
> architectures. Hence, RFC V2 [21] has been split into architecture-*agnostic*
> [22] and architecture-*specific* patch sets.
>
> This is the ARM architecture-specific patch set carved out of RFC V2. Please
> check section (XI)B for details of the architecture-agnostic patches.
>
> SECTIONS [I - XIV] are as follows:
>
> (I) Key Changes [details in last section (XIV)]
> ===============================================
>
> RFC V2 -> RFC V3
>
> 1. Split into architecture-*agnostic* (V13) [22] and architecture-*specific*
>    (RFC V3) patch sets.
> 2. Addressed comments by Gavin Shan (RedHat), Shaoqin Huang (RedHat),
>    Philippe Mathieu-Daudé (Linaro), Jonathan Cameron (Huawei), and
>    Zhao Liu (Intel).
I've tested this series together with the v10 kernel patches from [1],
covering the following items:

  1. Boot.
  2. Hotplug up to maxcpus.
  3. Hot-unplug down to the number of boot CPUs.
  4. Hotplug of vCPUs, then migration to a new VM.
  5. Hot-unplug down to the number of boot CPUs, then migration to a new VM.
  6. Up to 6 successive live migrations.

All of which LGTM. Please feel free to add:

Tested-by: Miguel Luis <miguel.l...@oracle.com>

Regards,
Miguel

[1] https://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux.git/log/?h=for-next/vcpu-hotplug

> RFC V1 -> RFC V2
>
> RFC V1:
> https://lore.kernel.org/qemu-devel/20200613213629.21984-1-salil.me...@huawei.com/
>
> 1. The ACPI MADT GIC CPU Interface can now be presented [6] as ACPI
>    *online-capable* or *enabled* to the Guest OS at boot time. This means
>    associated CPUs can have their ACPI _STA as *enabled* or *disabled* even
>    after boot. See UEFI ACPI 6.5 Spec, Section 05, Table 5.37, GICC CPU
>    Interface Flags [20]. (A flags sketch follows this list.)
> 2. SMCCC/HVC hypercall exit handling in userspace/Qemu for PSCI CPU_{ON,OFF}
>    requests. This is required to {dis}allow onlining a vCPU.
> 3. Unplugged vCPUs are always presented in the CPU ACPI AML code as ACPI
>    _STA.PRESENT to the Guest OS. ACPI _STA.Enabled is toggled to give the
>    effect of hot(un)plug.
> 4. Live migration works (some issues remain).
> 5. TCG/HVF/qtest do not support hotplug and fall back to the default
>    behavior.
> 6. Code for TCG support exists in this release (it is a work in progress).
> 7. The ACPI _OSC method can now be used by the OSPM to negotiate the Qemu VM
>    platform hotplug capability (_OSC Query support is still pending).
> 8. Misc. bug fixes.
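For reference, a minimal C sketch of the GICC flag encoding that item 1 above
refers to, per the ACPI 6.5 GICC flags table [20]. The macro and helper names
are mine for illustration; they are not taken from the patches:

    #include <stdbool.h>
    #include <stdint.h>

    /* MADT GICC flag bits per ACPI 6.5, Table 5.37 [20]; names are mine. */
    #define GICC_FLAG_ENABLED        (1u << 0)   /* vCPU usable at boot       */
    #define GICC_FLAG_ONLINE_CAPABLE (1u << 3)   /* vCPU may be enabled later */

    /* The two bits are mutually exclusive per the current spec [6][7]: a
     * GICC entry is either Enabled (cold-booted vCPU) or online-capable
     * (hot-pluggable vCPU), never both. */
    static uint32_t gicc_flags(bool cold_booted)
    {
        return cold_booted ? GICC_FLAG_ENABLED : GICC_FLAG_ONLINE_CAPABLE;
    }

Known issue 4 in section (VIII) below is precisely about these two bits being
mutually exclusive.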
> (II) Summary
> ============
>
> This patch set introduces virtual CPU hotplug support for the ARMv8
> architecture in QEMU. The idea is to be able to hotplug and hot-unplug vCPUs
> while the guest VM is running, without requiring a reboot. This does *not*
> make any assumptions about the physical CPU hotplug availability within the
> host system, but rather tries to solve the problem at the virtualizer/QEMU
> layer. It introduces ACPI CPU hotplug hooks and event handling to interface
> with the guest kernel, and code to initialize, plug, and unplug CPUs. No
> changes are required within the host kernel/KVM except for support of
> hypercall exit handling in userspace/Qemu, which has recently been added to
> the kernel. Corresponding guest kernel changes have been posted on the
> mailing list [3] [4] by James Morse.
>
> (III) Motivation
> ================
>
> This allows scaling the guest VM compute capacity on demand, which would be
> useful for the following example scenarios:
>
> 1. Vertical Pod Autoscaling [9][10] in the cloud: part of the orchestration
>    framework that could adjust resource requests (CPU and memory requests)
>    for the containers in a pod, based on usage.
> 2. Pay-as-you-grow business model: infrastructure providers could allocate
>    and restrict the total number of compute resources available to the
>    guest VM according to the SLA (Service Level Agreement). VM owners could
>    request more compute to be hot-plugged for some cost.
>
> For example, a Kata Container VM starts with a minimal amount of resources
> (i.e., a hotplug-everything approach). Why?
>
> 1. It allows a faster *boot time*, and
> 2. It reduces the *memory footprint*.
>
> A Kata Container VM can boot with just 1 vCPU, and more vCPUs can be
> hot-plugged later as needed.
>
> (IV) Terminology
> ================
>
> (*) Possible CPUs: Total vCPUs that could ever exist in the VM. This
>                    includes any cold-booted CPUs plus any CPUs that could be
>                    later hot-plugged.
>                    - Qemu parameter (-smp maxcpus=N)
> (*) Present CPUs:  Possible CPUs that are ACPI 'present'. These might or
>                    might not be ACPI 'enabled'.
>                    - Present vCPUs = Possible vCPUs (always, on the ARM arch)
> (*) Enabled CPUs:  Possible CPUs that are ACPI 'present' and 'enabled' and
>                    can now be 'onlined' (PSCI) for use by the Guest Kernel.
>                    All cold-booted vCPUs are ACPI 'enabled' at boot. Later,
>                    using device_add, more vCPUs can be hotplugged and made
>                    ACPI 'enabled'.
>                    - Qemu parameter (-smp cpus=N). Can be used to specify
>                      some cold-booted vCPUs during VM init. Some can be
>                      added using the '-device' option. (An illustration
>                      follows this list.)
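To make the three terms concrete, a small illustrative sketch (the struct and
the values are mine, not from the series) of how the '-smp cpus=4,maxcpus=6'
configuration used in section (VI) maps onto them:

    /* Illustrative mapping only, for: -smp cpus=4,maxcpus=6 */
    struct vcpu_counts {
        int possible;   /* 6: maxcpus; fixed for the life of the VM           */
        int present;    /* 6: on ARM, always equal to possible (ACPI present) */
        int enabled;    /* 4: cold-booted vCPUs, ACPI enabled at boot         */
    };

    static const struct vcpu_counts example = { 6, 6, 4 };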
> (V) Constraints Due to ARMv8 CPU Architecture [+] Other Impediments
> ===================================================================
>
> A. Physical Limitation to Support CPU Hotplug: (Architectural Constraint)
>    1. The ARMv8 CPU architecture does not support the concept of physical
>       CPU hotplug.
>       a. There are many per-CPU components like PMU, SVE, MTE, arch timers,
>          etc., whose behavior needs to be clearly defined when the CPU is
>          hot(un)plugged. There is no specification for this.
>    2. Other ARM components like the GIC, etc., have not been designed to
>       realize physical CPU hotplug capability as of now. For example,
>       a. Every physical CPU has a unique GICC (GIC CPU Interface) by
>          construct. The architecture does not specify what CPU hot(un)plug
>          would mean in the context of any of these.
>       b. CPUs/GICCs are physically connected to unique GICRs
>          (GIC Redistributors). GIC Redistributors are always part of the
>          always-on power domain. Hence, they cannot be powered off, per the
>          specification.
>
> B. Impediments in Firmware/ACPI (Architectural Constraint)
>
>    1. Firmware has to expose the GICC, GICR, and other per-CPU features
>       like PMU, SVE, MTE, arch timers, etc., to the OS. Due to the
>       architectural constraint stated in section A1(a), all interrupt
>       controller structures of the MADT describing the GIC CPU Interfaces
>       and the GIC Redistributors MUST be presented by firmware to the OSPM
>       at boot time.
>    2. Architectures that support CPU hotplug can evaluate the ACPI _MAT
>       method to get this kind of information from the firmware even after
>       boot, and the OSPM has the capability to process it. The ARM kernel
>       uses the information in the MADT interrupt controller structures to
>       identify the number of present CPUs during boot, and does not allow
>       this number to change after boot. It is an architectural constraint!
>
> C. Impediments in KVM to Support Virtual CPU Hotplug (Architectural
>    Constraint)
>
>    1. KVM VGIC:
>       a. Sizing of various VGIC resources, like the memory regions related
>          to the redistributor, happens only once, is fixed at VM init time,
>          and cannot be changed after initialization has happened. KVM
>          statically configures these resources based on the number of vCPUs
>          and the number/size of redistributor ranges.
>       b. The association between a vCPU and its VGIC redistributor is fixed
>          at VM init time within KVM, i.e., when the redistributor iodevs
>          get registered. The VGIC does not allow this association to be set
>          up or changed after VM initialization has happened. Physically,
>          every CPU/GICC is uniquely connected with its redistributor, and
>          there is no architectural way to change this.
>    2. KVM vCPUs:
>       a. Lack of specification means destruction of KVM vCPUs does not
>          exist, as there is no reference for what to do with the other
>          per-vCPU components like redistributors, arch timers, etc.
>       b. In fact, KVM does not implement the destruction of vCPUs for any
>          architecture. This is independent of whether the architecture
>          actually supports the CPU hotplug feature. For example, even for
>          x86, KVM does not implement the destruction of vCPUs.
>
> D. Impediments in Qemu to Support Virtual CPU Hotplug (KVM Constraints->Arch)
>
>    1. Qemu CPU objects MUST be created to initialize all the host KVM vCPUs
>       to overcome the KVM constraint. KVM vCPUs are created and initialized
>       when Qemu CPU objects are realized. But keeping the QOM CPU objects
>       realized for 'yet-to-be-plugged' vCPUs can create problems when these
>       new vCPUs are plugged using device_add and a new QOM CPU object is
>       created.
>    2. GICV3State and GICV3CPUState objects MUST be sized over *possible
>       vCPUs* at VM init time, while the QOM GICV3 object is realized. This
>       is because the KVM VGIC can only be initialized once, at init time.
>       But every GICV3CPUState has an associated QOM CPU object, and the
>       latter might correspond to vCPUs that are 'yet-to-be-plugged'
>       (unplugged at init). (A sizing sketch follows this list.)
>    3. How should new QOM CPU objects be connected back to the GICV3CPUState
>       objects and disconnected from them when a CPU is hot(un)plugged?
>    4. How should 'unplugged' or 'yet-to-be-plugged' vCPUs be represented in
>       the QOM when a KVM vCPU already exists for them? For example, whether
>       to keep,
>       a. No QOM CPU objects, or
>       b. Unrealized CPU objects
>    5. How should the vCPU state be exposed via ACPI to the Guest, especially
>       for the unplugged/yet-to-be-plugged vCPUs whose CPU objects might not
>       exist within the QOM, while the Guest always expects all possible
>       vCPUs to be identified as ACPI *present* during boot?
>    6. How should Qemu expose the GIC CPU interfaces for the unplugged or
>       yet-to-be-plugged vCPUs to the Guest using the ACPI MADT table?
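A hedged sketch of the sizing constraint in D2, loosely modeled on how a
GICv3 realize function could allocate its per-CPU state. The types and names
are stand-ins, not the actual QEMU declarations:

    #include <glib.h>

    typedef struct { int dummy; } GICv3SketchCPUState;   /* stand-in type */
    typedef struct {
        int num_cpu;
        GICv3SketchCPUState *cpu;
    } GICv3SketchState;

    /* At realize time, per-CPU GIC state is allocated once, over all
     * *possible* vCPUs, because the KVM VGIC cannot be resized later. */
    static void gicv3_realize_sketch(GICv3SketchState *s, int max_cpus)
    {
        s->num_cpu = max_cpus;                    /* possible vCPUs, fixed */
        s->cpu = g_new0(GICv3SketchCPUState, s->num_cpu);
    }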
> E. Summary of Approach ([+] Workarounds to problems in sections A, B, C & D)
>
>    1. At VM init, pre-create all the possible vCPUs in the host KVM, i.e.,
>       even the vCPUs which are yet-to-be-plugged in Qemu, but keep them in
>       the powered-off state.
>    2. After the KVM vCPUs have been initialized in the host, the KVM vCPU
>       objects corresponding to the unplugged/yet-to-be-plugged vCPUs are
>       parked on the existing per-VM "kvm_parked_vcpus" list in Qemu
>       (similar to x86). (A parking sketch follows this list.)
>    3. GICV3State and GICV3CPUState objects are sized over possible vCPUs at
>       VM init time, i.e., when the Qemu GIC is realized. This, in turn,
>       sizes the KVM VGIC resources, like the memory regions related to the
>       redistributors, for the number of possible KVM vCPUs. This never
>       changes after the VM has been initialized.
>    4. Qemu CPU objects corresponding to unplugged/yet-to-be-plugged vCPUs
>       are released after host KVM vCPU and GIC/VGIC initialization.
>    5. Build the ACPI MADT table with the following updates:
>       a. Number of GIC CPU interface entries (= possible vCPUs)
>       b. Present boot vCPUs as MADT.GICC.Enabled=1 (not hot[un]pluggable)
>       c. Present hot(un)pluggable vCPUs as MADT.GICC.online-capable=1
>          - MADT.GICC.Enabled=0 (mutually exclusive) [6][7]
>          - vCPU can be ACPI enabled+onlined after the Guest boots
>            (firmware policy)
>          - Some issues exist with the above (details in later sections)
>    6. Expose the below ACPI status to the Guest kernel:
>       a. _STA.Present=1, always (all possible vCPUs)
>       b. _STA.Enabled=1 (plugged vCPUs)
>       c. _STA.Enabled=0 (unplugged vCPUs)
>    7. vCPU hotplug *realizes* a new QOM CPU object. The following happens:
>       a. Realizes and initializes the QOM CPU object & spawns the Qemu vCPU
>          thread.
>       b. Unparks the existing KVM vCPU ("kvm_parked_vcpus" list).
>          - Attaches it to the QOM CPU object.
>       c. Reinitializes the KVM vCPU in the host.
>          - Resets the core and sys regs, sets defaults, etc.
>       d. Runs the KVM vCPU (created with "start-powered-off").
>          - The vCPU thread sleeps (waits for a vCPU reset via PSCI).
>       e. Updates the Qemu GIC.
>          - Wires back the IRQs related to this vCPU.
>          - Associates the GICV3CPUState with the QOM CPU object.
>       f. Updates [6] ACPI _STA.Enabled=1.
>       g. Notifies the Guest about the new vCPU (via the ACPI GED interface).
>          - Guest checks _STA.Enabled=1.
>          - Guest adds the processor (registers the CPU with the LDM) [3].
>       h. Plugs the QOM CPU object into the slot.
>          - slot-number = cpu-index {socket, cluster, core, thread}.
>       i. Guest onlines the vCPU (CPU_ON PSCI call over HVC/SMC).
>          - KVM exits the HVC/SMC hypercall [5] to Qemu (policy check).
>          - Qemu powers on the KVM vCPU in the host.
>    8. vCPU hot-unplug *unrealizes* the QOM CPU object. The following
>       happens:
>       a. Notifies the Guest (via the ACPI GED interface) of the vCPU
>          hot-unplug event.
>          - Guest offlines the vCPU (CPU_OFF PSCI call over HVC/SMC).
>       b. KVM exits the HVC/SMC hypercall [5] to Qemu (policy check).
>          - Qemu powers off the KVM vCPU in the host.
>       c. Guest signals *Eject* vCPU to Qemu.
>       d. Qemu updates [6] ACPI _STA.Enabled=0.
>       e. Updates the GIC.
>          - Un-wires the IRQs related to this vCPU.
>          - The GICV3CPUState association with the QOM CPU object is
>            updated.
>       f. Unplugs the vCPU.
>          - Removes it from the slot.
>          - Parks the KVM vCPU ("kvm_parked_vcpus" list).
>          - Unrealizes the QOM CPU object & joins back the Qemu vCPU thread.
>          - Destroys the QOM CPU object.
>       g. Guest checks ACPI _STA.Enabled=0.
>          - Removes the processor (unregisters the CPU with the LDM) [3].
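A minimal sketch of the park/unpark idea from steps 2, 7b, and 8f. The list
handling mirrors the per-VM "kvm_parked_vcpus" list the x86 path already
uses, but the reduced state struct and function names here are illustrative:

    #include <glib.h>
    #include "qemu/queue.h"

    typedef struct KVMParkedVcpu {
        unsigned long vcpu_id;              /* derived from the cpu index  */
        int kvm_fd;                         /* fd from KVM_CREATE_VCPU,
                                             * kept open while parked      */
        QLIST_ENTRY(KVMParkedVcpu) node;
    } KVMParkedVcpu;

    /* Reduced stand-in for the relevant part of KVMState. */
    typedef struct {
        QLIST_HEAD(, KVMParkedVcpu) kvm_parked_vcpus;
    } KVMStateSketch;

    /* Park the KVM vCPU fd of an unplugged vCPU so it survives unrealize. */
    static void park_vcpu(KVMStateSketch *s, unsigned long vcpu_id, int kvm_fd)
    {
        KVMParkedVcpu *v = g_new0(KVMParkedVcpu, 1);

        v->vcpu_id = vcpu_id;
        v->kvm_fd = kvm_fd;
        QLIST_INSERT_HEAD(&s->kvm_parked_vcpus, v, node);
    }

    /* On hotplug, re-attach the parked fd to the newly realized QOM CPU. */
    static int unpark_vcpu(KVMStateSketch *s, unsigned long vcpu_id)
    {
        KVMParkedVcpu *v;

        QLIST_FOREACH(v, &s->kvm_parked_vcpus, node) {
            if (v->vcpu_id == vcpu_id) {
                int fd = v->kvm_fd;
                QLIST_REMOVE(v, node);
                g_free(v);
                return fd;
            }
        }
        return -1;    /* not parked; the caller would create a fresh vCPU */
    }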
> F. Work Presented at KVM Forum Conferences:
>    ========================================
>
> Details of the above work have been presented at the KVMForum2020 and
> KVMForum2023 conferences. Slides & videos are available at the links below:
> a. KVMForum 2023
>    - Challenges Revisited in Supporting Virt CPU Hotplug on architectures
>      that don't Support CPU Hotplug (like ARM64).
>      https://kvm-forum.qemu.org/2023/KVM-forum-cpu-hotplug_7OJ1YyJ.pdf
>      https://kvm-forum.qemu.org/2023/Challenges_Revisited_in_Supporting_Virt_CPU_Hotplug_-__ii0iNb3.pdf
>      https://www.youtube.com/watch?v=hyrw4j2D6I0&t=23970s
>      https://kvm-forum.qemu.org/2023/talk/9SMPDQ/
> b. KVMForum 2020
>    - Challenges in Supporting Virtual CPU Hotplug on SoC Based Systems
>      (like ARM64) - Salil Mehta, Huawei.
>      https://sched.co/eE4m
>
> (VI) Commands Used
> ==================
>
> A. Qemu launch commands to init the machine:
>
>    $ qemu-system-aarch64 --enable-kvm -machine virt,gic-version=3 \
>      -cpu host -smp cpus=4,maxcpus=6 \
>      -m 300M \
>      -kernel Image \
>      -initrd rootfs.cpio.gz \
>      -append "console=ttyAMA0 root=/dev/ram rdinit=/init maxcpus=2 acpi=force" \
>      -nographic \
>      -bios QEMU_EFI.fd
>
> B. Hot-(un)plug related commands:
>
>    # Hotplug a host vCPU (accel=kvm):
>    $ device_add host-arm-cpu,id=core4,core-id=4
>
>    # Hotplug a vCPU (accel=tcg):
>    $ device_add cortex-a57-arm-cpu,id=core4,core-id=4
>
>    # Delete the vCPU:
>    $ device_del core4
>
> Sample output on the guest after boot:
>
>    $ cat /sys/devices/system/cpu/possible
>    0-5
>    $ cat /sys/devices/system/cpu/present
>    0-5
>    $ cat /sys/devices/system/cpu/enabled
>    0-3
>    $ cat /sys/devices/system/cpu/online
>    0-1
>    $ cat /sys/devices/system/cpu/offline
>    2-5
>
> Sample output on the guest after hotplug of vCPU=4:
>
>    $ cat /sys/devices/system/cpu/possible
>    0-5
>    $ cat /sys/devices/system/cpu/present
>    0-5
>    $ cat /sys/devices/system/cpu/enabled
>    0-4
>    $ cat /sys/devices/system/cpu/online
>    0-1,4
>    $ cat /sys/devices/system/cpu/offline
>    2-3,5
>
> Note: vCPU=4 was explicitly 'onlined' after hot-plug:
>    $ echo 1 > /sys/devices/system/cpu/cpu4/online
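That final 'online' write is what eventually reaches Qemu as a PSCI CPU_ON
hypercall exit (section (V)E, step 7i). A hedged sketch of the userspace
side, assuming the SMCCC-style exit layout where kvm_run->hypercall carries
the function ID, arguments, and return value [5]; vcpu_is_acpi_enabled() and
the power-on step are hypothetical placeholders, not the actual patch code:

    #include <stdbool.h>
    #include <linux/kvm.h>
    #include <linux/psci.h>

    bool vcpu_is_acpi_enabled(unsigned long mpidr);  /* hypothetical helper */

    /* Illustrative userspace handling of a PSCI call forwarded by KVM. */
    static void handle_psci_hypercall(struct kvm_run *run)
    {
        switch (run->hypercall.nr) {
        case PSCI_0_2_FN64_CPU_ON:
            /* args[0] = target MPIDR, args[1] = entry point,
             * args[2] = context id */
            if (vcpu_is_acpi_enabled(run->hypercall.args[0])) {
                /* ...power on the matching KVM vCPU in the host... */
                run->hypercall.ret = PSCI_RET_SUCCESS;
            } else {
                run->hypercall.ret = PSCI_RET_DENIED;    /* policy check */
            }
            break;
        case PSCI_0_2_FN_CPU_OFF:
            /* ...power off the calling vCPU in the host; on success the
             * call does not return to the guest... */
            run->hypercall.ret = PSCI_RET_SUCCESS;
            break;
        }
    }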
> (VII) Latest Repository
> =======================
>
> (*) Latest Qemu RFC V3 (architecture-specific) patch set:
>     https://github.com/salil-mehta/qemu.git virt-cpuhp-armv8/rfc-v3
> (*) Latest Qemu V13 (architecture-agnostic) patch set:
>     https://github.com/salil-mehta/qemu.git virt-cpuhp-armv8/rfc-v3.arch.agnostic.v13
> (*) QEMU changes for vCPU hotplug can be cloned from the below site:
>     https://github.com/salil-mehta/qemu.git virt-cpuhp-armv8/rfc-v2
> (*) Guest kernel changes (by James Morse, ARM) are available here:
>     https://git.kernel.org/pub/scm/linux/kernel/git/morse/linux.git virtual_cpu_hotplug/rfc/v2
> (*) Leftover patches of the kernel are available here:
>     https://lore.kernel.org/lkml/20240529133446.28446-1-jonathan.came...@huawei.com/
>     https://github.com/salil-mehta/linux/commits/virtual_cpu_hotplug/rfc/v6.jic/ (not latest)
>
> (VIII) KNOWN ISSUES
> ===================
>
> 1. Migration has been lightly tested but has been found to be working.
> 2. TCG is broken.
> 3. HVF and qtest are not supported yet.
> 4. The ACPI MADT table flags [7] MADT.GICC.Enabled and
>    MADT.GICC.online-capable are mutually exclusive, i.e., per the change
>    [6], a vCPU cannot be both GICC.Enabled and GICC.online-capable. This
>    means:
>    [ Link: https://bugzilla.tianocore.org/show_bug.cgi?id=3706 ]
>    a. If we have to support hot-unplug of the cold-booted vCPUs, then these
>       MUST be specified as GICC.online-capable in the MADT table during
>       boot by the firmware/Qemu. But this requirement conflicts with the
>       requirement to support the new Qemu changes with legacy OSes that
>       don't understand the MADT.GICC.online-capable bit. A legacy OS will
>       ignore this bit at boot time, and hence these vCPUs will not appear
>       on such an OS. This is unexpected behavior.
>    b. In case we decide to specify vCPUs as MADT.GICC.Enabled and try to
>       unplug these cold-booted vCPUs from the OS (which in actuality should
>       be blocked by Qemu returning an error), then features like 'kexec'
>       will break.
>    c. As I understand it, removal of the cold-booted vCPUs is a required
>       feature, and the x86 world allows it.
>    d. Hence, either we need a specification change to make the
>       MADT.GICC.Enabled and MADT.GICC.online-capable bits NOT mutually
>       exclusive, or we must NOT support the removal of cold-booted vCPUs.
>       In the latter case, a check can be introduced to bar users from
>       unplugging vCPUs that were cold-booted, using QMP commands.
>       (Needs discussion!)
>       Please check the patch that is part of this patch set:
>       [hw/arm/virt: Expose cold-booted CPUs as MADT GICC Enabled].
>    NOTE: This is definitely not a blocker!
> 5. Code related to the notification to GICV3 about the hot(un)plug of a
>    vCPU event might need further discussion.
>
> (IX) THINGS TO DO
> =================
>
> 1. Fix issues related to TCG/emulation support. (Not a blocker)
> 2. Comprehensive testing is in progress. (Positive feedback from Oracle &
>    Ampere)
> 3. Qemu documentation (.rst) needs to be updated.
> 4. Fix qtest, HVF support (future).
> 5. Resolve the design issue related to the ACPI MADT.GICC flags discussed
>    in the known issues. This might require a UEFI ACPI specification
>    change. (Not a blocker)
> 6. Add ACPI _OSC 'Query' support. Only part of the _OSC support exists now.
>    (Not a blocker)
>
> The above is *not* a complete list. Will update later!
>
> Best regards,
> Salil.
>
> (X) DISCLAIMER
> ==============
>
> This work is an attempt to present a proof of concept of the ARM64 vCPU
> hotplug implementation to the community. This is *not* production-level
> code and might have bugs. Comprehensive testing is being done on the
> HiSilicon Kunpeng920 SoC and on Oracle and Ampere servers. We are nearing
> stable code, and a non-RFC version shall be floated soon.
>
> This work is *mostly* along the lines of the discussions that have happened
> over the previous years [see refs below] across different channels, like
> the mailing list, the Linaro Open Discussions platform, and various
> conferences like KVM Forum. This RFC is being used as a way to verify the
> idea mentioned in this cover letter and to get community views. Once this
> has been agreed upon, a formal patch shall be posted to the mailing list
> for review.
>
> [The concept being presented has been found to work!]
>
> (XI) ORGANIZATION OF PATCHES
> ============================
>
> A. Architecture-*specific* patches:
>
>    [Patch 1-8, 17, 27, 29] Logic required during machine init.
>    (*) Some validation checks.
>    (*) Introduces the core-id property and some util functions required
>        later.
>    (*) Logic to pre-create vCPUs.
>    (*) GIC initialization pre-sized with possible vCPUs.
>    (*) Some refactoring to have common hot and cold plug logic together.
>    (*) Release of disabled QOM CPU objects in post_cpu_init().
>    (*) Support of the ACPI _OSC method to negotiate platform hotplug
>        capabilities.
>    [Patch 9-16] Logic related to ACPI at machine init time.
>    (*) Changes required to enable ACPI for CPU hotplug.
>    (*) Initialization of the ACPI GED framework to cater to CPU hotplug
>        events.
>    (*) ACPI MADT/MAT changes.
>    [Patch 18-26] Logic required during vCPU hot-(un)plug.
>    (*) Basic framework changes to support vCPU hot-(un)plug.
>    (*) ACPI GED changes for hot-(un)plug hooks.
>    (*) Wire-unwire the IRQs.
>    (*) GIC notification logic.
>    (*) ARMCPU unrealize logic.
>    (*) Handling of SMCCC hypercall exits by KVM to Qemu.
>
> B. Architecture-*agnostic* patches:
>
>    [PATCH V13 0/8] Add architecture agnostic code to support vCPU Hotplug.
>    https://lore.kernel.org/qemu-devel/20240607115649.214622-1-salil.me...@huawei.com/T/#md0887eb07976bc76606a8204614ccc7d9a01c1f7
>    (*) Refactors vCPU creation, parking, and unparking logic, and adds
>        traces.
>    (*) Builds the ACPI AML related to the CPU control device.
>    (*) Changes related to the destruction of the CPU address space.
>    (*) Changes related to the uninitialization of the GDB stub.
>    (*) Updating of docs.
>
> (XII) REFERENCES
> ================
>
> [1] https://lore.kernel.org/qemu-devel/20200613213629.21984-1-salil.me...@huawei.com/
> [2] https://lore.kernel.org/linux-arm-kernel/20200625133757.22332-1-salil.me...@huawei.com/
> [3] https://lore.kernel.org/lkml/20230203135043.409192-1-james.mo...@arm.com/
> [4] https://lore.kernel.org/all/20230913163823.7880-1-james.mo...@arm.com/
> [5] https://lore.kernel.org/all/20230404154050.2270077-1-oliver.up...@linux.dev/
> [6] https://bugzilla.tianocore.org/show_bug.cgi?id=3706
> [7] https://uefi.org/specs/ACPI/6.5/05_ACPI_Software_Programming_Model.html#gic-cpu-interface-gicc-structure
> [8] https://bugzilla.tianocore.org/show_bug.cgi?id=4481#c5
> [9] https://cloud.google.com/kubernetes-engine/docs/concepts/verticalpodautoscaler
> [10] https://docs.aws.amazon.com/eks/latest/userguide/vertical-pod-autoscaler.html
> [11] https://lkml.org/lkml/2019/7/10/235
> [12] https://lists.cs.columbia.edu/pipermail/kvmarm/2018-July/032316.html
> [13] https://lists.gnu.org/archive/html/qemu-devel/2020-01/msg06517.html
> [14] https://op-lists.linaro.org/archives/list/linaro-open-discussi...@op-lists.linaro.org/thread/7CGL6JTACPUZEYQC34CZ2ZBWJGSR74WE/
> [15] http://lists.nongnu.org/archive/html/qemu-devel/2018-07/msg01168.html
> [16] https://lists.gnu.org/archive/html/qemu-devel/2020-06/msg00131.html
> [17] https://op-lists.linaro.org/archives/list/linaro-open-discussi...@op-lists.linaro.org/message/X74JS6P2N4AUWHHATJJVVFDI2EMDZJ74/
> [18] https://lore.kernel.org/lkml/20210608154805.216869-1-jean-phili...@linaro.org/
> [19] https://lore.kernel.org/all/20230913163823.7880-1-james.mo...@arm.com/
> [20] https://uefi.org/specs/ACPI/6.5/05_ACPI_Software_Programming_Model.html#gicc-cpu-interface-flags
> [21] https://lore.kernel.org/qemu-devel/20230926100436.28284-1-salil.me...@huawei.com/
> [22] https://lore.kernel.org/qemu-devel/20240607115649.214622-1-salil.me...@huawei.com/T/#md0887eb07976bc76606a8204614ccc7d9a01c1f7
>
> (XIII) ACKNOWLEDGEMENTS
> =======================
>
> I would like to take this opportunity to thank the below people for various
> discussions with me over different channels during the development:
>
> Marc Zyngier (Google),          Catalin Marinas (ARM),
> James Morse (ARM),              Will Deacon (Google),
> Jean-Philippe Brucker (Linaro), Sudeep Holla (ARM),
> Lorenzo Pieralisi (Linaro),     Gavin Shan (RedHat),
> Jonathan Cameron (Huawei),      Darren Hart (Ampere),
> Igor Mammedov (RedHat),         Ilkka Koskinen (Ampere),
> Andrew Jones (RedHat),          Karl Heubaum (Oracle),
> Keqian Zhu (Huawei),            Miguel Luis (Oracle),
> Xiongfeng Wang (Huawei),        Vishnu Pajjuri (Ampere),
> Shameerali Kolothum (Huawei),   Russell King (Oracle),
> Xuwei/Joy (Huawei),             Peter Maydell (Linaro),
> Zengtao/Prime (Huawei),         and all those whom I have missed!
>
> Many thanks to the following people for their current or past
> contributions:
>
> 1. James Morse (ARM)
>    (Current kernel part of vCPU hotplug support on AARCH64)
> 2. Jean-Philippe Brucker (Linaro)
>    (Prototyped one of the earlier PSCI-based POCs [17][18] based on RFC V1)
> 3. Keqian Zhu (Huawei)
>    (Co-developed the Qemu prototype)
> 4. Xiongfeng Wang (Huawei)
>    (Co-developed an earlier kernel prototype with me)
> 5. Vishnu Pajjuri (Ampere)
>    (Verification on Ampere ARM64 platforms + fixes)
> 6. Miguel Luis (Oracle)
>    (Verification on Oracle ARM64 platforms + fixes)
> 7. Russell King (Oracle) & Jonathan Cameron (Huawei)
>    (Helping in upstreaming James Morse's kernel patches)
>
> (XIV) Change Log:
> =================
>
> RFC V2 -> RFC V3:
> -----------------
> 1. Miscellaneous:
>    - Split the RFC V2 into arch-agnostic and arch-specific patch sets.
> 2. Addressed Gavin Shan's (RedHat) comments:
>    - Made the CPU property accessors inline.
>      https://lore.kernel.org/qemu-devel/6cd28639-2cfa-f233-c6d9-d5d2ec5b1...@redhat.com/
>    - Collected Reviewed-bys [PATCH RFC V2 4/37, 14/37, 22/37].
>    - Dropped a patch, as it was not required after the init logic was
>      refactored.
>      https://lore.kernel.org/qemu-devel/4fb2eef9-6742-1eeb-721a-b3db04b1b...@redhat.com/
>    - Fixed the range check for the core during vCPU plug.
>      https://lore.kernel.org/qemu-devel/1c5fa24c-6bf3-750f-4f22-087e4a931...@redhat.com/
>    - Added a has_hotpluggable_vcpus check to make build_cpus_aml()
>      conditional.
>      https://lore.kernel.org/qemu-devel/832342cb-74bc-58dd-c5d7-6f995baeb...@redhat.com/
>    - Fixed the states' initialization in cpu_hotplug_hw_init() to
>      accommodate previous refactoring.
>      https://lore.kernel.org/qemu-devel/da5e5609-1883-8650-c7d8-6868c7b74...@redhat.com/
>    - Fixed typos.
>      https://lore.kernel.org/qemu-devel/eb1ac571-7844-55e6-15e7-3dd7df213...@redhat.com/
>    - Removed an unnecessary 'goto fail'.
>      https://lore.kernel.org/qemu-devel/4d8980ac-f402-60d4-fe52-787815af8...@redhat.com/#t
>    - Added a check for hotpluggable vCPUs in the _OSC method.
>      https://lore.kernel.org/qemu-devel/20231017001326.FUBqQ1PTowF2GxQpnL3kIW0AhmSqbspazwixAHVSi6c@z/
> 3. Addressed Shaoqin Huang's (RedHat) comments:
>    - Fixed a compilation break due to a missing call to
>      virt_cpu_properties() along with its definition.
>      https://lore.kernel.org/qemu-devel/3632ee24-47f7-ae68-8790-26eb2cf99...@redhat.com/
> 4. Addressed Jonathan Cameron's (Huawei) comments:
>    - Gated the 'disabled vcpu message' for GIC version < 3.
>      https://lore.kernel.org/qemu-devel/20240116155911.00004...@huawei.com/
>
> RFC V1 -> RFC V2:
> -----------------
> 1. Addressed James Morse's (ARM) requirements, per the Linaro Open
>    Discussions:
>    - Exposed all possible vCPUs as always ACPI _STA.present and available
>      during boot time.
>    - Added the _OSC handling as required by James's patches.
>    - Introduced 'online-capable' bit handling in the flags of the MADT
>      GICC.
>    - SMCCC hypercall exit handling in Qemu.
> 2. Addressed Marc Zyngier's comment:
>    - Fixed the note about the GIC CPU Interface in the cover letter.
> 3. Addressed issues raised by Vishnu Pajjuri (Ampere) & Miguel Luis
>    (Oracle) during testing:
>    - Live/pseudo migration crashes.
> 4. Others:
>    - Introduced the concept of persistent vCPUs at the QOM level.
>    - Introduced wrapper APIs for present, possible, and persistent vCPUs.
>    - Changed the ACPI hotplug H/W init logic to initialize the is_present
>      and is_enabled states.
>    - Added a check to avoid unplugging cold-booted vCPUs.
>    - Disabled hotplugging with TCG/HVF/QTEST.
>    - Introduced the CPU topology {socket, cluster, core, thread}-id
>      properties.
>    - Extracted the virt CPU properties into a common virt_cpu_properties()
>      function.
>
> Author Salil Mehta (1):
>   target/arm/kvm,tcg: Register/Handle SMCCC hypercall exits to VMM/Qemu
>
> Jean-Philippe Brucker (2):
>   hw/acpi: Make _MAT method optional
>   target/arm/kvm: Write CPU state back to KVM on reset
>
> Miguel Luis (1):
>   tcg/mttcg: enable threads to unregister in tcg_ctxs[]
>
> Salil Mehta (25):
>   arm/virt,target/arm: Add new ARMCPU {socket,cluster,core,thread}-id
>     property
>   cpu-common: Add common CPU utility for possible vCPUs
>   hw/arm/virt: Limit number of possible vCPUs for unsupported Accel or
>     GIC Type
>   hw/arm/virt: Move setting of common CPU properties in a function
>   arm/virt,target/arm: Machine init time change common to vCPU
>     {cold|hot}-plug
>   arm/virt,kvm: Pre-create disabled possible vCPUs @machine init
>   arm/virt,gicv3: Changes to pre-size GIC with possible vcpus @machine
>     init
>   arm/virt: Init PMU at host for all possible vcpus
>   arm/acpi: Enable ACPI support for vcpu hotplug
>   arm/virt: Add cpu hotplug events to GED during creation
>   arm/virt: Create GED dev before *disabled* CPU Objs are destroyed
>   arm/virt/acpi: Build CPUs AML with CPU Hotplug support
>   arm/virt: Make ARM vCPU *present* status ACPI *persistent*
>   hw/acpi: ACPI/AML Changes to reflect the correct _STA.{PRES,ENA} Bits
>     to Guest
>   hw/arm: MADT Tbl change to size the guest with possible vCPUs
>   arm/virt: Release objects for *disabled* possible vCPUs after init
>   arm/virt: Add/update basic hot-(un)plug framework
>   arm/virt: Changes to (un)wire GICC<->vCPU IRQs during hot-(un)plug
>   hw/arm,gicv3: Changes to update GIC with vCPU hot-plug notification
>   hw/intc/arm-gicv3*: Changes required to (re)init the vCPU register
>     info
>   arm/virt: Update the guest(via GED) about CPU hot-(un)plug events
>   hw/arm: Changes required for reset and to support next boot
>   target/arm: Add support of *unrealize* ARMCPU during vCPU Hot-unplug
>   hw/arm: Support hotplug capability check using _OSC method
>   hw/arm/virt: Expose cold-booted CPUs as MADT GICC Enabled
>
>  accel/tcg/tcg-accel-ops-mttcg.c    |   1 +
>  cpu-common.c                       |  37 ++
>  hw/acpi/cpu.c                      |  62 +-
>  hw/acpi/generic_event_device.c     |  11 +
>  hw/arm/Kconfig                     |   1 +
>  hw/arm/boot.c                      |   2 +-
>  hw/arm/virt-acpi-build.c           | 113 +++-
>  hw/arm/virt.c                      | 877 +++++++++++++++++++++++------
>  hw/core/gpio.c                     |   2 +-
>  hw/intc/arm_gicv3.c                |   1 +
>  hw/intc/arm_gicv3_common.c         |  66 ++-
>  hw/intc/arm_gicv3_cpuif.c          | 269 +++++----
>  hw/intc/arm_gicv3_cpuif_common.c   |   5 +
>  hw/intc/arm_gicv3_kvm.c            |  39 +-
>  hw/intc/gicv3_internal.h           |   2 +
>  include/hw/acpi/cpu.h              |   2 +
>  include/hw/arm/boot.h              |   2 +
>  include/hw/arm/virt.h              |  38 +-
>  include/hw/core/cpu.h              |  78 +++
>  include/hw/intc/arm_gicv3_common.h |  23 +
>  include/hw/qdev-core.h             |   2 +
>  include/tcg/startup.h              |   7 +
>  target/arm/arm-powerctl.c          |  51 +-
>  target/arm/cpu-qom.h               |  18 +-
>  target/arm/cpu.c                   | 112 ++++
>  target/arm/cpu.h                   |  18 +
>  target/arm/cpu64.c                 |  15 +
>  target/arm/gdbstub.c               |   6 +
>  target/arm/helper.c                |  27 +-
>  target/arm/internals.h             |  14 +-
>  target/arm/kvm.c                   | 146 ++++-
>  target/arm/kvm_arm.h               |  25 +
>  target/arm/meson.build             |   1 +
>  target/arm/{tcg => }/psci.c        |   8 +
>  target/arm/tcg/meson.build         |   4 -
>  tcg/tcg.c                          |  24 +
>  36 files changed, 1749 insertions(+), 360 deletions(-)
>  rename target/arm/{tcg => }/psci.c (97%)
>
> --
> 2.34.1