Hi Miguel,

On Wed, Oct 16, 2024 at 3:09 PM Miguel Luis <miguel.l...@oracle.com> wrote:

> Hi Salil,
>
> > On 15 Oct 2024, at 09:59, Salil Mehta <salil.me...@huawei.com> wrote:
> >
> > PROLOGUE
> > ========
> >
> > To assist in review and set the right expectations from this RFC, please
> first
> > read the sections *APPENDED AT THE END* of this cover letter:
> >
> > 1. Important *DISCLAIMER* [Section (X)]
> > 2. Work presented at KVMForum Conference (slides available) [Section
> (V)F]
> > 3. Organization of patches [Section (XI)]
> > 4. References [Section (XII)]
> > 5. Detailed TODO list of leftover work or work-in-progress [Section (IX)]
> > 6. Repositories [Section (VII)]
> >
> > The architecture-agnostic patch set was merged into the mainline during
> the
> > last Qemu cycle. This patch is specific to the ARM architecture and is
> > compatible with the latest Qemu mainline version.
> >
> > SECTIONS [I - XIII] are as follows:
> >
> > (I) Summary of `Recent` Key Changes [details in last section (XIV)]
> > ===================================================================
> >
> > RFC V4 -> RFC V5
> >
> > 1. Dropped patches [PATCH RFC V4 {2,12,19}/33]
> > 2. Separated architecture agnostic ACPI/migration changes in separate
> patch-set.
> >   Link:
> https://lore.kernel.org/qemu-devel/20241014192205.253479-1-salil.me...@huawei.com/#t
> > 3. Dropped qemu{present,enabled}_cpu() APIs.
> > 4. Dropped the `CPUState::disabled` flag
> >
> > RFC V3 -> RFC V4
> >
> > 1. Fixes for TCG. It has been lightly tested but seem to work!
> > 2. Migration related fixes [Both KVM & TCG].
> > 3. Introduction of `gicc_accessble` flag for GICv3 CPU interface
> > 4. Addressed comments from Gavin Shan (RedHat), Nicholas Piggin (IBM),
> >   Alex Bennée's & Gustavo Romero (Linaro)
> > 5. Misc fixes and refatoring.
> >
> >
> > (II) Summary
> > ============
> >
> > This patch set introduces virtual CPU hotplug support for the ARMv8
> architecture
> > in QEMU. The idea is to be able to hotplug and hot-unplug vCPUs while
> the guest
> > VM is running, without requiring a reboot. This does *not* make any
> assumptions
> > about the physical CPU hotplug availability within the host system but
> rather
> > tries to solve the problem at the virtualizer/QEMU layer. It introduces
> ACPI CPU
> > hotplug hooks and event handling to interface with the guest kernel, and
> code to
> > initialize, plug, and unplug CPUs. No changes are required within the
> host
> > kernel/KVM except the support of hypercall exit handling in the
> user-space/Qemu,
> > which has recently been added to the kernel. Corresponding guest kernel
> changes
> > were posted on the mailing list [3] [4] by James Morse (ARM) and have
> been
> > recently accepted and are now part of v6.11 mainline kernel.
> >
> > (III) Motivation
> > ================
> >
> > This allows scaling the guest VM compute capacity on-demand, which would
> be
> > useful for the following example scenarios:
> >
> > 1. Vertical Pod Autoscaling [9][10] in the cloud: Part of the
> orchestration
> >   framework that could adjust resource requests (CPU and Mem requests)
> for
> >   the containers in a pod, based on usage.
> > 2. Pay-as-you-grow Business Model: Infrastructure providers could
> allocate and
> >   restrict the total number of compute resources available to the guest
> VM
> >   according to the SLA (Service Level Agreement). VM owners could
> request more
> >   compute to be hot-plugged for some cost.
> >
> > For example, Kata Container VM starts with a minimum amount of resources
> (i.e.,
> > hotplug everything approach). Why?
> >
> > 1. Allowing faster *boot time* and
> > 2. Reduction in *memory footprint*
> >
> > Kata Container VM can boot with just 1 vCPU, and then later more vCPUs
> can be
> > hot-plugged as needed. Reducing the number of vCPUs in VM can in general
> > reduce the boot times of the VM esepcially when number of cores are
> increasing.
> >
> > **[UPCOMING]**
> > I've been working on enhancing the boot times of ARM/VMs using the
> hotplug
> > infrastructure proposed in this patch set. Stay tuned for upcoming
> patches that
> > leverage this infrastructure to significantly reduce boot times for
> > *non-hotplug* scenarios. Expect these updates in the next few weeks!
> >
> > (IV) Terminology
> > ================
> >
> > (*) Possible CPUs: Total vCPUs that could ever exist in the VM. This
> includes
> >                   any cold-booted CPUs plus any CPUs that could be later
> >                   hot-plugged.
> >                   - Qemu parameter (-smp maxcpus=N)
> > (*) Present CPUs:  Possible CPUs that are ACPI 'present'. These might or
> might
> >                   not be ACPI 'enabled'.
> >                   - Present vCPUs = Possible vCPUs (Always on ARM Arch)
> > (*) Enabled CPUs:  Possible CPUs that are ACPI 'present' and 'enabled'
> and can
> >                   now be ‘onlined’ (PSCI) for use by the Guest Kernel.
> All cold-
> >                   booted vCPUs are ACPI 'enabled' at boot. Later, using
> >                   device_add, more vCPUs can be hotplugged and made ACPI
> >                   'enabled'.
> >                   - Qemu parameter (-smp cpus=N). Can be used to specify
> some
> >   cold-booted vCPUs during VM init. Some can be added using the
> >   '-device' option.
> >
> > (V) Constraints Due to ARMv8 CPU Architecture [+] Other Impediments
> > ===================================================================
> >
> > A. Physical Limitation to Support CPU Hotplug: (Architectural Constraint)
> >   1. ARMv8 CPU architecture does not support the concept of the physical
> CPU
> >      hotplug.
> >      a. There are many per-CPU components like PMU, SVE, MTE, Arch
> timers, etc.,
> >         whose behavior needs to be clearly defined when the CPU is
> > hot(un)plugged. There is no specification for this.
> >
> >   2. Other ARM components like GIC, etc., have not been designed to
> realize
> >      physical CPU hotplug capability as of now. For example,
> >      a. Every physical CPU has a unique GICC (GIC CPU Interface) by
> construct.
> >         Architecture does not specify what CPU hot(un)plug would mean in
> >         context to any of these.
> >      b. CPUs/GICC are physically connected to unique GICR (GIC
> Redistributor).
> >         GIC Redistributors are always part of the always-on power
> domain. Hence,
> >         they cannot be powered off as per specification.
> >
> > B. Impediments in Firmware/ACPI (Architectural Constraint)
> >
> >   1. Firmware has to expose GICC, GICR, and other per-CPU features like
> PMU,
> >      SVE, MTE, Arch Timers, etc., to the OS. Due to the architectural
> constraint
> >      stated in section A1(a), all interrupt controller structures of
> >      MADT describing GIC CPU Interfaces and the GIC Redistributors MUST
> be
> >      presented by firmware to the OSPM during boot time.
> >   2. Architectures that support CPU hotplug can evaluate the ACPI _MAT
> method to
> >      get this kind of information from the firmware even after boot, and
> the
> >      OSPM has the capability to process these. ARM kernel uses
> information in MADT
> >      interrupt controller structures to identify the number of present
> CPUs during
> >      boot and hence does not allow to change these after boot. The
> number of
> >      present CPUs cannot be changed. It is an architectural constraint!
> >
> > C. Impediments in KVM to Support Virtual CPU Hotplug (Architectural
> Constraint)
> >
> >   1. KVM VGIC:
> >      a. Sizing of various VGIC resources like memory regions, etc.,
> related to
> >         the redistributor happens only once and is fixed at the VM init
> time
> >         and cannot be changed later after initialization has happened.
> >         KVM statically configures these resources based on the number of
> vCPUs
> >         and the number/size of redistributor ranges.
> >      b. Association between vCPU and its VGIC redistributor is fixed at
> the
> >         VM init time within the KVM, i.e., when redistributor iodevs gets
> >         registered. VGIC does not allow to setup/change this association
> >         after VM initialization has happened. Physically, every CPU/GICC
> is
> >         uniquely connected with its redistributor, and there is no
> >         architectural way to set this up.
> >   2. KVM vCPUs:
> >      a. Lack of specification means destruction of KVM vCPUs does not
> exist as
> >         there is no reference to tell what to do with other per-vCPU
> >         components like redistributors, arch timer, etc.
> >      b. In fact, KVM does not implement the destruction of vCPUs for any
> >         architecture. This is independent of whether the architecture
> >         actually supports CPU Hotplug feature. For example, even for x86
> KVM
> >         does not implement the destruction of vCPUs.
> >
> > D. Impediments in Qemu to Support Virtual CPU Hotplug (KVM
> Constraints->Arch)
> >
> >   1. Qemu CPU Objects MUST be created to initialize all the Host KVM
> vCPUs to
> >      overcome the KVM constraint. KVM vCPUs are created and initialized
> when Qemu
> >      CPU Objects are realized. But keeping the QOM CPU objects realized
> for
> >      'yet-to-be-plugged' vCPUs can create problems when these new vCPUs
> shall
> >      be plugged using device_add and a new QOM CPU object shall be
> created.
> >   2. GICV3State and GICV3CPUState objects MUST be sized over *possible
> vCPUs*
> >      during VM init time while QOM GICV3 Object is realized. This is
> because
> >      KVM VGIC can only be initialized once during init time. But every
> >      GICV3CPUState has an associated QOM CPU Object. Later might
> correspond to
> >      vCPU which are 'yet-to-be-plugged' (unplugged at init).
> >   3. How should new QOM CPU objects be connected back to the
> GICV3CPUState
> >      objects and disconnected from it in case the CPU is being
> hot(un)plugged?
> >   4. How should 'unplugged' or 'yet-to-be-plugged' vCPUs be represented
> in the
> >      QOM for which KVM vCPU already exists? For example, whether to keep,
> >       a. No QOM CPU objects Or
> >       b. Unrealized CPU Objects
> >   5. How should vCPU state be exposed via ACPI to the Guest? Especially
> for
> >      the unplugged/yet-to-be-plugged vCPUs whose CPU objects might not
> exist
> >      within the QOM but the Guest always expects all possible vCPUs to be
> >      identified as ACPI *present* during boot.
> >   6. How should Qemu expose GIC CPU interfaces for the unplugged or
> >      yet-to-be-plugged vCPUs using ACPI MADT Table to the Guest?
> >
> > E. Summary of Approach ([+] Workarounds to problems in sections A, B, C
> & D)
> >
> >   1. At VM Init, pre-create all the possible vCPUs in the Host KVM i.e.,
> even
> >      for the vCPUs which are yet-to-be-plugged in Qemu but keep them in
> the
> >      powered-off state.
> >   2. After the KVM vCPUs have been initialized in the Host, the KVM vCPU
> >      objects corresponding to the unplugged/yet-to-be-plugged vCPUs are
> parked
> >      at the existing per-VM "kvm_parked_vcpus" list in Qemu. (similar to
> x86)
> >   3. GICV3State and GICV3CPUState objects are sized over possible vCPUs
> during
> >      VM init time i.e., when Qemu GIC is realized. This, in turn, sizes
> KVM VGIC
> >      resources like memory regions, etc., related to the redistributors
> with the
> >      number of possible KVM vCPUs. This never changes after VM has
> initialized.
> >   4. Qemu CPU objects corresponding to unplugged/yet-to-be-plugged vCPUs
> are
> >      released post Host KVM CPU and GIC/VGIC initialization.
> >   5. Build ACPI MADT Table with the following updates:
> >      a. Number of GIC CPU interface entries (=possible vCPUs)
> >      b. Present Boot vCPU as MADT.GICC.Enabled=1 (Not hot[un]pluggable)
> >      c. Present hot(un)pluggable vCPUs as MADT.GICC.online-capable=1
> >         - MADT.GICC.Enabled=0 (Mutually exclusive) [6][7]
> > - vCPU can be ACPI enabled+onlined after Guest boots (Firmware Policy)
> > - Some issues with above (details in later sections)
> >   6. Expose below ACPI Status to Guest kernel:
> >      a. Always _STA.Present=1 (all possible vCPUs)
> >      b. _STA.Enabled=1 (plugged vCPUs)
> >      c. _STA.Enabled=0 (unplugged vCPUs)
> >   7. vCPU hotplug *realizes* new QOM CPU object. The following happens:
> >      a. Realizes, initializes QOM CPU Object & spawns Qemu vCPU thread.
> >      b. Unparks the existing KVM vCPU ("kvm_parked_vcpus" list).
> >         - Attaches to QOM CPU object.
> >      c. Reinitializes KVM vCPU in the Host.
> >         - Resets the core and sys regs, sets defaults, etc.
> >      d. Runs KVM vCPU (created with "start-powered-off").
> > - vCPU thread sleeps (waits for vCPU reset via PSCI).
> >      e. Updates Qemu GIC.
> >         - Wires back IRQs related to this vCPU.
> >         - GICV3CPUState association with QOM CPU Object.
> >      f. Updates [6] ACPI _STA.Enabled=1.
> >      g. Notifies Guest about the new vCPU (via ACPI GED interface).
> > - Guest checks _STA.Enabled=1.
> > - Guest adds processor (registers CPU with LDM) [3].
> >      h. Plugs the QOM CPU object in the slot.
> >         - slot-number = cpu-index {socket, cluster, core, thread}.
> >      i. Guest online's vCPU (CPU_ON PSCI call over HVC/SMC).
> >         - KVM exits HVC/SMC Hypercall [5] to Qemu (Policy Check).
> >         - Qemu powers-on KVM vCPU in the Host.
> >   8. vCPU hot-unplug *unrealizes* QOM CPU Object. The following happens:
> >      a. Notifies Guest (via ACPI GED interface) vCPU hot-unplug event.
> >         - Guest offline's vCPU (CPU_OFF PSCI call over HVC/SMC).
> >      b. KVM exits HVC/SMC Hypercall [5] to Qemu (Policy Check).
> >         - Qemu powers-off the KVM vCPU in the Host.
> >      c. Guest signals *Eject* vCPU to Qemu.
> >      d. Qemu updates [6] ACPI _STA.Enabled=0.
> >      e. Updates GIC.
> >         - Un-wires IRQs related to this vCPU.
> >         - GICV3CPUState association with new QOM CPU Object is updated.
> >      f. Unplugs the vCPU.
> > - Removes from slot.
> >         - Parks KVM vCPU ("kvm_parked_vcpus" list).
> >         - Unrealizes QOM CPU Object & joins back Qemu vCPU thread.
> > - Destroys QOM CPU object.
> >      g. Guest checks ACPI _STA.Enabled=0.
> >         - Removes processor (unregisters CPU with LDM) [3].
> >
> > F. Work Presented at KVM Forum Conferences:
> > ==========================================
> >
> > Details of the above work have been presented at KVMForum2020 and
> KVMForum2023
> > conferences. Slides & video are available at the links below:
> > a. KVMForum 2023
> >   - Challenges Revisited in Supporting Virt CPU Hotplug on architectures
> that don't Support CPU Hotplug (like ARM64).
> >     https://kvm-forum.qemu.org/2023/KVM-forum-cpu-hotplug_7OJ1YyJ.pdf
> >
> https://kvm-forum.qemu.org/2023/Challenges_Revisited_in_Supporting_Virt_CPU_Hotplug_-__ii0iNb3.pdf
> >     https://www.youtube.com/watch?v=hyrw4j2D6I0&t=23970s
> >     https://kvm-forum.qemu.org/2023/talk/9SMPDQ/
> > b. KVMForum 2020
> >   - Challenges in Supporting Virtual CPU Hotplug on SoC Based Systems
> (like ARM64) - Salil Mehta, Huawei.
> >     https://kvmforum2020.sched.com/event/eE4m
> >
> > (VI) Commands Used
> > ==================
> >
> > A. Qemu launch commands to init the machine:
> >
> > $ qemu-system-aarch64 --enable-kvm -machine virt,gic-version=3 \
> > -cpu host -smp cpus=4,maxcpus=6 \
> > -m 300M \
> > -kernel Image \
> > -initrd rootfs.cpio.gz \
> > -append "console=ttyAMA0 root=/dev/ram rdinit=/init maxcpus=2
> acpi=force" \
> > -nographic \
> > -bios QEMU_EFI.fd \
> >
> > B. Hot-(un)plug related commands:
> >
> > # Hotplug a host vCPU (accel=kvm):
> > $ device_add host-arm-cpu,id=core4,core-id=4
> >
> > # Hotplug a vCPU (accel=tcg):
> > $ device_add cortex-a57-arm-cpu,id=core4,core-id=4
> >
> > # Delete the vCPU:
> > $ device_del core4
> >
> > Sample output on guest after boot:
> >
> >    $ cat /sys/devices/system/cpu/possible
> >    0-5
> >    $ cat /sys/devices/system/cpu/present
> >    0-5
> >    $ cat /sys/devices/system/cpu/enabled
> >    0-3
> >    $ cat /sys/devices/system/cpu/online
> >    0-1
> >    $ cat /sys/devices/system/cpu/offline
> >    2-
> >
> > Sample output on guest after hotplug of vCPU=4:
> >
> >    $ cat /sys/devices/system/cpu/possible
> >    0-5
> >    $ cat /sys/devices/system/cpu/present
> >    0-5
> >    $ cat /sys/devices/system/cpu/enabled
> >    0-4
> >    $ cat /sys/devices/system/cpu/online
> >    0-1,4
> >    $ cat /sys/devices/system/cpu/offline
> >    2-3,5
> >
> >    Note: vCPU=4 was explicitly 'onlined' after hot-plug
> >    $ echo 1 > /sys/devices/system/cpu/cpu4/online
> >
> > (VII) Latest Repository
> > =======================
> >
> > (*) Latest Qemu RFC V5 (Architecture Specific) patch set:
> >    https://github.com/salil-mehta/qemu.git virt-cpuhp-armv8/rfc-v5
> > (*) Latest Architecture Agnostic ACPI changes patch-set:
> >
> https://lore.kernel.org/qemu-devel/20241014192205.253479-1-salil.me...@huawei.com/#t
> > (*) Older QEMU changes for vCPU hotplug can be cloned from below site:
> >    https://github.com/salil-mehta/qemu.git
> virt-cpuhp-armv8/rfc-{v1,v2,v3,v4}
> > (*) `Accepted` Qemu Architecture Agnostic patch is present here:
> >
> https://github.com/salil-mehta/qemu/commits/virt-cpuhp-armv8/rfc-v3.arch.agnostic.v16/
> > (*) All Kernel changes are already part of mainline v6.11
> > (*) Original Guest Kernel changes (by James Morse, ARM) are available
> here:
> >    https://git.kernel.org/pub/scm/linux/kernel/git/morse/linux.git
> virtual_cpu_hotplug/rfc/v2
> >
> >
> > (VIII) KNOWN ISSUES
> > ===================
> >
> > 1. HVF and qtest are not supported yet.
> > 2. ACPI MADT Table flags [7] MADT.GICC.Enabled and
> MADT.GICC.online-capable are
> >   mutually exclusive, i.e., as per the change [6], a vCPU cannot be both
> >   GICC.Enabled and GICC.online-capable. This means:
> >      [ Link: https://bugzilla.tianocore.org/show_bug.cgi?id=3706 ]
> >   a. If we have to support hot-unplug of the cold-booted vCPUs, then
> these MUST
> >      be specified as GICC.online-capable in the MADT Table during boot
> by the
> >      firmware/Qemu. But this requirement conflicts with the requirement
> to
> >      support new Qemu changes with legacy OS that don't understand
> >      MADT.GICC.online-capable Bit. Legacy OS during boot time will
> ignore this
> >      bit, and hence these vCPUs will not appear on such OS. This is
> unexpected
> >      behavior.
> >   b. In case we decide to specify vCPUs as MADT.GICC.Enabled and try to
> unplug
> >      these cold-booted vCPUs from OS (which in actuality should be
> blocked by
> >      returning error at Qemu), then features like 'kexec' will break.
> >   c. As I understand, removal of the cold-booted vCPUs is a required
> feature
> >      and x86 world allows it.
> >   d. Hence, either we need a specification change to make the
> MADT.GICC.Enabled
> >      and MADT.GICC.online-capable Bits NOT mutually exclusive or NOT
> support
> >      the removal of cold-booted vCPUs. In the latter case, a check can
> be introduced
> >      to bar the users from unplugging vCPUs, which were cold-booted,
> using QMP
> >      commands. (Needs discussion!)
> >      Please check the patch part of this patch set:
> >      [hw/arm/virt: Expose cold-booted CPUs as MADT GICC Enabled].
> >
> >      NOTE: This is definitely not a blocker!
> >
> >
> > (IX) THINGS TO DO
> > =================
> >
> > 1. TCG is now in working state but would need extensive testing to roll
> out
> >   any corner cases. Any help related to this will be appreciated!
> > 2. Comprehensive Testing is in progress. (Positive feedback from Oracle
> & Ampere)
> > 3. Qemu Documentation (.rst) needs to be updated.
> > 4. The `is_enabled` and `is_present` ACPI states are now common to all
> architectures
> >   and should work seemlessely but needs thorough testing with other
> architectures.
> > 5. Migration has been lightly tested but has been found working.
> > 6. A missing check for PMU state for the hotplugged vCPUs (Reported by:
> Gavin Shan)
> >
> https://lore.kernel.org/qemu-devel/28f3107f-0267-4112-b0ca-da59df296...@redhat.com/
> > 7. Need help in Testing with ARM hardware extensions like SVE/SME
> >
> >
> >
> > Best regards,
> > Salil.
> >
> > (X) DISCLAIMER
> > ==============
> >
> > This work is an attempt to present a proof-of-concept of the ARM64 vCPU
> hotplug
> > implementation to the community. This is *not* production-level code and
> might
> > have bugs. Comprehensive testing is being done on HiSilicon Kunpeng920
> SoC,
> > Oracle, and Ampere servers. We are nearing stable code and a non-RFC
> > version shall be floated soon.
> >
> > This work is *mostly* in the lines of the discussions that have happened
> in the
> > previous years [see refs below] across different channels like the
> mailing list,
> > Linaro Open Discussions platform, and various conferences like KVMForum,
> etc. This
> > RFC is being used as a way to verify the idea mentioned in this cover
> letter and
> > to get community views. Once this has been agreed upon, a formal patch
> shall be
> > posted to the mailing list for review.
> >
> > [The concept being presented has been found to work!]
> >
> > (XI) ORGANIZATION OF PATCHES
> > ============================
> >
> > A. ARM Architecture *specific* patches:
> >
> >   [Patch 1-8, 11, 12, 30] logic required during machine init.
> >    (*) Some validation checks.
> >    (*) Introduces core-id property and some util functions required
> later.
> >    (*) Logic to pre-create vCPUs.
> >    (*) Introduction of the GICv3 CPU Interface `accessibility` interface
> >    (*) GIC initialization pre-sized with possible vCPUs.
> >    (*) Some refactoring to have common hot and cold plug logic together.
> >    (*) Release of disabled QOM CPU objects in post_cpu_init().
> >   [Patch 9-10, 13-15] logic related to ACPI at machine init time.
> >    (*) Changes required to Enable ACPI for CPU hotplug.
> >    (*) Initialization of ACPI GED framework to cater to CPU Hotplug
> Events.
> >    (*) ACPI DSDT, MADT/MAT changes.
> >   [Patch 17-29] logic required during vCPU hot-(un)plug.
> >    (*) Basic framework changes to support vCPU hot-(un)plug.
> >    (*) ACPI GED changes for hot-(un)plug hooks.
> >    (*) Wire-unwire the IRQs.
> >    (*) GIC notification logic on receiving vCPU hot(un)plug event.
> >    (*) ARMCPU unrealize logic.
> >    (*) Handling of SMCC Hypercall Exits by KVM to Qemu.
> >   [Patch 33] Disable unplug of cold-booted vCPUs
> >
> >
> > (XII) REFERENCES
> > ================
> >
> > [1]
> https://lore.kernel.org/qemu-devel/20200613213629.21984-1-salil.me...@huawei.com/
> > [2]
> https://lore.kernel.org/linux-arm-kernel/20200625133757.22332-1-salil.me...@huawei.com/
> > [3]
> https://lore.kernel.org/lkml/20230203135043.409192-1-james.mo...@arm.com/
> > [4]
> https://lore.kernel.org/all/20230913163823.7880-1-james.mo...@arm.com/
> > [5]
> https://lore.kernel.org/all/20230404154050.2270077-1-oliver.up...@linux.dev/
> > [6] https://bugzilla.tianocore.org/show_bug.cgi?id=3706
> > [7]
> https://uefi.org/specs/ACPI/6.5/05_ACPI_Software_Programming_Model.html#gic-cpu-interface-gicc-structure
> > [8] https://bugzilla.tianocore.org/show_bug.cgi?id=4481#c5
> > [9]
> https://cloud.google.com/kubernetes-engine/docs/concepts/verticalpodautoscaler
> > [10]
> https://docs.aws.amazon.com/eks/latest/userguide/vertical-pod-autoscaler.html
> > [11] https://lkml.org/lkml/2019/7/10/235
> > [12]
> https://lists.cs.columbia.edu/pipermail/kvmarm/2018-July/032316.html
> > [13] https://lists.gnu.org/archive/html/qemu-devel/2020-01/msg06517.html
> > [14]
> https://op-lists.linaro.org/archives/list/linaro-open-discussi...@op-lists.linaro.org/thread/7CGL6JTACPUZEYQC34CZ2ZBWJGSR74WE/
> > [15]
> http://lists.nongnu.org/archive/html/qemu-devel/2018-07/msg01168.html
> > [16] https://lists.gnu.org/archive/html/qemu-devel/2020-06/msg00131.html
> > [17]
> https://op-lists.linaro.org/archives/list/linaro-open-discussi...@op-lists.linaro.org/message/X74JS6P2N4AUWHHATJJVVFDI2EMDZJ74/
> > [18]
> https://lore.kernel.org/lkml/20210608154805.216869-1-jean-phili...@linaro.org/
> > [19]
> https://lore.kernel.org/all/20230913163823.7880-1-james.mo...@arm.com/
> > [20]
> https://uefi.org/specs/ACPI/6.5/05_ACPI_Software_Programming_Model.html#gicc-cpu-interface-flags
> > [21]
> https://lore.kernel.org/qemu-devel/20230926100436.28284-1-salil.me...@huawei.com/
> > [22]
> https://lore.kernel.org/qemu-devel/20240607115649.214622-1-salil.me...@huawei.com/T/#md0887eb07976bc76606a8204614ccc7d9a01c1f7
> > [23] RFC V3:
> https://lore.kernel.org/qemu-devel/20240613233639.202896-1-salil.me...@huawei.com/#t
> >
> > (XIII) ACKNOWLEDGEMENTS
> > =======================
> >
> > I would like to thank the following people for various discussions with
> me over different channels during development:
> >
> > Marc Zyngier (Google), Catalin Marinas (ARM), James Morse (ARM), Will
> Deacon (Google),
> > Jean-Philippe Brucker (Linaro), Sudeep Holla (ARM), Lorenzo Pieralisi
> (Linaro),
> > Gavin Shan (RedHat), Jonathan Cameron (Huawei), Darren Hart (Ampere),
> > Igor Mamedov (RedHat), Ilkka Koskinen (Ampere), Andrew Jones (RedHat),
> > Karl Heubaum (Oracle), Keqian Zhu (Huawei), Miguel Luis (Oracle),
> > Xiongfeng Wang (Huawei), Vishnu Pajjuri (Ampere), Shameerali Kolothum
> (Huawei),
> > Russell King (Oracle), Xuwei/Joy (Huawei), Peter Maydel (Linaro),
> > Zengtao/Prime (Huawei), Nicholas Piggin (IBM) and all those whom I have
> missed!
> >
> > Many thanks to the following people for their current or past
> contributions:
> >
> > 1. James Morse (ARM)
> >   (Current Kernel part of vCPU Hotplug Support on AARCH64)
> > 2. Jean-Philippe Brucker (Linaro)
> >   (Prototyped one of the earlier PSCI-based POC [17][18] based on RFC V1)
> > 3. Keqian Zhu (Huawei)
> >   (Co-developed Qemu prototype)
> > 4. Xiongfeng Wang (Huawei)
> >   (Co-developed an earlier kernel prototype with me)
> > 5. Vishnu Pajjuri (Ampere)
> >   (Verification on Ampere ARM64 Platforms + fixes)
> > 6. Miguel Luis (Oracle)
> >   (Verification on Oracle ARM64 Platforms + fixes)
> > 7. Russell King (Oracle) & Jonathan Cameron (Huawei)
> >   (Helping in upstreaming James Morse's Kernel patches).
> >
> > (XIV) Change Log:
> > =================
> >
> > RFC V4 -> RFC V5:
> > -----------------
> > 1. Dropped "[PATCH RFC V4 19/33] target/arm: Force ARM vCPU *present*
> status ACPI *persistent*"
> >   - Seperated the architecture agnostic ACPI changes required to support
> vCPU Hotplug
> >     Link:
> https://lore.kernel.org/qemu-devel/20241014192205.253479-1-salil.me...@huawei.com/#t
> > 2. Dropped "[PATCH RFC V4 02/33] cpu-common: Add common CPU utility for
> possible vCPUs"
> >   - Dropped qemu{present,enabled}_cpu() APIs. Commented by Gavin
> (Redhat), Miguel(Oracle), Igor(Redhat)
> > 3. Added "Reviewed-by: Miguel Luis <miguel.l...@oracle.com>" to [PATCH
> RFC V4 01/33]
> > 3. Dropped the `CPUState::disabled` flag and introduced
> `GICv3State::num_smp_cpus` flag
> >   - All `GICv3CPUState' between [num_smp_cpus,num_cpus) are marked as
> 'inaccessible` during gicv3_common_realize()
> >   - qemu_enabled_cpu() not required - removed!
> >   - removed usage of `CPUState::disabled` from virt.c and hw/cpu64.c
> > 4. Removed virt_cpu_properties() and introduced property `mp-affinity`
> get accessor
> > 5. Dropped "[PATCH RFC V4 12/33] arm/virt: Create GED device before
> *disabled* vCPU Objects are destroyed"
> >
>
> I've tested this series on the following configurations for both KVM and
> TCG:
>
> -M virt -accel kvm -cpu host
> -M virt,gic_version=3 -accel kvm -cpu host
> -M virt,gic_version=3 -accel tcg -cpu max
>
> They boot, QEMU is able to hotplug and unplug vCPUs and successive live
> migrations work as expected.
>
> For migrations where the destination VM exceeds the number of vCPUs
> enabled on
> the source VM (patch 26/30) QEMU shows the expected warnings, also for
> both KVM
> and TCG:
>
> qemu-system-aarch64: warning: Found CPU 2 enabled, for incoming *disabled*
> GICC State
> qemu-system-aarch64: warning: *Disabling* CPU 2, to match the incoming
> migrated state
>


Thanks for confirming this.



>
> The following configuration of:
>
> -M virt -accel tcg -cpu max
>


Yep, perhaps that's something I missed while testing. Thanks for
identifying.


>
> shows defects in which I had to use the below diff to proceed to boot:
>
> diff --git a/hw/intc/arm_gic_common.c b/hw/intc/arm_gic_common.c
> index 53fb2c4e2d..f5ad33093e 100644
> --- a/hw/intc/arm_gic_common.c
> +++ b/hw/intc/arm_gic_common.c
> @@ -350,6 +350,7 @@ static void arm_gic_common_linux_init(ARMLinuxBootIf
> *obj,
>
>  static Property arm_gic_common_properties[] = {
>      DEFINE_PROP_UINT32("num-cpu", GICState, num_cpu, 1),
> +    DEFINE_PROP_UINT32("num-smp-cpu", GICState, num_smp_cpu, 1),
>      DEFINE_PROP_UINT32("num-irq", GICState, num_irq, 32),
>      /* Revision can be 1 or 2 for GIC architecture specification
>       * versions 1 or 2, or 0 to indicate the legacy 11MPCore GIC.
> diff --git a/include/hw/intc/arm_gic_common.h
> b/include/hw/intc/arm_gic_common.h
> index 97fea4102d..a57f20798a 100644
> --- a/include/hw/intc/arm_gic_common.h
> +++ b/include/hw/intc/arm_gic_common.h
> @@ -130,6 +130,8 @@ struct GICState {
>
>      uint32_t num_cpu;
>
> +    uint32_t num_smp_cpu;
> +
>      MemoryRegion iomem; /* Distributor */
>      /* This is just so we can have an opaque pointer which identifies
>       * both this GIC and which CPU interface we should be accessing.
>
> And lastly, one more issue I’ve noticed with this configuration is adding
> a new vcpu
> with -device also needs fixing as is breaking the assert:
>
> `cpu_id >= 0 && cpu_id < ms->possible_cpus->len`
>
> in virt_get_cpu_id_from_cpu_topo.
>


sure. Thanks.



>
>
> Thank you
> Miguel
>
>
> > RFC V3 -> RFC V4:
> > -----------------
> > 1. Addressed Nicholas Piggin's (IBM) comments
> >   - Moved qemu_get_cpu_archid() as a ACPI helper inline acpi/cpu.h
> >
> https://lore.kernel.org/qemu-devel/d2gfclh11hgj.1ijganhq9z...@gmail.com/
> >   - Introduced new macro CPU_FOREACH_POSSIBLE() in [PATCH 12/33]
> >
> https://lore.kernel.org/qemu-devel/d2gf9a9ajo02.1g1g8uexa5...@gmail.com/
> >   - Converted CPUState::acpi_persistent into Property. Improved the
> cover note
> >
> https://lore.kernel.org/qemu-devel/d2h62rk48kt7.2btqezuoeg...@gmail.com/
> >   - Fixed teh cover note of the[PATCH ] and clearly mentioned about
> KVMParking
> >
> https://lore.kernel.org/qemu-devel/d2gfogqc3hyo.2lkov306ji...@gmail.com/
> > 2. Addressed Gavin Shan's (RedHat) comments:
> >   - Introduced the ARM Extensions check. [Looks like I missed the PMU
> check :( ]
> >
> https://lore.kernel.org/qemu-devel/28f3107f-0267-4112-b0ca-da59df296...@redhat.com/
> >   - Moved create_gpio() along with create_ged()
> >
> https://lore.kernel.org/qemu-devel/143ad7d2-8f45-4428-bed3-891203a49...@redhat.com/
> >   - Improved the logic of the GIC creation and initialization
> >
> https://lore.kernel.org/qemu-devel/9b7582f0-8149-4bf0-a1aa-4d4fe0d35...@redhat.com/
> >   - Removed redundant !dev->realized checks in cpu_hotunplug(_request)
> >
> https://lore.kernel.org/qemu-devel/64e9feaa-8df2-4108-9e73-c72517fb0...@redhat.com/
> > 3. Addresses Alex Bennée's + Gustavo Romero (Linaro) comments
> >   - Fixed the TCG support and now it works for all the cases including
> migration.
> >     https://lore.kernel.org/qemu-devel/87bk1b3azm....@draig.linaro.org/
> >   - Fixed the cpu_address_space_destroy() compilation failuer in
> user-mode
> >     https://lore.kernel.org/qemu-devel/87v800wkb1....@draig.linaro.org/
> > 4. Fixed crash in .post_gicv3() during migration with asymmetrically
> *enabled*
> >     vCPUs at destination VM
> >
> > RFC V2 -> RFC V3:
> > -----------------
> > 1. Miscellaneous:
> >   - Split the RFC V2 into arch-agnostic and arch-specific patch sets.
> > 2. Addressed Gavin Shan's (RedHat) comments:
> >   - Made CPU property accessors inline.
> >
> https://lore.kernel.org/qemu-devel/6cd28639-2cfa-f233-c6d9-d5d2ec5b1...@redhat.com/
> >   - Collected Reviewed-bys [PATCH RFC V2 4/37, 14/37, 22/37].
> >   - Dropped the patch as it was not required after init logic was
> refactored.
> >
> https://lore.kernel.org/qemu-devel/4fb2eef9-6742-1eeb-721a-b3db04b1b...@redhat.com/
> >   - Fixed the range check for the core during vCPU Plug.
> >
> https://lore.kernel.org/qemu-devel/1c5fa24c-6bf3-750f-4f22-087e4a931...@redhat.com/
> >   - Added has_hotpluggable_vcpus check to make build_cpus_aml()
> conditional.
> >
> https://lore.kernel.org/qemu-devel/832342cb-74bc-58dd-c5d7-6f995baeb...@redhat.com/
> >   - Fixed the states initialization in cpu_hotplug_hw_init() to
> accommodate previous refactoring.
> >
> https://lore.kernel.org/qemu-devel/da5e5609-1883-8650-c7d8-6868c7b74...@redhat.com/
> >   - Fixed typos.
> >
> https://lore.kernel.org/qemu-devel/eb1ac571-7844-55e6-15e7-3dd7df213...@redhat.com/
> >   - Removed the unnecessary 'goto fail'.
> >
> https://lore.kernel.org/qemu-devel/4d8980ac-f402-60d4-fe52-787815af8...@redhat.com/#t
> >   - Added check for hotpluggable vCPUs in the _OSC method.
> >
> https://lore.kernel.org/qemu-devel/20231017001326.FUBqQ1PTowF2GxQpnL3kIW0AhmSqbspazwixAHVSi6c@z/
> > 3. Addressed Shaoqin Huang's (Intel) comments:
> >   - Fixed the compilation break due to the absence of a call to
> virt_cpu_properties() missing
> >     along with its definition.
> >
> https://lore.kernel.org/qemu-devel/3632ee24-47f7-ae68-8790-26eb2cf99...@redhat.com/
> > 4. Addressed Jonathan Cameron's (Huawei) comments:
> >   - Gated the 'disabled vcpu message' for GIC version < 3.
> >
> https://lore.kernel.org/qemu-devel/20240116155911.00004...@huawei.com/
> >
> > RFC V1 -> RFC V2:
> > -----------------
> > 1. Addressed James Morse's (ARM) requirement as per Linaro Open
> Discussion:
> >   - Exposed all possible vCPUs as always ACPI _STA.present and available
> during boot time.
> >   - Added the _OSC handling as required by James's patches.
> >   - Introduction of 'online-capable' bit handling in the flag of MADT
> GICC.
> >   - SMCC Hypercall Exit handling in Qemu.
> > 2. Addressed Marc Zyngier's comment:
> >   - Fixed the note about GIC CPU Interface in the cover letter.
> > 3. Addressed issues raised by Vishnu Pajjuru (Ampere) & Miguel Luis
> (Oracle) during testing:
> >   - Live/Pseudo Migration crashes.
> > 4. Others:
> >   - Introduced the concept of persistent vCPU at QOM.
> >   - Introduced wrapper APIs of present, possible, and persistent.
> >   - Change at ACPI hotplug H/W init leg accommodating initializing
> is_present and is_enabled states.
> >   - Check to avoid unplugging cold-booted vCPUs.
> >   - Disabled hotplugging with TCG/HVF/QTEST.
> >   - Introduced CPU Topology, {socket, cluster, core, thread}-id property.
> >   - Extract virt CPU properties as a common virt_vcpu_properties()
> function.
> >
> > Author Salil Mehta (1):
> >  target/arm/kvm,tcg: Handle SMCCC hypercall exits in VMM during
> >    PSCI_CPU_{ON,OFF}
> >
> > Jean-Philippe Brucker (2):
> >  hw/acpi: Make _MAT method optional
> >  target/arm/kvm: Write vCPU's state back to KVM on cold-reset
> >
> > Miguel Luis (1):
> >  tcg/mttcg: Introduce MTTCG thread unregistration leg
> >
> > Salil Mehta (26):
> >  arm/virt,target/arm: Add new ARMCPU {socket,cluster,core,thread}-id
> >    property
> >  hw/arm/virt: Disable vCPU hotplug for *unsupported* Accel or GIC Type
> >  hw/arm/virt: Move setting of common vCPU properties in a function
> >  arm/virt,target/arm: Machine init time change common to vCPU
> >    {cold|hot}-plug
> >  arm/virt,kvm: Pre-create KVM vCPUs for all unplugged QOM vCPUs
> >    @machine init
> >  arm/virt,gicv3: Changes to pre-size GIC with possible vCPUs @machine
> >    init
> >  arm/virt,gicv3: Introduce GICv3 CPU Interface *accessibility* flag and
> >    checks
> >  hw/intc/arm-gicv3*: Changes required to (re)init the GICv3 vCPU
> >    Interface
> >  arm/acpi: Enable ACPI support for vCPU hotplug
> >  arm/virt: Enhance GED framework to handle vCPU hotplug events
> >  arm/virt: Init PMU at host for all possible vCPUs
> >  arm/virt: Release objects for *disabled* possible vCPUs after init
> >  arm/virt/acpi: Update ACPI DSDT Tbl to include CPUs AML with hotplug
> >    support
> >  hw/arm/acpi: MADT Tbl change to size the guest with possible vCPUs
> >  target/arm: Force ARM vCPU *present* status ACPI *persistent*
> >  arm/virt: Add/update basic hot-(un)plug framework
> >  arm/virt: Changes to (un)wire GICC<->vCPU IRQs during hot-(un)plug
> >  hw/arm,gicv3: Changes to notify GICv3 CPU state with vCPU hot-(un)plug
> >    event
> >  hw/arm: Changes required for reset and to support next boot
> >  arm/virt: Update the guest(via GED) about vCPU hot-(un)plug events
> >  target/arm/cpu: Check if hotplugged ARM vCPU's FEAT match existing
> >  tcg: Update tcg_register_thread() leg to handle region alloc for
> >    hotplugged vCPU
> >  target/arm: Add support to *unrealize* ARMCPU during vCPU Hot-unplug
> >  hw/intc/arm_gicv3_common: Add GICv3CPUState 'accessible' flag
> >    migration handling
> >  hw/intc/arm_gicv3_kvm: Pause all vCPU to ensure locking in KVM of
> >    resetting vCPU
> >  hw/arm/virt: Expose cold-booted vCPUs as MADT GICC *Enabled*
> >
> > accel/tcg/tcg-accel-ops-mttcg.c    |   3 +-
> > accel/tcg/tcg-accel-ops-rr.c       |   2 +-
> > cpu-common.c                       |  11 +
> > hw/acpi/cpu.c                      |   9 +-
> > hw/arm/Kconfig                     |   1 +
> > hw/arm/boot.c                      |   2 +-
> > hw/arm/virt-acpi-build.c           |  69 ++-
> > hw/arm/virt.c                      | 840 +++++++++++++++++++++++------
> > hw/core/gpio.c                     |   2 +-
> > hw/intc/arm_gicv3.c                |   1 +
> > hw/intc/arm_gicv3_common.c         |  99 +++-
> > hw/intc/arm_gicv3_cpuif.c          | 253 ++++-----
> > hw/intc/arm_gicv3_cpuif_common.c   |  13 +
> > hw/intc/arm_gicv3_kvm.c            |  40 +-
> > hw/intc/gicv3_internal.h           |   1 +
> > include/hw/acpi/cpu.h              |  19 +
> > include/hw/arm/boot.h              |   2 +
> > include/hw/arm/virt.h              |  64 ++-
> > include/hw/core/cpu.h              |  20 +
> > include/hw/intc/arm_gicv3_common.h |  61 +++
> > include/hw/qdev-core.h             |   2 +
> > include/tcg/startup.h              |  13 +
> > include/tcg/tcg.h                  |   1 +
> > system/physmem.c                   |   8 +-
> > target/arm/arm-powerctl.c          |  20 +-
> > target/arm/cpu-qom.h               |  18 +-
> > target/arm/cpu.c                   | 178 +++++-
> > target/arm/cpu.h                   |  18 +
> > target/arm/cpu64.c                 |  18 +
> > target/arm/gdbstub.c               |   6 +
> > target/arm/helper.c                |  27 +-
> > target/arm/internals.h             |  14 +-
> > target/arm/kvm.c                   | 146 ++++-
> > target/arm/kvm_arm.h               |  24 +
> > target/arm/meson.build             |   1 +
> > target/arm/{tcg => }/psci.c        |   8 +
> > target/arm/tcg/meson.build         |   4 -
> > tcg/region.c                       |  14 +
> > tcg/tcg.c                          |  46 +-
> > 39 files changed, 1714 insertions(+), 364 deletions(-)
> > rename target/arm/{tcg => }/psci.c (97%)
> >
> > --
> > 2.34.1
> >
>
>

Reply via email to