Hi Gavin,

> From: Gavin Shan <gs...@redhat.com>
> Sent: Wednesday, September 27, 2023 7:29 AM
> To: Salil Mehta <salil.me...@huawei.com>; qemu-devel@nongnu.org;
> qemu-a...@nongnu.org
> Cc: m...@kernel.org; jean-phili...@linaro.org; Jonathan Cameron
> <jonathan.came...@huawei.com>; lpieral...@kernel.org;
> peter.mayd...@linaro.org; richard.hender...@linaro.org;
> imamm...@redhat.com; andrew.jo...@linux.dev; da...@redhat.com;
> phi...@linaro.org; eric.au...@redhat.com; w...@kernel.org; a...@kernel.org;
> oliver.up...@linux.dev; pbonz...@redhat.com; m...@redhat.com;
> raf...@kernel.org; borntrae...@linux.ibm.com; alex.ben...@linaro.org;
> li...@armlinux.org.uk; dar...@os.amperecomputing.com;
> il...@os.amperecomputing.com; vis...@os.amperecomputing.com;
> karl.heub...@oracle.com; miguel.l...@oracle.com; salil.me...@opnsrc.net;
> zhukeqian <zhukeqi...@huawei.com>; wangxiongfeng (C)
> <wangxiongfe...@huawei.com>; wangyanan (Y) <wangyana...@huawei.com>;
> jiakern...@gmail.com; maob...@loongson.cn; lixiang...@loongson.cn
> Subject: Re: [PATCH RFC V2 04/37] arm/virt,target/arm: Machine init time
> change common to vCPU {cold|hot}-plug
>
> Hi Salil,
>
> On 9/26/23 20:04, Salil Mehta wrote:
> > Refactor and introduce the common logic required during the initialization of
> > both cold and hot plugged vCPUs. Also initialize the *disabled* state of the
> > vCPUs which shall be used further during init phases of various other components
> > like GIC, PMU, ACPI etc as part of the virt machine initialization.
> >
> > KVM vCPUs corresponding to unplugged/yet-to-be-plugged QOM CPUs are kept in
> > powered-off state in the KVM Host and do not run the guest code. Plugged vCPUs
> > are also kept in powered-off state but vCPU threads exist and is kept sleeping.
> >
> > TBD:
> > For the cold booted vCPUs, this change also exists in the arm_load_kernel()
> > in boot.c but for the hotplugged CPUs this change should still remain part of
> > the pre-plug phase.
> > We are duplicating the powering-off of the cold booted CPUs.
> > Shall we remove the duplicate change from boot.c?
> >
> > Co-developed-by: Salil Mehta <salil.me...@huawei.com>
> > Signed-off-by: Salil Mehta <salil.me...@huawei.com>
> > Co-developed-by: Keqian Zhu <zhukeqi...@huawei.com>
> > Signed-off-by: Keqian Zhu <zhukeqi...@huawei.com>
> > Reported-by: Gavin Shan <gavin.s...@redhat.com>
> > [GS: pointed the assertion due to wrong range check]
> > Signed-off-by: Salil Mehta <salil.me...@huawei.com>
> > ---
> >  hw/arm/virt.c      | 149 ++++++++++++++++++++++++++++++++++++++++-----
> >  target/arm/cpu.c   |   7 +++
> >  target/arm/cpu64.c |  14 +++++
> >  3 files changed, 156 insertions(+), 14 deletions(-)
> >
> > diff --git a/hw/arm/virt.c b/hw/arm/virt.c
> > index 0eb6bf5a18..3668ad27ec 100644
> > --- a/hw/arm/virt.c
> > +++ b/hw/arm/virt.c
> > @@ -221,6 +221,7 @@ static const char *valid_cpus[] = {
> >      ARM_CPU_TYPE_NAME("max"),
> >  };
> >
> > +static CPUArchId *virt_find_cpu_slot(MachineState *ms, int vcpuid);
> >  static int virt_get_socket_id(const MachineState *ms, int cpu_index);
> >  static int virt_get_cluster_id(const MachineState *ms, int cpu_index);
> >  static int virt_get_core_id(const MachineState *ms, int cpu_index);
> > @@ -2154,6 +2155,14 @@ static void machvirt_init(MachineState *machine)
> >          exit(1);
> >      }
> >
> > +    finalize_gic_version(vms);
> > +    if (tcg_enabled() || hvf_enabled() || qtest_enabled() ||
> > +        (vms->gic_version < VIRT_GIC_VERSION_3)) {
> > +        machine->smp.max_cpus = smp_cpus;
> > +        mc->has_hotpluggable_cpus = false;
> > +        warn_report("cpu hotplug feature has been disabled");
> > +    }
> > +
>
> Comments needed here to explain why @mc->has_hotpluggable_cpus is set to false.
> I guess it's something related to TODO list, mentioned in the cover letter.

I can add a comment explaining the checks, i.e. why the feature has been
disabled. BTW, isn't the code self-explanatory here?

[...]

> > +static CPUArchId *virt_find_cpu_slot(MachineState *ms, int vcpuid)
> > +{
> > +    VirtMachineState *vms = VIRT_MACHINE(ms);
> > +    CPUArchId *found_cpu;
> > +    uint64_t mp_affinity;
> > +
> > +    assert(vcpuid >= 0 && vcpuid < ms->possible_cpus->len);
> > +
> > +    /*
> > +     * RFC: Question:
> > +     * TBD: Should mp-affinity be treated as MPIDR?
> > +     */
> > +    mp_affinity = virt_cpu_mp_affinity(vms, vcpuid);
> > +    found_cpu = &ms->possible_cpus->cpus[vcpuid];
> > +
> > +    assert(found_cpu->arch_id == mp_affinity);
> > +
> > +    /*
> > +     * RFC: Question:
> > +     * Slot-id is the index where vCPU with certain arch-id(=mpidr/ap-affinity)
> > +     * is plugged. For Host KVM, MPIDR for vCPU is derived using vcpu-id.
> > +     * As I understand, MPIDR and vcpu-id are property of vCPU but slot-id is
> > +     * more related to machine? Current code assumes slot-id and vcpu-id are
> > +     * same i.e. meaning of slot is bit vague.
> > +     *
> > +     * Q1: Is there any requirement to clearly represent slot and dissociate it
> > +     *     from vcpu-id?
> > +     * Q2: Should we make MPIDR within host KVM user configurable?
> > +     *
> > +     *          +----+----+----+----+----+----+----+----+
> > +     * MPIDR    |||  Res  |   Aff2  |   Aff1  |  Aff0   |
> > +     *          +----+----+----+----+----+----+----+----+
> > +     *                     \         \         \   |    |
> > +     *                      \   8bit  \   8bit  \  |4bit|
> > +     *                       \<------->\<------->\ |<-->|
> > +     *                        \         \         \|    |
> > +     *          +----+----+----+----+----+----+----+----+
> > +     * VCPU-ID  |  Byte4  |  Byte2  |  Byte1  |  Byte0  |
> > +     *          +----+----+----+----+----+----+----+----+
> > +     */
> > +
> > +    return found_cpu;
> > +}
> > +
>
> MPIDR[31] is set to 0b1, looking at
> linux/arch/arm64/kvm/sys_regs.c::reset_mpidr().
>
> I think this function can be renamed to virt_get_cpu_slot(ms, index), better to
> reflect its intention. I had same concerns why cs->cpu_index can't be
> reused as MPIDR, but it's out of scope for this series.
> It maybe something to be improved afterwards.

Yes, right now it is a linear mapping, but this might change. I would suggest
keeping it like this, with a comment, so that it can be addressed in the future.

User configurability of the MPIDR is not in the scope of this patch. Agreed.

[...]

> > +static void virt_cpu_pre_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
> > +                              Error **errp)
> > +{
> > +    VirtMachineState *vms = VIRT_MACHINE(hotplug_dev);
> > +    MachineState *ms = MACHINE(hotplug_dev);
> > +    ARMCPU *cpu = ARM_CPU(dev);
> > +    CPUState *cs = CPU(dev);
> > +    CPUArchId *cpu_slot;
> > +    int32_t min_cpuid = 0;
> > +    int32_t max_cpuid;
> > +
> > +    /* sanity check the cpu */
> > +    if (!object_dynamic_cast(OBJECT(cpu), ms->cpu_type)) {
> > +        error_setg(errp, "Invalid CPU type, expected cpu type: '%s'",
> > +                   ms->cpu_type);
> > +        return;
> > +    }
> > +
> > +    if ((cpu->thread_id < 0) || (cpu->thread_id >= ms->smp.threads)) {
> > +        error_setg(errp, "Invalid thread-id %u specified, correct range 0:%u",
> > +                   cpu->thread_id, ms->smp.threads - 1);
> > +        return;
> > +    }
> > +
> > +    max_cpuid = ms->possible_cpus->len - 1;
> > +    if (!dev->hotplugged) {
> > +        min_cpuid = vms->acpi_dev ? ms->smp.cpus : 0;
> > +        max_cpuid = vms->acpi_dev ? max_cpuid : ms->smp.cpus - 1;
> > +    }
> > +
>
> I don't understand how the range is figured out. cpu->core_id should
> be in range [0, ms->smp.cores).
> With your code, the following scenario
> becomes invalid incorrectly?
>
> -cpu host -smp maxcpus=4,cpus=1,sockets=4,clusters=1,cores=1,threads=1

Gosh. I am not sure what I was thinking when I added this. Whatever your
circumstances may be, never drink and code. Deadly combination! (Repeat offender)

Will correct this.

Thanks
Salil.

[...]
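
To illustrate the per-level check Gavin describes above (each topology ID validated against its own -smp count, instead of against one min/max vCPU index), a minimal sketch in C could look like the following. The `SmpTopo` and `CpuIds` types and both helper functions are hypothetical stand-ins made up for this example; they are not QEMU APIs, and the actual fix in a later revision of the series may well differ.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for the -smp topology counts (each assumed >= 1). */
typedef struct {
    unsigned sockets, clusters, cores, threads;
} SmpTopo;

/* Hypothetical stand-in for the topology IDs carried by one CPU object. */
typedef struct {
    int socket_id, cluster_id, core_id, thread_id;
} CpuIds;

/* Return true when 0 <= id < count; otherwise report the valid range. */
static bool id_in_range(const char *name, int id, unsigned count)
{
    if (id < 0 || (unsigned)id >= count) {
        fprintf(stderr, "Invalid %s %d specified, correct range 0:%u\n",
                name, id, count - 1);
        return false;
    }
    return true;
}

/* Validate every topology level independently against its own count. */
static bool topo_ids_valid(const SmpTopo *smp, const CpuIds *ids)
{
    assert(smp != NULL && ids != NULL);

    return id_in_range("socket-id", ids->socket_id, smp->sockets) &&
           id_in_range("cluster-id", ids->cluster_id, smp->clusters) &&
           id_in_range("core-id", ids->core_id, smp->cores) &&
           id_in_range("thread-id", ids->thread_id, smp->threads);
}
```

With the topology from Gavin's example (sockets=4, clusters=1, cores=1, threads=1), a CPU with socket-id 3 and all other IDs 0 passes this check, while a CPU with core-id 1 is rejected.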
> > +
> > +static void virt_cpu_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
> > +                          Error **errp)
> > +{
> > +    MachineState *ms = MACHINE(hotplug_dev);
> > +    CPUState *cs = CPU(dev);
> > +    CPUArchId *cpu_slot;
> > +
> > +    /* insert the cold/hot-plugged vcpu in the slot */
>        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>
> May be:
>
> /* CPU becomes present */

Not exactly. In this leg, the CPU is being plugged either by a user action or
during init time. Only after the plugging action is complete does the CPU
eventually become present.

> > +    cpu_slot = virt_find_cpu_slot(ms, cs->cpu_index);
> > +    cpu_slot->cpu = OBJECT(dev);
> > +
> > +    cs->disabled = false;
> > +    return;
>        ^^^^^^
>
> not needed.

Agreed.

> May be worthy some comments like below, correlating to what's done in
> aarch64_cpu_initfn():
>
> /* CPU becomes enabled after it's hot added */

I can add a line above the initialization, if that's what you mean?

> > +}
> > +
> >  static void virt_machine_device_pre_plug_cb(HotplugHandler *hotplug_dev,
> >                                              DeviceState *dev, Error **errp)

[...]

> > +static void aarch64_cpu_initfn(Object *obj)
> > +{
> > +    CPUState *cs = CPU(obj);
> > +
> > +    /*
> > +     * we start every ARM64 vcpu as disabled possible vCPU. It needs to be
> > +     * enabled explicitly
> > +     */
> > +    cs->disabled = true;
> > +}
> > +
>
> The comments can be simplified to:
>
> /* The CPU state isn't enabled until it's hot added completely */

There is a reason why I worded the comment that way: for other architectures,
'disabled' would be false by default.

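
The ordering discussed above (every AArch64 vCPU starts out disabled, and only the plug step fills the slot and enables it) can be modelled in a few lines of C. This is only a toy sketch: `ModelCPU`, `ModelSlot` and the three `model_*` functions are hypothetical names invented for the illustration and merely mirror the roles of `aarch64_cpu_initfn()` and `virt_cpu_plug()` in the quoted patch; none of this is QEMU code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, pared-down models; these are not the QEMU structures. */
typedef struct {
    int  cpu_index;
    bool disabled;
} ModelCPU;

typedef struct {
    ModelCPU *cpu;   /* NULL while nothing is plugged into the slot */
} ModelSlot;

/* Generic default: other architectures leave 'disabled' as false. */
static void model_generic_init(ModelCPU *cpu, int index)
{
    cpu->cpu_index = index;
    cpu->disabled = false;
}

/* AArch64 override: start as a disabled "possible" vCPU. */
static void model_aarch64_init(ModelCPU *cpu, int index)
{
    model_generic_init(cpu, index);
    cpu->disabled = true;
}

/* Plug step: place the vCPU in its slot, then mark it enabled. */
static void model_plug(ModelSlot *slots, size_t nslots, ModelCPU *cpu)
{
    assert(cpu->cpu_index >= 0 && (size_t)cpu->cpu_index < nslots);
    slots[cpu->cpu_index].cpu = cpu;
    cpu->disabled = false;
}
```

In this model a vCPU created with `model_aarch64_init()` stays disabled, with its slot empty, until `model_plug()` runs, which is the behaviour the original comment in `aarch64_cpu_initfn()` is trying to call out.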
> >  static void aarch64_cpu_finalizefn(Object *obj)
> >  {
> >  }
> > @@ -751,7 +762,9 @@ static gchar *aarch64_gdb_arch_name(CPUState *cs)
> >  static void aarch64_cpu_class_init(ObjectClass *oc, void *data)
> >  {
> >      CPUClass *cc = CPU_CLASS(oc);
> > +    DeviceClass *dc = DEVICE_CLASS(oc);
> >
> > +    dc->user_creatable = true;
> >      cc->gdb_read_register = aarch64_cpu_gdb_read_register;
> >      cc->gdb_write_register = aarch64_cpu_gdb_write_register;
> >      cc->gdb_num_core_regs = 34;
> > @@ -800,6 +813,7 @@ static const TypeInfo aarch64_cpu_type_info = {
> >      .name = TYPE_AARCH64_CPU,
> >      .parent = TYPE_ARM_CPU,
> >      .instance_size = sizeof(ARMCPU),
> > +    .instance_init = aarch64_cpu_initfn,
> >      .instance_finalize = aarch64_cpu_finalizefn,
> >      .abstract = true,
> >      .class_size = sizeof(AArch64CPUClass),
>
> I'm not sure if 'dc->user_creatable' can be set true here because
> the ARMCPU objects aren't ready for hot added/removed at this point.
> The hacks for GICv3 aren't included so far. I think a separate patch
> may be needed in the last to enable the functionality?

This patch contains common init time changes for CPU {hot,cold} plug.

Thanks
Salil.