On Wed, May 17, 2017 at 07:50:42AM +0200, Cédric Le Goater wrote: > On 05/16/2017 06:10 PM, Greg Kurz wrote: > > On Tue, 16 May 2017 17:18:27 +0200 > > Cédric Le Goater <c...@kaod.org> wrote: > > > >> On 05/16/2017 02:55 PM, Laurent Vivier wrote: > >>> On 16/05/2017 14:50, Cédric Le Goater wrote: > >>>> On 05/16/2017 02:03 PM, Laurent Vivier wrote: > >>>>> On 26/04/2017 09:00, David Gibson wrote: > >>>>>> From: Cédric Le Goater <c...@kaod.org> > >>>>>> > >>>>>> Today, all the ICPs are created before the CPUs, stored in an array > >>>>>> under the sPAPR machine and linked to the CPU when the core threads > >>>>>> are realized. This modeling brings some complexity when a lookup in > >>>>>> the array is required and it can be simplified by allocating the ICPs > >>>>>> when the CPUs are. > >>>>>> > >>>>>> This is the purpose of this proposal which introduces a new 'icp_type' > >>>>>> field under the machine and creates the ICP objects of the right type > >>>>>> (KVM or not) before the PowerPCCPU object are. > >>>>>> > >>>>>> This change allows more cleanups : the removal of the icps array under > >>>>>> the sPAPR machine and the removal of the xics_get_cpu_index_by_dt_id() > >>>>>> helper. > >>>>>> > >>>>>> Signed-off-by: Cédric Le Goater <c...@kaod.org> > >>>>>> Reviewed-by: David Gibson <da...@gibson.dropbear.id.au> > >>>>>> Signed-off-by: David Gibson <da...@gibson.dropbear.id.au> > >>>>>> --- > >>>>>> hw/intc/xics.c | 11 ----------- > >>>>>> hw/ppc/spapr.c | 47 > >>>>>> ++++++++++++++--------------------------------- > >>>>>> hw/ppc/spapr_cpu_core.c | 18 ++++++++++++++---- > >>>>>> include/hw/ppc/spapr.h | 2 +- > >>>>>> include/hw/ppc/xics.h | 2 -- > >>>>>> 5 files changed, 29 insertions(+), 51 deletions(-) > >>>>>> > >>>>> > >>>>> This commit breaks CPU re-hotplugging with KVM > >>>>> > >>>>> the sequence "device_add, device_del, device_add" brings to the > >>>>> following error message: > >>>>> > >>>>> Unable to connect CPUx to kernel XICS: Device or resource busy > >>>>> > >>>>> It comes from icp_kvm_cpu_setup(): > >>>>> > >>>>> ... > >>>>> ret = kvm_vcpu_enable_cap(cs, KVM_CAP_IRQ_XICS, 0, kernel_xics_fd, > >>>>> kvm_arch_vcpu_id(cs)); > >>>>> if (ret < 0) { > >>>>> error_report("Unable to connect CPU%ld to kernel XICS: %s", > >>>>> kvm_arch_vcpu_id(cs), strerror(errno)); > >>>>> exit(1); > >>>>> } > >>>>> .. > >>>>> > >>>>> It should be protected by cap_irq_xics_enabled: > >>>>> > >>>>> ... > >>>>> /* > >>>>> * If we are reusing a parked vCPU fd corresponding to the CPU > >>>>> * which was hot-removed earlier we don't have to renable > >>>>> * KVM_CAP_IRQ_XICS capability again. > >>>>> */ > >>>>> if (icp->cap_irq_xics_enabled) { > >>>>> return; > >>>>> } > >>>>> > >>>>> ... > >>>>> ret = kvm_vcpu_enable_cap(...); > >>>>> ... > >>>>> icp->cap_irq_xics_enabled = true; > >>>>> ... > >>>>> > >>>>> But since this commit, "icp" is a new object on each call: > >>>>> > >>>>> spapr_cpu_core_realize_child() > >>>>> ... > >>>>> obj = object_new(spapr->icp_type); > >>>>> ... > >>>>> xics_cpu_setup(XICS_FABRIC(spapr), cpu, ICP(obj)); > >>>>> ... > >>>>> icpc->cpu_setup(icp, cpu); -> icp_kvm_cpu_setup() > >>>>> ... > >>>>> ... > >>>>> > >>>>> and "cap_irq_xics_enabled" is reinitialized. > >>>>> > >>>>> Any idea how to fix that? > >>>> > >>>> it seems that a cleanup is not done in the kernel. We are missing > >>>> a way to call kvmppc_xics_free_icp() from QEMU. Today the only > >>>> way is to destroy the vcpu. > >>> > >>> The commit introducing this hack, for reference: > >>> > >>> commit a45863bda90daa8ec39e5a312b9734fd4665b016 > >>> Author: Bharata B Rao <bhar...@linux.vnet.ibm.com> > >>> Date: Thu Jul 2 16:23:20 2015 +1000 > >>> > >>> xics_kvm: Don't enable KVM_CAP_IRQ_XICS if already enabled > >>> > >>> When supporting CPU hot removal by parking the vCPU fd and reusing > >>> it during hotplug again, there can be cases where we try to reenable > >>> KVM_CAP_IRQ_XICS CAP for the vCPU for which it was already enabled. > >>> Introduce a boolean member in ICPState to track this and don't > >>> reenable the CAP if it was already enabled earlier. > >>> > >>> Re-enabling this CAP should ideally work, but currently it results in > >>> kernel trying to create and associate ICP with this vCPU and that > >>> fails since there is already an ICP associated with it. Hence this > >>> patch is needed to work around this problem in the kernel. > >>> > >>> This change allows CPU hot removal to work for sPAPR. > >>> > >>> Signed-off-by: Bharata B Rao <bhar...@linux.vnet.ibm.com> > >>> Reviewed-by: David Gibson <da...@gibson.dropbear.id.au> > >>> Signed-off-by: David Gibson <da...@gibson.dropbear.id.au> > >>> Signed-off-by: Alexander Graf <ag...@suse.de> > >> > >> OK. > >> > >> Greg is looking at re-adding the ICPState array because of a > >> migration issue with older machines. We might need to do so > >> unconditionally ... > >> > > > > That would be a pity to carry on with the pre-allocated ICPStates for > > new machine types just because of that... What about keeping track > > of all the cap_irq_xics_enabled flags in a separate max_cpus sized > > static array ? > > Could we use 'cpu->unplug' instead ?
I've only half followed this discussion, but fwiw I prefer the idea of "parking" in-kernel ICP objects, similarly to the way we do for removed VCPUs, rather than going back to keeping ICP objects around indefinitely and unconditionally. -- David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson
signature.asc
Description: PGP signature