On Fri, 24 Sep 2021 14:40:24 +0200 Cédric Le Goater <c...@kaod.org> wrote:
> On 9/23/21 11:12, Greg Kurz wrote: > > On Wed, 22 Sep 2021 16:41:20 +0200 > > Cédric Le Goater <c...@kaod.org> wrote: > > > >> When QEMU switches to the XIVE interrupt mode, it creates all possible > >> guest interrupts at the level of the KVM device. These interrupts are > >> backed by real HW interrupts from the IPI interrupt pool of the XIVE > >> controller. > >> > >> Currently, this is done from the QEMU main thread, which results in > >> allocating all interrupts from the chip on which QEMU is running. IPIs > >> are not distributed across the system and the load is not well > >> balanced across the interrupt controllers. > >> > >> To improve distribution on the system, we should try to allocate the > >> underlying HW IPI from the vCPU context. This also benefits to > >> configurations using CPU pinning. In this case, the HW IPI is > >> allocated on the local chip, rerouting between interrupt controllers > >> is reduced and performance improved. > >> > >> This moves the initialization of the vCPU IPIs from reset time to the > >> H_INT_SET_SOURCE_CONFIG hcall which is called from the vCPU context. > >> But this needs some extra checks in the sequences getting and setting > >> the source states to make sure a valid HW IPI is backing the guest > >> interrupt. For that, we check if a target was configured in the END in > >> case of a vCPU IPI. > >> > >> Signed-off-by: Cédric Le Goater <c...@kaod.org> > >> --- > >> > >> I have tested different SMT configurations, kernel_irqchip=off/on, > >> did some migrations, CPU hotplug, etc. It's not enough and I would > >> like more testing but, at least, it is not making anymore the bad > >> assumption vCPU id = IPI number. > >> > > > > Yeah, the IPI number is provided by the guest, so h_int_set_source_config() > > is really the only place where we can know the IPI number of a given vCPU. > > The patch lacks a run_on_cpu() to perform the reset on the vCPU context > to be complete. > Yes since the vCPU doing the hcall is obviously not the target for the IPI :-) > > > >> Comments ? > >> > > > > LGTM but I didn't check if more users of xive_end_is_valid() should > > be converted to using xive_source_is_initialized(). > > I think you mean xive_eas_is_valid() ? > Oops yes... bad copy/paste :-\ > The changes only impact KVM support since we are deferring the IRQ > initialization at the KVM level. What we have to be careful about is not > accessing an ESB page of an interrupt that would not have been initiliazed > in the KVM device, else QEMU gets a sigbus. > Ok, so this is just needed on a path that leads to xive_esb_rw() or kvmppc_xive_esb_trigger(), right ? It seems that h_int_esb() kvmppc_xive_esb_rw() can get there with an lisn provided by the guest, and I don't see any such check on the way : h_int_esb() only checks xive_eas_is_valid(). Cheers, -- Greg > That only happens when QEMU gets/sets the ESB states. > > > Any chance you have some perf numbers to share ? > > I will try. > > Thanks, > > C. > > > >> hw/intc/spapr_xive.c | 17 +++++++++++++++++ > >> hw/intc/spapr_xive_kvm.c | 36 +++++++++++++++++++++++++++++++----- > >> 2 files changed, 48 insertions(+), 5 deletions(-) > >> > >> diff --git a/hw/intc/spapr_xive.c b/hw/intc/spapr_xive.c > >> index 6f31ce74f198..2dc594a720b1 100644 > >> --- a/hw/intc/spapr_xive.c > >> +++ b/hw/intc/spapr_xive.c > >> @@ -1089,6 +1089,23 @@ static target_ulong > >> h_int_set_source_config(PowerPCCPU *cpu, > >> if (spapr_xive_in_kernel(xive)) { > >> Error *local_err = NULL; > >> > >> + /* > >> + * Initialize the vCPU IPIs from the vCPU context to allocate > >> + * the backing HW IPI on the local chip. This improves > >> + * distribution of the IPIs in the system and when the vCPUs > >> + * are pinned, it reduces rerouting between interrupt > >> + * controllers for better performance. > >> + */ > >> + if (lisn < SPAPR_XIRQ_BASE) { > >> + XiveSource *xsrc = &xive->source; > >> + > >> + kvmppc_xive_source_reset_one(xsrc, lisn, &local_err); > >> + if (local_err) { > >> + error_report_err(local_err); > >> + return H_HARDWARE; > >> + } > >> + } > >> + > >> kvmppc_xive_set_source_config(xive, lisn, &new_eas, &local_err); > >> if (local_err) { > >> error_report_err(local_err); > >> diff --git a/hw/intc/spapr_xive_kvm.c b/hw/intc/spapr_xive_kvm.c > >> index 53731d158625..1607a59e9483 100644 > >> --- a/hw/intc/spapr_xive_kvm.c > >> +++ b/hw/intc/spapr_xive_kvm.c > >> @@ -254,7 +254,12 @@ static int kvmppc_xive_source_reset(XiveSource *xsrc, > >> Error **errp) > >> SpaprXive *xive = SPAPR_XIVE(xsrc->xive); > >> int i; > >> > >> - for (i = 0; i < xsrc->nr_irqs; i++) { > >> + /* > >> + * vCPU IPIs are initialized at the KVM level when configured by > >> + * H_INT_SET_SOURCE_CONFIG. > >> + */ > >> + > >> + for (i = SPAPR_XIRQ_BASE; i < xsrc->nr_irqs; i++) { > >> int ret; > >> > >> if (!xive_eas_is_valid(&xive->eat[i])) { > >> @@ -342,6 +347,27 @@ uint64_t kvmppc_xive_esb_rw(XiveSource *xsrc, int > >> srcno, uint32_t offset, > >> } > >> } > >> > >> +static bool xive_source_is_initialized(SpaprXive *xive, int lisn) > >> +{ > >> + assert(lisn < xive->nr_irqs); > >> + > >> + if (!xive_eas_is_valid(&xive->eat[lisn])) { > >> + return false; > >> + } > >> + > >> + /* > >> + * vCPU IPIs are initialized at the KVM level when configured by > >> + * H_INT_SET_SOURCE_CONFIG, in which case, we should have a valid > >> + * target (server, priority) in the END. > >> + */ > >> + if (lisn < SPAPR_XIRQ_BASE) { > >> + return !!xive_get_field64(EAS_END_INDEX, xive->eat[lisn].w); > >> + } > >> + > >> + /* Device sources */ > >> + return true; > >> +} > >> + > >> static void kvmppc_xive_source_get_state(XiveSource *xsrc) > >> { > >> SpaprXive *xive = SPAPR_XIVE(xsrc->xive); > >> @@ -350,7 +376,7 @@ static void kvmppc_xive_source_get_state(XiveSource > >> *xsrc) > >> for (i = 0; i < xsrc->nr_irqs; i++) { > >> uint8_t pq; > >> > >> - if (!xive_eas_is_valid(&xive->eat[i])) { > >> + if (!xive_source_is_initialized(xive, i)) { > >> continue; > >> } > >> > >> @@ -533,7 +559,7 @@ static void kvmppc_xive_change_state_handler(void > >> *opaque, bool running, > >> uint8_t pq; > >> uint8_t old_pq; > >> > >> - if (!xive_eas_is_valid(&xive->eat[i])) { > >> + if (!xive_source_is_initialized(xive, i)) { > >> continue; > >> } > >> > >> @@ -561,7 +587,7 @@ static void kvmppc_xive_change_state_handler(void > >> *opaque, bool running, > >> for (i = 0; i < xsrc->nr_irqs; i++) { > >> uint8_t pq; > >> > >> - if (!xive_eas_is_valid(&xive->eat[i])) { > >> + if (!xive_source_is_initialized(xive, i)) { > >> continue; > >> } > >> > >> @@ -666,7 +692,7 @@ int kvmppc_xive_post_load(SpaprXive *xive, int > >> version_id) > >> > >> /* Restore the EAT */ > >> for (i = 0; i < xive->nr_irqs; i++) { > >> - if (!xive_eas_is_valid(&xive->eat[i])) { > >> + if (!xive_source_is_initialized(xive, i)) { > >> continue; > >> } > >> > > >