On 2016-04-14 20:31, Peter Xu wrote: > This patch allows Intel IR work with splitted irqchip. Two more fields > are added to IOAPICCommonState to support the translation process (For > future AMD IR support, we will need to provide another AMD-specific > callback for int_remap()). In split irqchip mode, IOAPIC is working in > user space, only update kernel irq routes when entry changed. When IR is > enabled, we directly update the kernel with translated messages. It > works just like a kernel cache for the remapping entries. > > Since KVM irqfd is using kernel gsi routes to deliver interrupts, as > long as we can support split irqchip, we will support irqfd as > well. Also, since kernel gsi routes will cache translated interrupts, > irqfd delivery will not suffer from any performance impact due to IR. > > And, since we supported irqfd, vhost devices will be able to work > seamlessly with IR now. Logically this should contain both vhost-net and > vhost-user case. > > Here we avoided capturing IOMMU IR invalidation, based on the assumption > that, guest kernel will always first update IR entry, then IOAPIC > entry. As long as guest follows this order to update IOAPIC entries, we > should be safe. > > Signed-off-by: Peter Xu <pet...@redhat.com> > --- > hw/i386/intel_iommu.c | 9 +++++++-- > hw/intc/ioapic.c | 39 > ++++++++++++++------------------------- > hw/intc/ioapic_common.c | 4 ++++ > include/hw/i386/intel_iommu.h | 2 ++ > include/hw/i386/ioapic_internal.h | 5 +++++ > 5 files changed, 32 insertions(+), 27 deletions(-) > > diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c > index 68ebc1e..104afeb 100644 > --- a/hw/i386/intel_iommu.c > +++ b/hw/i386/intel_iommu.c > @@ -2077,9 +2077,9 @@ static int vtd_interrupt_remap_msi(IntelIOMMUState > *iommu, > uint16_t index = 0; > VTDIrq irq = {0}; > > - assert(iommu && origin && translated); > + assert(origin && translated); > > - if (!iommu->intr_enabled) { > + if (!iommu || !iommu->intr_enabled) { > memcpy(translated, origin, sizeof(*origin)); > return 0; > } > @@ -2143,6 +2143,11 @@ static int vtd_interrupt_remap_msi(IntelIOMMUState > *iommu, > return 0; > } > > +int vtd_int_remap(void *iommu, MSIMessage *src, MSIMessage *dst) > +{ > + return vtd_interrupt_remap_msi(iommu, src, dst); > +} > + > static uint64_t vtd_mem_ir_read(void *opaque, hwaddr addr, unsigned size) > { > uint64_t data = 0; > diff --git a/hw/intc/ioapic.c b/hw/intc/ioapic.c > index 84e8948..b993bd0 100644 > --- a/hw/intc/ioapic.c > +++ b/hw/intc/ioapic.c > @@ -182,34 +182,23 @@ static void ioapic_update_kvm_routes(IOAPICCommonState > *s) > { > #ifdef CONFIG_KVM > int i; > + int ret; > > if (kvm_irqchip_is_split()) { > for (i = 0; i < IOAPIC_NUM_PINS; i++) { > - uint64_t entry = s->ioredtbl[i]; > - uint8_t trig_mode; > - uint8_t delivery_mode; > - uint8_t dest; > - uint8_t dest_mode; > - uint64_t pin_polarity; > - MSIMessage msg; > - > - trig_mode = ((entry >> IOAPIC_LVT_TRIGGER_MODE_SHIFT) & 1); > - dest = entry >> IOAPIC_LVT_DEST_SHIFT; > - dest_mode = (entry >> IOAPIC_LVT_DEST_MODE_SHIFT) & 1; > - pin_polarity = (entry >> IOAPIC_LVT_TRIGGER_MODE_SHIFT) & 1; > - delivery_mode = > - (entry >> IOAPIC_LVT_DELIV_MODE_SHIFT) & IOAPIC_DM_MASK; > - > - msg.address = APIC_DEFAULT_ADDRESS; > - msg.address |= dest_mode << 2; > - msg.address |= dest << 12; > - > - msg.data = entry & IOAPIC_VECTOR_MASK; > - msg.data |= delivery_mode << APIC_DELIVERY_MODE_SHIFT; > - msg.data |= pin_polarity << APIC_POLARITY_SHIFT; > - msg.data |= trig_mode << APIC_TRIG_MODE_SHIFT; > - > - kvm_irqchip_update_msi_route(kvm_state, i, msg, NULL); > + MSIMessage src, dst; > + struct ioapic_entry_info info; > + ioapic_entry_parse(s->ioredtbl[i], &info); > + src.address = info.addr; > + src.data = info.data; > + /* We update kernel irqchip routes with translated > + * results. */ > + ret = s->int_remap(s->iommu, &src, &dst); > + if (ret) { > + DPRINTF("Int remap failed: %d, drop interrupt\n", ret); > + continue; > + } > + kvm_irqchip_update_msi_route(kvm_state, i, dst, NULL);
The need to hook here makes me wonder if we can't inject IOAPIC interrupts via KVM_SIGNAL_MSI (abstracted by kvm_irqchip_send_msi, but that will pick the fast-path on kernels supporting split irqchip) instead of open-coding the route changes. If we translated the IOAPIC outputs always into MSIs, the need for special-casing split irqchip would be gone, and the need for hooking here for IR just as well. Jan