Re: [Qemu-devel] [PATCH v2 1/3] q35: set split kernel irqchip as default

Alex Williamson Mon, 29 Apr 2019 07:23:20 -0700

On Sat, 27 Apr 2019 10:09:51 +0200
Paolo Bonzini <pbonz...@redhat.com> wrote:

> On 27/04/19 07:29, Paolo Bonzini wrote:
> >   
> >>> In my testing it looks like KVM advertises supporting the KVM_IRQFD
> >>> resample feature, but vfio never gets the unmask notification, so the
> >>> device remains with DisINTx set and no further interrupts are
> >>> generated.  Do we expect KVM's IRQFD with resampler to work in the
> >>> split IRQ mode?  We can certainly hope that "high performance" devices
> >>> use MSI or MSI/X, but this would be quite a performance regression with
> >>> split mode if our userspace bypass for INTx goes away.  Thanks,  
> >>
> >> arch/x86/kvm/lapic.c:kvm_ioapic_send_eoi() dumps to userspace before
> >> kvm_ioapic_update_eoi() can handle the irq_ack_notifier_list via
> >> kvm_notify_acked_gsi(),  
> > 
> > That wouldn't help because kvm_ioapic_update_eoi would not even be
> > able to access vcpu->kvm->arch.vioapic (it's NULL).
> > 
> > The following untested patch would signal the resamplefd in 
> > kvm_ioapic_send_eoi,
> > before requesting the exit to userspace.  However I am not sure how QEMU
> > sets up the VFIO eventfds: if I understand correctly, when VFIO writes 
> > again to
> > the irq eventfd, the interrupt request would not reach the userspace 
> > IOAPIC, but
> > only the in-kernel LAPIC.  That would be incorrect, and if my understanding 
> > is
> > correct we need to trigger resampling from hw/intc/ioapic.c.  
> 
> Actually it's worse: because you're bypassing IOAPIC when raising the
> irq, the IOAPIC's remote_irr for example will not be set.  So split
> irqchip currently must disable the intx fast path completely.
> 
> I guess we could also reimplement irqfd and resamplefd in the userspace
> IOAPIC, and run the listener in a separate thread (using "-object
> iothread" on the command line and AioContext in the code).

This sounds like a performance regression vs KVM irqchip any way we
slice it.  Was this change a mistake?  Without KVM support, the
universal support in QEMU kicks in, where device mmaps are disabled
when an INTx occurs, forcing trapped access to the device, and we
assume that the next access is in response to an interrupt and trigger
our own internal EOI and re-enable mmaps.  A timer acts as a
catch-all.  Needless to say, this is functional but not fast.  It would
be a massive performance regression for devices depending on INTx and
previously using the KVM bypass to switch to this.  INTx is largely
considered a legacy interrupt, so non-x86 archs don't encounter it as
often, S390 even explicitly disables INTx support.  ARM and POWER
likely just don't see a lot of these devices, but nearly all devices
(except SR-IOV VFs) on x86 expect an INTx fallback mode and some
drivers may run the device in INTx for compatibility.  This split
irqchip change was likely fine for "enterprise" users concerned only
with modern high speed devices, but very much not for device assignment
used for compatibility use cases or commodity hardware users.

What's a good 4.0.1 strategy to resolve this?  Re-instate KVM irqchip
as the Q35 default?  I can't see that simply switching to current QEMU
handling is a viable option for performance?  What about 4.1?  We could
certainly improve EOI support in QEMU, there's essentially no support
currently, but it seems like an uphill battle for an iothread based
userspace ioapic to ever compare to KVM handling?  Thanks,

Alex

Re: [Qemu-devel] [PATCH v2 1/3] q35: set split kernel irqchip as default

Reply via email to