On Tue, 17 Mar 2020 17:41:08 -0400 Peter Xu <pet...@redhat.com> wrote:
> On Tue, Mar 17, 2020 at 03:06:46PM -0600, Alex Williamson wrote:
>
> [...]
>
> > > diff --git a/hw/intc/ioapic.c b/hw/intc/ioapic.c
> > > index 15747fe2c2..81a17cc2b8 100644
> > > --- a/hw/intc/ioapic.c
> > > +++ b/hw/intc/ioapic.c
> > > @@ -236,8 +236,29 @@ void ioapic_eoi_broadcast(int vector)
> > >          for (n = 0; n < IOAPIC_NUM_PINS; n++) {
> > >              entry = s->ioredtbl[n];
> > >
> > > -            if ((entry & IOAPIC_VECTOR_MASK) != vector ||
> > > -                ((entry >> IOAPIC_LVT_TRIGGER_MODE_SHIFT) & 1) != IOAPIC_TRIGGER_LEVEL) {
> > > +            if ((entry & IOAPIC_VECTOR_MASK) != vector) {
> > > +                continue;
> > > +            }
> > > +
> > > +            /*
> > > +             * When IOAPIC is in the userspace while APIC is still in
> > > +             * the kernel (i.e., split irqchip), we have a trick to
> > > +             * kick the resamplefd logic for registered irqfds from
> > > +             * userspace to deactivate the IRQ.  When that happens, it
> > > +             * means the irq bypassed userspace IOAPIC (so the irr and
> > > +             * remote-irr of the table entry should be bypassed too
> > > +             * even if interrupt come).  Still kick the resamplefds if
> > > +             * they're bound to the IRQ, to make sure to EOI the
> > > +             * interrupt for the hardware correctly.
> > > +             *
> > > +             * Note: We still need to go through the irr & remote-irr
> > > +             * operations below because we don't know whether there're
> > > +             * emulated devices that are using/sharing the same IRQ.
> > > +             */
> > > +            kvm_resample_fd_notify(n);
> > > +
> > > +            if (((entry >> IOAPIC_LVT_TRIGGER_MODE_SHIFT) & 1) !=
> > > +                IOAPIC_TRIGGER_LEVEL) {
> > >                  continue;
> > >              }
> > >
> >
> > What's the logic for sending resampler notifies before testing if the
> > ioapic entry is in level triggered mode?  vfio won't use this for
> > anything other than level triggered.  Inserting it between these checks
> > confused me and in my testing wasn't necessary.  Thanks,
>
> I put it there to match the kernel implementation, and IIUC Paolo
> agreed with that too:
>
> https://patchwork.kernel.org/patch/11407441/#23190969
>
> Since we've discussed a few times here, I think I can talk a bit more
> on how I understand this in case I was wrong...
>
> Even if we have the fact that all the existing devices that use this
> code should be using level-triggered IRQs, however... *If* there comes
> an edge-triggered INTx device and we assign it using vfio-pci, vfio
> should also mask the IRQ after it generates (according to
> vfio_intx_handler), is that right?  Then we still need to kick the
> resamplefd for that does-not-exist device too to make sure it'll work?

"edge-triggered INTx" is not a thing that exists.  The PCI spec defines
interrupt pins as:

  2.2.6. Interrupt Pins (Optional)

  Interrupts on PCI are optional and defined as "level sensitive,"
  asserted low (negative true), using open drain output drivers.

Masking of interrupts while they're in-service is not done for edge
triggered interrupts, we assume that being a discrete interrupt is a
sufficient rate limiter versus a level triggered interrupt, which is
continuous and can saturate the host.

If it exists before the level check only to match the kernel, maybe a
comment or todo item to check whether it's the optimal approach for
both cases should be in order.  I can't think of any reason why we'd
need it for the sake of edge triggered vfio interrupts in either place.
Thanks,

Alex
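
For illustration only (not part of the patch or of any posted revision): a
minimal standalone sketch of the two orderings being discussed, i.e. kicking
the resamplefd before the level-trigger check (as in the quoted diff) versus
after it.  The constants are stand-in values assumed for this sketch, the
notifier is a hypothetical counting stub rather than the real
kvm_resample_fd_notify(), and the irr/remote-irr handling is elided.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in values assumed for this sketch; the real ones live in QEMU. */
#define IOAPIC_NUM_PINS               24
#define IOAPIC_VECTOR_MASK            0xff
#define IOAPIC_LVT_TRIGGER_MODE_SHIFT 15
#define IOAPIC_TRIGGER_LEVEL          1

/* Hypothetical stub: counts calls instead of kicking real resamplefds. */
static int resample_notify_count;

static void resample_fd_notify_stub(int gsi)
{
    (void)gsi;
    resample_notify_count++;
}

static int entry_is_level(uint64_t entry)
{
    return ((entry >> IOAPIC_LVT_TRIGGER_MODE_SHIFT) & 1) ==
           IOAPIC_TRIGGER_LEVEL;
}

/* Ordering from the quoted diff: notify on vector match, then check mode. */
int eoi_notify_before_level_check(const uint64_t *tbl, int vector)
{
    assert(tbl != NULL);
    resample_notify_count = 0;
    for (int n = 0; n < IOAPIC_NUM_PINS; n++) {
        if ((tbl[n] & IOAPIC_VECTOR_MASK) != (uint64_t)vector) {
            continue;
        }
        resample_fd_notify_stub(n);
        if (!entry_is_level(tbl[n])) {
            continue;
        }
        /* irr / remote-irr handling elided in this sketch */
    }
    return resample_notify_count;
}

/* Alternative ordering: notify only for level-triggered entries. */
int eoi_notify_after_level_check(const uint64_t *tbl, int vector)
{
    assert(tbl != NULL);
    resample_notify_count = 0;
    for (int n = 0; n < IOAPIC_NUM_PINS; n++) {
        if ((tbl[n] & IOAPIC_VECTOR_MASK) != (uint64_t)vector ||
            !entry_is_level(tbl[n])) {
            continue;
        }
        resample_fd_notify_stub(n);
        /* irr / remote-irr handling elided in this sketch */
    }
    return resample_notify_count;
}
```

With a table where one level-triggered and one edge-triggered entry share a
vector, the first variant notifies for both pins and the second only for the
level-triggered one.  The two differ only for edge-triggered entries, which
is the case the thread is debating.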