On Tue, Mar 17, 2020 at 04:12:00PM -0600, Alex Williamson wrote:
> On Tue, 17 Mar 2020 17:41:08 -0400
> Peter Xu <pet...@redhat.com> wrote:
>
> > On Tue, Mar 17, 2020 at 03:06:46PM -0600, Alex Williamson wrote:
> >
> > [...]
> >
> > > > diff --git a/hw/intc/ioapic.c b/hw/intc/ioapic.c
> > > > index 15747fe2c2..81a17cc2b8 100644
> > > > --- a/hw/intc/ioapic.c
> > > > +++ b/hw/intc/ioapic.c
> > > > @@ -236,8 +236,29 @@ void ioapic_eoi_broadcast(int vector)
> > > >          for (n = 0; n < IOAPIC_NUM_PINS; n++) {
> > > >              entry = s->ioredtbl[n];
> > > >
> > > > -            if ((entry & IOAPIC_VECTOR_MASK) != vector ||
> > > > -                ((entry >> IOAPIC_LVT_TRIGGER_MODE_SHIFT) & 1) != IOAPIC_TRIGGER_LEVEL) {
> > > > +            if ((entry & IOAPIC_VECTOR_MASK) != vector) {
> > > > +                continue;
> > > > +            }
> > > > +
> > > > +            /*
> > > > +             * When IOAPIC is in the userspace while APIC is still in
> > > > +             * the kernel (i.e., split irqchip), we have a trick to
> > > > +             * kick the resamplefd logic for registered irqfds from
> > > > +             * userspace to deactivate the IRQ.  When that happens, it
> > > > +             * means the irq bypassed userspace IOAPIC (so the irr and
> > > > +             * remote-irr of the table entry should be bypassed too
> > > > +             * even if interrupt come).  Still kick the resamplefds if
> > > > +             * they're bound to the IRQ, to make sure to EOI the
> > > > +             * interrupt for the hardware correctly.
> > > > +             *
> > > > +             * Note: We still need to go through the irr & remote-irr
> > > > +             * operations below because we don't know whether there're
> > > > +             * emulated devices that are using/sharing the same IRQ.
> > > > +             */
> > > > +            kvm_resample_fd_notify(n);
> > > > +
> > > > +            if (((entry >> IOAPIC_LVT_TRIGGER_MODE_SHIFT) & 1) !=
> > > > +                IOAPIC_TRIGGER_LEVEL) {
> > > >                  continue;
> > > >              }
> > > >
> > > >
> > > What's the logic for sending resampler notifies before testing if the
> > > ioapic entry is in level triggered mode?  vfio won't use this for
> > > anything other than level triggered.
> > > Inserting it between these checks
> > > confused me and in my testing wasn't necessary.  Thanks,
> >
> > I put it there to match the kernel implementation, and IIUC Paolo
> > agreed with that too:
> >
> > https://patchwork.kernel.org/patch/11407441/#23190969
> >
> > Since we've discussed a few times here, I think I can talk a bit more
> > on how I understand this in case I was wrong...
> >
> > Even if we have the fact that all the existing devices that use this
> > code should be using level-triggered IRQs, however... *If* there comes
> > an edge-triggered INTx device and we assign it using vfio-pci, vfio
> > should also mask the IRQ after it generates (according to
> > vfio_intx_handler), is that right?  Then we still need to kick the
> > resamplefd for that does-not-exist device too to make sure it'll work?
>
> "edge-triggered INTx" is not a thing that exists.  The PCI spec defines
> interrupt pins as:
>
>   2.2.6. Interrupt Pins (Optional)
>
>   Interrupts on PCI are optional and defined as "level sensitive,"
>   asserted low (negative true), using open drain output drivers.

Ah OK!  I didn't notice it's a spec-wise answer...

>
> Masking of interrupts while they're in-service is not done for edge
> triggered interrupts, we assume that being a discrete interrupt is a
> sufficient rate limiter versus a level triggered interrupt, which is
> continuous and can saturate the host.
>
> If it exists before the level check only to match the kernel, maybe a
> comment or todo item to check whether it's the optimal approach for
> both cases should be in order.  I can't think of any reason why we'd
> need it for the sake of edge triggered vfio interrupts in either place.

I guess the KVM implementation of that is still required for the
kernel PIT implementation as Paolo mentioned.  Since this seems to be
confusing and the userspace does not have a real use case for that,
let me repost this patch only so the userspace resamplefd only reacts
to level triggered interrupts.

Thanks,

-- 
Peter Xu
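
For reference, the ordering proposed above (kick the resamplefd only once
the entry is known to be level triggered) could look roughly like the
sketch below.  This is not the reposted patch.  The constants are
stand-ins assumed for illustration, eoi_broadcast_sketch() is a
simplified loop over a single redirection table, and
resample_fd_notify_stub() is a hypothetical placeholder for
kvm_resample_fd_notify() so that the snippet builds on its own.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in values assumed for this sketch; QEMU's headers are authoritative. */
#define IOAPIC_NUM_PINS               24
#define IOAPIC_VECTOR_MASK            0xffu
#define IOAPIC_LVT_TRIGGER_MODE_SHIFT 15
#define IOAPIC_TRIGGER_LEVEL          1

/* Per-pin counter so a caller can observe which pins were notified. */
static int notified[IOAPIC_NUM_PINS];

/* Hypothetical placeholder for kvm_resample_fd_notify(): only counts calls. */
static void resample_fd_notify_stub(int pin)
{
    notified[pin]++;
}

/*
 * Simplified EOI scan over one redirection table.  An entry is skipped
 * unless it matches the vector AND is level triggered; the resample
 * notification is sent only after both checks pass.
 * Returns the number of pins that were notified.
 */
static int eoi_broadcast_sketch(const uint64_t *ioredtbl, int vector)
{
    int n, count = 0;

    assert(ioredtbl != NULL);

    for (n = 0; n < IOAPIC_NUM_PINS; n++) {
        uint64_t entry = ioredtbl[n];

        if ((entry & IOAPIC_VECTOR_MASK) != (uint64_t)vector) {
            continue;
        }

        if (((entry >> IOAPIC_LVT_TRIGGER_MODE_SHIFT) & 1) !=
            IOAPIC_TRIGGER_LEVEL) {
            continue;
        }

        /* Level triggered and vector matches: kick the resamplefd. */
        resample_fd_notify_stub(n);
        count++;
    }

    return count;
}
```

With two entries programmed to the same vector, one level triggered and
one edge triggered, only the level-triggered pin would be notified under
this ordering.  The irr and remote-irr handling from the real function is
left out of the sketch.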