On Mon, 22 Sep 2025 11:34:23 -0500 (CDT) Timothy Pearson <tpear...@raptorengineering.com> wrote:
> ----- Original Message ----- > > From: "Alex Williamson" <alex.william...@redhat.com> > > To: "Timothy Pearson" <tpear...@raptorengineering.com> > > Cc: "kvm" <k...@vger.kernel.org>, "linuxppc-dev" > > <linuxppc-dev@lists.ozlabs.org> > > Sent: Monday, September 22, 2025 11:01:43 AM > > Subject: Re: [PATCH] vfio/pci: Fix INTx handling on legacy DisINTx- PCI > > devices > > > On Sat, 20 Sep 2025 14:25:03 -0500 (CDT) > > Timothy Pearson <tpear...@raptorengineering.com> wrote: > >> Personally, I'd argue that such old devices were intended to work > >> with much slower host systems, therefore the slowdown probably > >> doesn't matter vs. being more correct in terms of interrupt handling. > >> In terms of general kernel design, my understanding has always been > >> is that best practice is to always mask, disable, or clear a level > >> interrupt before exiting the associated IRQ handler, and the current > >> design seems to violate that rule. In that context, I'd personally > >> want to see an argument as to why echewing this traditional IRQ > >> handler design is beneficial enough to justify making the VFIO driver > >> dependent on platform-specific behavior. > > > > Yep, I kind of agree. The unlazy flag seems to provide the more > > intended behavior. It moves the irq chip masking into the fast path, > > whereas it would have been asynchronous on a subsequent interrupt > > previously, but the impact is only to ancient devices operating in INTx > > mode, so as long as we can verify those still work on both ppc and x86, > > I don't think it's worth complicating the code to make setting the > > unlazy flag conditional on anything other than the device support. > > > > Care to send out a new version documenting the actual sequence fixed by > > this change and updating the code based on this thread? Note that we > > can test non-pci2.3 mode for any device/driver that supports INTx using > > the nointxmask=1 option for vfio-pci and booting a linux guest with > > pci=nomsi. Thanks, > > > > Alex > > Sure, I can update the commit message easily enough, but I must have > missed something in regard to a needed code update. The existing > patch only sets unlazy for non-PCI 2.3 INTX devices, and as I > understand it that's the behavior we have both agreed on at this > point? I had commented[1] that testing the interrupt type immediately after setting the interrupt type is redundant. Also, looking again, if we set the flag before request_irq, it seems logical that we'd clear the flag after free_irq. I think there are also some unaccounted error paths where we can set the flag without clearing it that need to be considered. > I've tested this on ppc64el and it works quite well, repairing the > broken behavior where the guest would receive exactly one interrupt > on the legacy PCI device per boot. I don't have amd64 systems > available to test on, however. Noted, I'll incorporate some targeted testing here. Thanks, Alex [1]https://lore.kernel.org/all/20250919125603.08f600ac.alex.william...@redhat.com/