On Mon, 29 Apr 2019 16:01:29 +1000 Alexey Kardashevskiy <a...@ozlabs.ru> wrote:
> On 20/04/2019 01:34, Greg Kurz wrote: > > Since 902bdc57451c, get_pci_dev() calls pci_get_domain_bus_and_slot(). This > > has the effect of incrementing the reference count of the PCI device, as > > explained in drivers/pci/search.c: > > > > * Given a PCI domain, bus, and slot/function number, the desired PCI > > * device is located in the list of PCI devices. If the device is > > * found, its reference count is increased and this function returns a > > * pointer to its data structure. The caller must decrement the > > * reference count by calling pci_dev_put(). If no device is found, > > * %NULL is returned. > > > > Nothing was done to call pci_dev_put() and the reference count of GPU and > > NPU PCI devices rockets up. > > > > A natural way to fix this would be to teach the callers about the change, > > so that they call pci_dev_put() when done with the pointer. This turns > > out to be quite intrusive, as it affects many paths in npu-dma.c, > > pci-ioda.c and vfio_pci_nvlink2.c. > > > afaict this referencing is only done to protect the current traverser > and what you've done is actually a natural way (and the generic > pci_get_dev_by_id() does exactly the same), although this looks a bit weird. > Not exactly the same: pci_get_dev_by_id() always increment the refcount of the returned PCI device. The refcount is only decremented when this device is passed to pci_get_dev_by_id() to continue searching. That means that the users of the PCI device pointer returned by pci_get_dev_by_id() or its exported variants pci_get_subsys(), pci_get_device() and pci_get_class() do handle the refcount. They all pass the pointer to pci_dev_put() or continue the search, which calls pci_dev_put() internally. Direct and indirect callers of get_pci_dev() don't care for the refcount at all unless I'm missing something. > > > Also, the issue appeared in 4.16 and > > some affected code got moved around since then: it would be problematic > > to backport the fix to stable releases. > > > > All that code never cared for reference counting anyway. Call pci_dev_put() > > from get_pci_dev() to revert to the previous behavior. > >> Fixes: 902bdc57451c ("powerpc/powernv/idoa: Remove unnecessary pcidev > from pci_dn") > > Cc: sta...@vger.kernel.org # v4.16 > > Signed-off-by: Greg Kurz <gr...@kaod.org> > > --- > > arch/powerpc/platforms/powernv/npu-dma.c | 15 ++++++++++++++- > > 1 file changed, 14 insertions(+), 1 deletion(-) > > > > diff --git a/arch/powerpc/platforms/powernv/npu-dma.c > > b/arch/powerpc/platforms/powernv/npu-dma.c > > index e713ade30087..d8f3647e8fb2 100644 > > --- a/arch/powerpc/platforms/powernv/npu-dma.c > > +++ b/arch/powerpc/platforms/powernv/npu-dma.c > > @@ -31,9 +31,22 @@ static DEFINE_SPINLOCK(npu_context_lock); > > static struct pci_dev *get_pci_dev(struct device_node *dn) > > { > > struct pci_dn *pdn = PCI_DN(dn); > > + struct pci_dev *pdev; > > > > - return pci_get_domain_bus_and_slot(pci_domain_nr(pdn->phb->bus), > > + pdev = pci_get_domain_bus_and_slot(pci_domain_nr(pdn->phb->bus), > > pdn->busno, pdn->devfn); > > + > > + /* > > + * pci_get_domain_bus_and_slot() increased the reference count of > > + * the PCI device, but callers don't need that actually as the PE > > + * already holds a reference to the device. > > Imho this would be just enough. > > Anyway, > > Reviewed-by: Alexey Kardashevskiy <a...@ozlabs.ru> > Thanks ! I now realize that I forgot to add the --cc option for stable on my stgit command line :-\. Cc'ing now. > > How did you find it? :) > While reading code to find some inspiration for OpenCAPI passthrough. :) I saw the following in vfio_pci_ibm_npu2_init(): if (!pnv_pci_get_gpu_dev(vdev->pdev)) return -ENODEV; and simply followed the function calls. > > > Since callers aren't > > + * aware of the reference count change, call pci_dev_put() now to > > + * avoid leaks. > > + */ > > + if (pdev) > > + pci_dev_put(pdev); > > + > > + return pdev; > > } > > > > /* Given a NPU device get the associated PCI device. */ > > >