On Mon, 2020-03-30 at 15:56 +1100, Sam Bobroff wrote: > EEH device state is currently removed (by eeh_remove_device()) during > the device release handler, which is invoked as the device's reference > count drops to zero. This may take some time, or forever, as other > threads may hold references. > > However, the PCI device state is released synchronously by > pci_stop_and_remove_bus_device(). This mismatch causes problems, for > example the device may be re-discovered as a new device before the > release handler has been called, leaving the PCI and EEH state > mismatched. > > So instead, call eeh_remove_device() from the bus device removal > handlers, which are called synchronously in the removal path. > > Signed-off-by: Sam Bobroff <sbobr...@linux.ibm.com> > --- > arch/powerpc/kernel/eeh.c | 26 ++++++++++++++++++++++++++ > arch/powerpc/kernel/pci-hotplug.c | 2 -- > 2 files changed, 26 insertions(+), 2 deletions(-) > > diff --git a/arch/powerpc/kernel/eeh.c b/arch/powerpc/kernel/eeh.c > index 17cb3e9b5697..c36c5a7db5ca 100644 > --- a/arch/powerpc/kernel/eeh.c > +++ b/arch/powerpc/kernel/eeh.c > @@ -1106,6 +1106,32 @@ static int eeh_init(void) > > core_initcall_sync(eeh_init); > > +static int eeh_device_notifier(struct notifier_block *nb, > + unsigned long action, void *data) > +{ > + struct device *dev = data; > + > + switch (action) { > + case BUS_NOTIFY_DEL_DEVICE: > + eeh_remove_device(to_pci_dev(dev)); > + break; > + default: > + break; > + }
A comment briefly explaining why we're not doing anything in the add case might be nice. Reviewed-by: Oliver O'Halloran <ooh...@gmail.com> > + return NOTIFY_DONE; > +} > + > +static struct notifier_block eeh_device_nb = { > + .notifier_call = eeh_device_notifier, > +}; > + > +static __init int eeh_set_bus_notifier(void) > +{ > + bus_register_notifier(&pci_bus_type, &eeh_device_nb); > + return 0; > +} > +arch_initcall(eeh_set_bus_notifier); > + > /** > * eeh_add_device_early - Enable EEH for the indicated device node > * @pdn: PCI device node for which to set up EEH > diff --git a/arch/powerpc/kernel/pci-hotplug.c > b/arch/powerpc/kernel/pci-hotplug.c > index d6a67f814983..28e9aa274f64 100644 > --- a/arch/powerpc/kernel/pci-hotplug.c > +++ b/arch/powerpc/kernel/pci-hotplug.c > @@ -57,8 +57,6 @@ void pcibios_release_device(struct pci_dev *dev) > struct pci_controller *phb = pci_bus_to_host(dev->bus); > struct pci_dn *pdn = pci_get_pdn(dev); > > - eeh_remove_device(dev); > - > if (phb->controller_ops.release_device) > phb->controller_ops.release_device(dev); >