On Fri, 2006-04-07 at 06:24, Linas Vepstas wrote: > Please apply and forward upstream. > > --linas > > [PATCH] PCI Error Recovery: e100 network device driver > > Various PCI bus errors can be signaled by newer PCI controllers. This > patch adds the PCI error recovery callbacks to the intel ethernet e100 > device driver. The patch has been tested, and appears to work well. > > Signed-off-by: Linas Vepstas <[EMAIL PROTECTED]> > Acked-by: Jesse Brandeburg <[EMAIL PROTECTED]> I am enabling PCI-Express AER (Advanced Error Reporting) in kernel and glad to see many drivers to support pci error handling.
> > ---- > > drivers/net/e100.c | 65 > +++++++++++++++++++++++++++++++++++++++++++++++++++++ > 1 files changed, 65 insertions(+) > > Index: linux-2.6.17-rc1/drivers/net/e100.c > =================================================================== > --- linux-2.6.17-rc1.orig/drivers/net/e100.c 2006-04-05 09:56:06.000000000 > -0500 > +++ linux-2.6.17-rc1/drivers/net/e100.c 2006-04-06 15:17:29.000000000 > -0500 > @@ -2781,6 +2781,70 @@ static void e100_shutdown(struct pci_dev > } > > > +/* ------------------ PCI Error Recovery infrastructure -------------- */ > +/** e100_io_error_detected() is called when PCI error is detected */ > +static pci_ers_result_t e100_io_error_detected(struct pci_dev *pdev, > pci_channel_state_t state) > +{ > + struct net_device *netdev = pci_get_drvdata(pdev); > + > + /* Same as calling e100_down(netdev_priv(netdev)), but generic */ > + netdev->stop(netdev); e100 stop method e100_close calls e100_down which would do IO. Does it violate the rule defined in Documentation/pci-error-recovery.txt that error_detected shouldn't do any IO? Suggest to create a new function, such like e100_close_noreset. > + > + /* Detach; put netif into state similar to hotplug unplug */ > + netif_poll_enable(netdev); > + netif_device_detach(netdev); > + > + /* Request a slot reset. */ > + return PCI_ERS_RESULT_NEED_RESET; > +} > + > +/** e100_io_slot_reset is called after the pci bus has been reset. > + * Restart the card from scratch. */ > +static pci_ers_result_t e100_io_slot_reset(struct pci_dev *pdev) > +{ > + struct net_device *netdev = pci_get_drvdata(pdev); > + struct nic *nic = netdev_priv(netdev); > + > + if(pci_enable_device(pdev)) { > + printk(KERN_ERR "e100: Cannot re-enable PCI device after > reset.\n"); > + return PCI_ERS_RESULT_DISCONNECT; > + } > + pci_set_master(pdev); > + > + /* Only one device per card can do a reset */ > + if (0 != PCI_FUNC (pdev->devfn)) > + return PCI_ERS_RESULT_RECOVERED; > + e100_hw_reset(nic); > + e100_phy_init(nic); Should pci_set_master be called after e100_hw_reset like in function e100_probe? > + > + return PCI_ERS_RESULT_RECOVERED; > +} > + > +/** e100_io_resume is called when the error recovery driver > + * tells us that its OK to resume normal operation. > + */ > +static void e100_io_resume(struct pci_dev *pdev) > +{ > + struct net_device *netdev = pci_get_drvdata(pdev); > + struct nic *nic = netdev_priv(netdev); > + > + /* ack any pending wake events, disable PME */ > + pci_enable_wake(pdev, 0, 0); > + > + netif_device_attach(netdev); > + if(netif_running(netdev)) { > + e100_open (netdev); > + mod_timer(&nic->watchdog, jiffies); e100_open calls e100_up which already sets watchdog timer. Why to set it again? > + } > +} > + - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html