O-check interface for driver's error handling

Benjamin Herrenschmidt Tue, 01 Mar 2005 14:35:03 -0800

On Tue, 2005-03-01 at 12:33 -0600, Linas Vepstas wrote:

> The current proposal (and prototype) has a "master recovery thread"
> to handle the coordinated reset of the pci controller.  This master
> recovery thyread makes three calls in struct pci_driver:
> 
>    void (*frozen) (struct pci_dev *);  /* called when dev is first frozen */
>    void (*thawed) (struct pci_dev *);  /* called after card is reset */
>    void (*perm_failure) (struct pci_dev *);  /* called if card is dead */


See my other emails. I think only one callback is enough, and I think we
need more parameters.

> The master recovery thread runs in the kernel.  Earlier suggestions said
> "run it in user space, use pci hotplug, use udev, etc." However, if
> you get a pci error on a scsi card, you can't shell script 
> "umount /dev/sdX; rmmod scsi; clear_pci_error; insmod scsi; mount /dev/sdX"
> beacuse you can't umount an open filesystem, and you can't really close
> it (I fiddled with prototyping some of this, but its ugly and painful
> and bizarre and outside my area of expertise :)
> 
> FWIW, the current prototype tries to do a pci hotplug if the above
> routines aren't implemented in struct pci_driver.  It can recover 
> from pci errors on ethernet cards, and I have one scsi driver that
> successfully recovers with above API, and am working on adding recovery
> to the symbios driver.
> 
> --linas
-- 
Benjamin Herrenschmidt <[EMAIL PROTECTED]>

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH/RFC] I/O-check interface for driver's error handling

Reply via email to