On Wed, Mar 02, 2005 at 11:28:01AM +0900, Hidetoshi Seto was heard to remark: > > Note that here is a difficulty: the MCA handler on some arch would run on > special context - MCA environment. In other words, since some MCA handler > would be called by non-maskable interrupt(e.g. NMI), so it's difficult to > call some driver's callback using protected kernel locks from MCA context. > > Therefore what MCA handler could do is just indicates a error was there, > by something like status flag which drivers can refer. And after possible > deley, we would be able to call callbacks.
FWIW, many device drivers do I/O in an interrupt context (e.g. from timer interrupts), which is why error recovery needs to run in a separate thread from error detection. On ppc64, here's what I currently do: -- check for pci error (call firmware & ask it) -- if (error) put event on a workqueue -- workqueue handler sends out a notifier_call_chain to anyone who cares. Thus, the error recovery thread runs in its own thread. Right now, the above code is arch-specific, but I don't see any reason we couldn't make the event, the workqueue and the notifier_call chain arch-generic. Below is some "pseudocode" version (mentally substitute "pci error event" for every occurance of "eeh"). Its got some ppc64-specific crud in there that we have to fix to make it truly generic (I just cut and pasted from current code). Would a cleaned up version of this code be suitable for a arch-generic pci error recovery framework? Seto, would this be useful to you? --linas =================================================== Header file: /** * EEH Notifier event flags. * Freeze -- pci slot is frozen, no i/o is possible */ #define EEH_NOTIFY_FREEZE 1 /** EEH event -- structure holding pci slot data that describes * a change in the isolation status of a PCI slot. A pointer * to this struct is passed as the data pointer in a notify callback. */ struct eeh_event { struct list_head list; struct pci_dev *dev; int reset_state; int time_unavail; }; /** Register to find out about EEH events. */ int eeh_register_notifier(struct notifier_block *nb); int eeh_unregister_notifier(struct notifier_block *nb); =================================================== C file: /* EEH event workqueue setup. */ static spinlock_t eeh_eventlist_lock = SPIN_LOCK_UNLOCKED; LIST_HEAD(eeh_eventlist); static void eeh_event_handler(void *); DECLARE_WORK(eeh_event_wq, eeh_event_handler, NULL); static struct notifier_block *eeh_notifier_chain; /** * eeh_register_notifier - Register to find out about EEH events. * @nb: notifier block to callback on events */ int eeh_register_notifier(struct notifier_block *nb) { return notifier_chain_register(&eeh_notifier_chain, nb); } /** * eeh_unregister_notifier - Unregister to an EEH event notifier. * @nb: notifier block to callback on events */ int eeh_unregister_notifier(struct notifier_block *nb) { return notifier_chain_unregister(&eeh_notifier_chain, nb); } /** * queue up a pci error event to be dispatched to all listeners * of the pci error notifier call chain. This routine is safe to call * within an interrupt context. The actual event delivery * will be from a workque thread. */ void eeh_queue_failure(struct pci_dev *dev) { struct eeh_event *event; event = kmalloc(sizeof(*event), GFP_ATOMIC); if (event == NULL) { printk (KERN_ERR "EEH: out of memory, event not handled\n"); return 1; } event->dev = dev; event->reset_state = rets[0]; event->time_unavail = rets[2]; /* We may be called in an interrupt context */ spin_lock_irqsave(&eeh_eventlist_lock, flags); list_add(&event->list, &eeh_eventlist); spin_unlock_irqrestore(&eeh_eventlist_lock, flags); /* Most EEH events are due to device driver bugs. Having * a stack trace will help the device-driver authors figure * out what happened. So print that out. */ if (rets[0] != 5) dump_stack(); schedule_work(&eeh_event_wq); } /** * eeh_event_handler - dispatch EEH events. The detection of a frozen * slot can occur inside an interrupt, where it can be hard to do * anything about it. The goal of this routine is to pull these * detection events out of the context of the interrupt handler, and * re-dispatch them for processing at a later time in a normal context. * * @dummy - unused */ static void eeh_event_handler(void *dummy) { unsigned long flags; struct eeh_event *event; while (1) { spin_lock_irqsave(&eeh_eventlist_lock, flags); event = NULL; if (!list_empty(&eeh_eventlist)) { event = list_entry(eeh_eventlist.next, struct eeh_event, list); list_del(&event->list); } spin_unlock_irqrestore(&eeh_eventlist_lock, flags); if (event == NULL) break; if (event->reset_state != 5) { printk(KERN_INFO "EEH: MMIO failure (%d), notifiying device " "%s %s\n", event->reset_state, pci_name(event->dev), pci_pretty_name(event->dev)); } __get_cpu_var(slot_resets)++; notifier_call_chain (&eeh_notifier_chain, EEH_NOTIFY_FREEZE, event); pci_dev_put(event->dev); kfree(event); } } - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/