Re: [PATCH] eeh: Fixing a bug when pci structure is null

2010-02-24 Thread Mike Mason
On 2/19/2010 1:54 PM, Benjamin Herrenschmidt wrote: On Fri, 2010-02-19 at 14:43 -0200, Breno Leitao wrote: Hi Ben, I'd like to ask about this patch? Should I re-submit? Thanks, Breno Leitao wrote: During an EEH recovery, the pci_dev structure can be null, mainly if an eeh event is detected d

[PATCH 3/3] Support for PCI Express reset type

2009-07-30 Thread Mike Mason
reviously submitted patch that implemented a fundamental reset bit field. Please review and let me know of any concerns. Signed-off-by: Mike Mason Signed-off-by: Richard Lary diff -uNrp a/arch/powerpc/kernel/pci_64.c b/arch/powerpc/kernel/pci_64.c --- a/arch/powerpc/kernel/pci_64.c 2009-07-13

[PATCH 2/3] Support for PCI Express reset type

2009-07-30 Thread Mike Mason
s new bit field, as well as a few unrelated updates. These patches supersede the previously submitted patch that implemented a fundamental reset bit field. Please review and let me know of any concerns. Signed-off-by: Mike Mason Signed-off-by: Richard Lary --- a/Documentation/PCI/pci-error-

[PATCH 1/3] Support for PCI Express reset type

2009-07-30 Thread Mike Mason
device requires a fundamental reset during recovery. These patches supersede the previously submitted patch that implemented a fundamental reset bit field. Please review and let me know of any concerns. Signed-off-by: Mike Mason Signed-off-by: Richard Lary diff -uNrp a/include/linux/pci.h b/inc

Re: [PATCH] Hold reference to device_node during EEH event handling

2009-07-22 Thread Mike Mason
Michael Ellerman wrote: On Thu, 2009-07-16 at 09:33 -0700, Mike Mason wrote: Michael Ellerman wrote: On Wed, 2009-07-15 at 14:43 -0700, Mike Mason wrote: This patch increments the device_node reference counter when an EEH error occurs and decrements the counter when the event has been handled

Re: [PATCH] Hold reference to device_node during EEH event handling

2009-07-16 Thread Mike Mason
Michael Ellerman wrote: On Wed, 2009-07-15 at 14:43 -0700, Mike Mason wrote: This patch increments the device_node reference counter when an EEH error occurs and decrements the counter when the event has been handled. This is to prevent the device_node from being released until
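
The reference-holding pattern described in this thread can be sketched in user-space C. This is only an illustration under stated assumptions, not the patch itself: `struct node`, `node_get()`, `node_put()`, `struct event`, `event_queue()` and `event_handle()` are hypothetical stand-ins. In the kernel, the corresponding helpers for a `device_node` are `of_node_get()` and `of_node_put()`.

```c
#include <stddef.h>

/* Hypothetical stand-in for a reference-counted device_node. */
struct node {
    int refcount;   /* number of active holders */
    int released;   /* set once the last reference is dropped */
};

/* Take a reference; returns the node for convenience. */
static struct node *node_get(struct node *n)
{
    if (n)
        n->refcount++;
    return n;
}

/* Drop a reference; mark the node released when the count hits zero. */
static void node_put(struct node *n)
{
    if (n && --n->refcount == 0)
        n->released = 1;
}

/* Hypothetical event record that owns one reference while queued. */
struct event {
    struct node *dn;
};

/* Queue an event: pin the node so it cannot be released meanwhile. */
static void event_queue(struct event *ev, struct node *dn)
{
    ev->dn = node_get(dn);
}

/* Handle the event, then drop the reference taken at queue time.
 * Returns 1 if the node was still valid when handled, 0 otherwise. */
static int event_handle(struct event *ev)
{
    int usable = ev->dn != NULL && !ev->dn->released;

    node_put(ev->dn);
    ev->dn = NULL;
    return usable;
}
```

Because the event holds its own reference from queue time until handling finishes, another holder dropping its reference in between does not release the node under the handler.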

[PATCH] Hold reference to device_node during EEH event handling

2009-07-15 Thread Mike Mason
e the device_node is released too soon when an EEH event occurs during a dlpar remove, causing the event handler to attempt to access bad memory locations. Please review and let me know of any concerns. Signed-off-by: Mike Mason --- a/arch/powerpc/platforms/pseries/eeh_event.c 2008-10-0

[PATCH] Support for PCI Express reset type in EEH

2009-07-15 Thread Mike Mason
that implemented a reset type callback. Please review and let me know of any concerns. Signed-off-by: Mike Mason diff -uNrp a/arch/powerpc/kernel/pci_64.c b/arch/powerpc/kernel/pci_64.c --- a/arch/powerpc/kernel/pci_64.c 2009-07-13 14:25:24.0 -0700 +++ b/arch/powerpc/kernel/

Re: Support for PCI Express reset type in EEH

2009-07-15 Thread Mike Mason
This patch was simultaneously submitted to Red Hat for review. As a result of that review, I'm withdrawing this patch and will submit a new version shortly. Mike Mike Mason wrote: By default, EEH does what's known as a "hot reset" during error recovery of a PCI Express de

Support for PCI Express reset type in EEH

2009-07-14 Thread Mike Mason
t need to be changed. So far we're only aware of one driver that has the requirement (qla2xxx). The patch touches mostly EEH and pseries code, but does require a couple of minor additions to the overall PCI error recovery framework. Signed-off-by: Mike Mason --- a/arch/powerpc/include/as

Re: [PATCH] Set error_state to pci_channel_io_normal in eeh_report_reset()

2009-04-21 Thread Mike Mason
Paul Mackerras wrote: Mike Mason writes: diff --git a/arch/powerpc/platforms/pseries/eeh_driver.c b/arch/powerpc/platforms/pseries/eeh_driver.c index 380420f..9a2a6e3 100644 --- a/arch/powerpc/platforms/pseries/eeh_driver.c +++ b/arch/powerpc/platforms/pseries/eeh_driver.c

[PATCH] Set error_state to pci_channel_io_normal in eeh_report_reset()

2009-04-10 Thread Mike Mason
ate of the hardware is Normal Operations and should be accurately reflected by setting dev->error_state to pci_channel_io_normal. The current implementation of EEH driver does not do so and requires the following patch to correct this deficiency. Signed-off-by: Mike Mason diff --git a/arc

Re: [PATCH] Only disable/enable LSI interrupts in EEH

2009-02-12 Thread Mike Mason
Michael Ellerman wrote: On Tue, 2009-02-10 at 13:12 -0800, Mike Mason wrote: I'm resubmitting this patch with a couple changes suggested by Michael Ellerman. 1) the new functions should be static, and 2) some people may object to including unrelated formatting ch

Re: [PATCH] Only disable/enable LSI interrupts in EEH

2009-02-10 Thread Mike Mason
tracked in a different way than LSI/MSI interrupts. This patch ensures only LSI interrupts are disabled/enabled. Signed-off-by: Mike Mason Acked-by: Linas Vepstas --- arch/powerpc/platforms/pseries/eeh_driver.c-orig 2009-02-10 07:12:31.0 -0800 +++ arch/powerpc/platfor

[PATCH] Only disable/enable LSI interrupts in EEH

2009-02-09 Thread Mike Mason
take into account that MSI-X interrupts are tracked in a different way than LSI/MSI interrupts. This patch ensures only LSI interrupts are disabled/enabled. The patch also includes a couple minor formatting fixes. Signed-off-by: Mike Mason --- linux-2.6.18.ppc64-orig/arch/powerpc/plat

Re: [PATCH] Don't panic when EEH_MAX_FAILS is exceeded

2008-07-21 Thread Mike Mason
nning, which at the very least should allow for a more graceful shutdown. The patch also removes the msleep() within a spinlock, which can lead to a deadlock and is not recommended. Signed-off-by: Mike Mason <[EMAIL PROTECTED]> Acked-by: Linas Vepstas <[EMAIL PROTECTED]> --- powerpc

[PATCH] Don't panic when EEH_MAX_FAILS is exceeded

2008-07-20 Thread Mike Mason
shutdown. The panic() is now wrapped in a DEBUG statement for development purposes. The patch also removes the msleep() within a spinlock, which is not allowed. Signed-off-by: Mike Mason <[EMAIL PROTECTED]> --- powerpc.git/arch/powerpc/platforms/pseries/eeh.c 2008-07-18

Re: [PATCH] Restore PERR/SERR bit settings during EEH device recovery

2008-07-08 Thread Mike Mason
e, but are not restored to 1 during EEH recovery. The patch fixes the Agilent card problem. It has been tested on several other EEH-enabled cards with no regressions. Signed-off-by: Mike Mason <[EMAIL PROTECTED]> Acked-by: Linas Vepstas <[EMAIL PROTECTED]> --- linux-2.6.26-rc9/

Re: [PATCH] Restore PERR/SERR bit settings during EEH device recovery

2008-07-08 Thread Mike Mason
Linas Vepstas wrote: 2008/7/7 Mike Mason <[EMAIL PROTECTED]>: The following patch restores the PERR and SERR bits in the PCI command register during an EEH device recovery. We have found at least one case (an Agilent test card) where the PERR/SERR bits are set to 1 by firmware at boot tim
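
As a rough illustration of what restoring the PERR and SERR bits in the PCI command register involves, here is a minimal sketch in plain C. It is not the patch: `restore_perr_serr()` and the `CMD_*` macro names are hypothetical, and the register values are passed in as integers rather than read from config space. The bit values themselves match the ones Linux defines as `PCI_COMMAND_PARITY` (0x40) and `PCI_COMMAND_SERR` (0x100).

```c
#include <stdint.h>

/* Command register bits (same values as PCI_COMMAND_PARITY and
 * PCI_COMMAND_SERR in Linux's pci_regs.h). */
#define CMD_PARITY 0x0040u  /* PERR: parity error response enable */
#define CMD_SERR   0x0100u  /* SERR# enable */

/* Hypothetical helper: take the command value read after the reset and
 * copy only the PERR/SERR bits back in from the value saved earlier.
 * All other bits of the current value are left untouched. */
static uint16_t restore_perr_serr(uint16_t saved_cmd, uint16_t current_cmd)
{
    const uint16_t mask = CMD_PARITY | CMD_SERR;

    return (uint16_t)((current_cmd & ~mask) | (saved_cmd & mask));
}
```

Masking rather than writing the whole saved value back is a design choice of this sketch: it keeps whatever other command bits the recovery path has already set.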

[PATCH] Restore PERR/SERR bit settings during EEH device recovery

2008-07-07 Thread Mike Mason
Agilent card problem. It has been tested on several other EEH-enabled cards with no regressions. Signed-off-by: Mike Mason <[EMAIL PROTECTED]> --- linux-2.6.26-rc9/arch/powerpc/platforms/pseries/eeh.c 2008-07-07 16:06:57.0 -0700 +++ linux-2.6.26-rc9-new/arch/powerpc/pla