On 2/22/18 5:58 AM, Vaibhav Jain wrote:
> This patch puts a NULL check before branching to the address pointed
> to by eeh_ops->notify_resume in eeh_report_resume(). The callback
> is used to notify the arch EEH code that a pci device is back
> online.
>
> For PPC64 presently, only an implementation for pseries platform is
> available and not for powernv. Hence without this patch EEH recovery
> on all non-virtualized hosts is causing a kernel panic when
> CONFIG_PCI_IOV is set. The panic is usually of the form:
>
> EEH: Notify device driver to resume
> Unable to handle kernel paging request for instruction fetch
> Faulting instruction address: 0x00000000
> Oops: Kernel access of bad area, sig: 11 [#1]
> <snip>
> LR eeh_report_resume+0x218/0x220
> Call Trace:
> eeh_report_resume+0x1f0/0x220 (unreliable)
> eeh_pe_dev_traverse+0x98/0x170
> eeh_handle_normal_event+0x3f4/0x650
> eeh_handle_event+0x188/0x380
> eeh_event_handler+0x208/0x210
> kthread+0x168/0x1b0
> ret_from_kernel_thread+0x5c/0xb4
>
> Cc: Bryant G. Ly <bryan...@linux.vnet.ibm.com>
> Fixes: 856e1eb9bdd4("PCI/AER: Add uevents in AER and EEH error/resume")
> Signed-off-by: Vaibhav Jain <vaib...@linux.vnet.ibm.com>
> ---
> arch/powerpc/kernel/eeh_driver.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/arch/powerpc/kernel/eeh_driver.c b/arch/powerpc/kernel/eeh_driver.c
> index beea2182d754..932858a293ea 100644
> --- a/arch/powerpc/kernel/eeh_driver.c
> +++ b/arch/powerpc/kernel/eeh_driver.c
> @@ -384,7 +384,8 @@ static void *eeh_report_resume(void *data, void *userdata)
>  	eeh_pcid_put(dev);
>  	pci_uevent_ers(dev, PCI_ERS_RESULT_RECOVERED);
>  #ifdef CONFIG_PCI_IOV
> -	eeh_ops->notify_resume(eeh_dev_to_pdn(edev));
> +	if (eeh_ops->notify_resume)
> +		eeh_ops->notify_resume(eeh_dev_to_pdn(edev));
>  #endif
>  	return NULL;
>  }

A version of this patch has already been upstreamed:

https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git/commit/?h=fixes&id=521ca5a9859a870e354d1a6b84a6ff4c07bbceb0

-Bryant
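
For anyone reading the archive later, the pattern the patch applies (checking an optional ops-table callback for NULL before calling it) can be sketched in plain userspace C. This is only an illustration and not the kernel code: struct demo_ops, demo_notify_resume(), report_resume() and resume_calls are hypothetical names made up for the sketch, with demo_ops standing in for a table like eeh_ops on a platform that may or may not provide notify_resume.

```c
#include <stddef.h>

/* Hypothetical stand-in for an ops table such as eeh_ops; not the kernel struct. */
struct demo_ops {
	/* Optional callback: a platform may leave this NULL. */
	int (*notify_resume)(int dev_id);
};

/* Counts how many times the hypothetical callback actually ran. */
static int resume_calls;

/* Example callback a platform could install (assumption for the sketch). */
static int demo_notify_resume(int dev_id)
{
	(void)dev_id;
	resume_calls++;
	return 0;
}

/*
 * Calls the optional callback only if the platform provided one.
 * Returns 1 if the callback was invoked, 0 if it was skipped.
 * Without the check, a NULL pointer here would be an indirect call
 * to address 0, which matches the instruction-fetch fault in the oops.
 */
static int report_resume(const struct demo_ops *ops, int dev_id)
{
	if (ops->notify_resume) {
		ops->notify_resume(dev_id);
		return 1;
	}
	return 0;
}
```

With a table whose notify_resume is NULL, report_resume() simply skips the call; with demo_notify_resume installed, it invokes the callback once per call.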