On Tue, Jan 19, 2016 at 06:18:19PM -0200, Guilherme G. Piccoli wrote: >The function eeh_add_device_early() is used to perform EEH initialization in >devices added later on the system, like in hotplug/DLPAR scenarios. Since the >commit 89a51df5ab1d ("powerpc/eeh: Fix crash in eeh_add_device_early() on >Cell") >a new check was introduced in this function - Cell has no EEH capabilities >which led to kernel oops if hotplug was performed, so checking for >eeh_enabled() was introduced to avoid the issue. > >However, in architectures that EEH is present like pSeries or PowerNV, we might >reach a case in which no PCI devices are present on boot and so EEH is not >initialized. Then, if a device is added via DLPAR for example, >eeh_add_device_early() fails because eeh_enabled() is false. > >Also, we can hit a kernel oops on pSeries arch if eeh_add_device_early() fails: >if we have no PCI devices on machine at boot time, and then we add a PCI device >via DLPAR operation, the function query_ddw() triggers the oops on NULL pointer >dereference in the line "cfg_addr = edev->config_addr;". It happens because >config_addr in edev is NULL, since the function eeh_add_device_early() was not >completed successfully. > >This patch just changes the way the arch checking is done in function >eeh_add_device_early(): we don't use eeh_enabled() anymore, but instead we >introduce the function eeh_available() that checks the running architecture >by using the macro machine_is(). If we are running on pSeries or PowerNV, the >EEH mechanism is available (even if not initialized yet). This way, we don't >try to enable EEH on Cell and we don't hit the oops on DLPAR either. > >Fixes: 89a51df5ab1d ("powerpc/eeh: Fix crash in eeh_add_device_early() on >Cell") >Signed-off-by: Guilherme G. Piccoli <gpicc...@linux.vnet.ibm.com> >--- > arch/powerpc/kernel/eeh.c | 19 ++++++++++++++++++- > 1 file changed, 18 insertions(+), 1 deletion(-) > >diff --git a/arch/powerpc/kernel/eeh.c b/arch/powerpc/kernel/eeh.c >index 40e4d4a..69031d7 100644 >--- a/arch/powerpc/kernel/eeh.c >+++ b/arch/powerpc/kernel/eeh.c >@@ -1056,6 +1056,23 @@ int eeh_init(void) > core_initcall_sync(eeh_init); > > /** >+ * eeh_available - Checks for the availability of EEH based on running >+ * architecture. >+ * >+ * This routine should be used in case we need to check if EEH is >+ * available in some situation, regardless if EEH is enabled or not. >+ * For example, if we hotplug-add a PCI device on a machine with no >+ * other PCI device, EEH won't be enabled, yet it's available if the >+ * arch supports it. >+ */ >+static inline bool eeh_available(void) >+{ >+ if (machine_is(pseries) || machine_is(powernv)) >+ return true; >+ return false; >+} >+
As I was told by somebody else before, the comments for static function needn't to be exported. >+/** > * eeh_add_device_early - Enable EEH for the indicated device node > * @pdn: PCI device node for which to set up EEH > * >@@ -1072,7 +1089,7 @@ void eeh_add_device_early(struct pci_dn *pdn) > struct pci_controller *phb; > struct eeh_dev *edev = pdn_to_eeh_dev(pdn); > >- if (!edev || !eeh_enabled()) >+ if (!edev || !eeh_available()) > return; > > if (!eeh_has_flag(EEH_PROBE_MODE_DEVTREE)) The change here seems not correct enough. Before the changes, eeh_add_device_early() does nothing if EEH is disabled on pSeries. With the changes applied, the EEH device (edev) will be scanned even EEH is disabled on pSeries. From the code changes, I didn't figure out the real problem you try to fix. Cell platform doesn't have flag EEH_PROBE_MODE_DEVTREE. So the function does nothing on Cell platform except calling into pdn_to_eeh_dev(). I'm not sure if the kernel crashed in pdn_to_eeh_dev() on Cell platform? Thanks, Gavin _______________________________________________ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev