Alexey Kardashevskiy <a...@ozlabs.ru> writes: > The current implementation of the OPAL_PCI_EEH_FREEZE_STATUS call in > skiboot's NPU driver does not touch the pci_error_type parameter so > it might have garbage but the powernv code analyzes it nevertheless. > > This initializes pcierr and fstate to zero in all call sites. > > Signed-off-by: Alexey Kardashevskiy <a...@ozlabs.ru> > ---
Can we tag this with a Fixes? And seems like it should probably go to stable, or can we not trigger this path on older kernels? cheers > Without this, this happens: > > pnv_eeh_get_phb_diag: Failure -7 getting PHB#6 diag-data > EEH: PHB#6 failure detected, location: N/A > CPU: 23 PID: 5939 Comm: qemu-system-ppc Not tainted > 4.19.0-le_f5a7bb7_aikATfstn1-p1 torvalds#106 > Call Trace: > [c000003fea9df9c0] [c000000000a990ec] dump_stack+0xb0/0xf4 (unreliable) > [c000003fea9dfa00] [c000000000038480] eeh_dev_check_failure+0x1f0/0x5f0 > [c000003fea9dfaa0] [c0000000000a2768] pnv_pci_read_config+0x128/0x160 > [c000003fea9dfae0] [c0000000005d2b0c] pci_bus_read_config_dword+0x9c/0xf0 > [c000003fea9dfb40] [c0000000005df3d4] pci_save_state+0x64/0x250 > [c000003fea9dfbc0] [c0000000005e0730] pci_dev_save_and_disable+0x70/0xa0 > [c000003fea9dfbf0] [c0000000005e4078] pci_try_reset_function+0x48/0xc0 > [c000003fea9dfc20] [c00800001cbc1b1c] vfio_pci_ioctl+0x334/0xea0 [vfio_pci] > [c000003fea9dfcf0] [c00800001ca9046c] vfio_device_fops_unl_ioctl+0x44/0x70 > [vfio] > [c000003fea9dfd10] [c00000000039fd84] do_vfs_ioctl+0xd4/0xa00 > [c000003fea9dfdb0] [c0000000003a07b4] ksys_ioctl+0x104/0x120 > [c000003fea9dfe00] [c0000000003a07f8] sys_ioctl+0x28/0x80 > [c000003fea9dfe20] [c00000000000b3a4] system_call+0x5c/0x70 > EEH: Detected error on PHB#6 > EEH: This PCI device has failed 1 times in the last hour and will be > permanently disabled after 5 fail > ures. > EEH: Notify device drivers to shutdown > EEH: Beginning: 'error_detected(IO frozen)' > EEH: PE#d (PCI 0006:00:00.0): not actionable (1,1,0) > EEH: PE#d (PCI 0006:00:00.1): not actionable (1,1,0) > EEH: PE#c (PCI 0006:00:01.0): Invoking vfio-pci->error_detected(IO frozen) > EEH: PE#c (PCI 0006:00:01.0): vfio-pci driver reports: 'can recover' > EEH: PE#c (PCI 0006:00:01.1): Invoking vfio-pci->error_detected(IO frozen) > EEH: PE#c (PCI 0006:00:01.1): vfio-pci driver reports: 'can recover' > EEH: PE#b (PCI 0006:00:02.0): Invoking vfio-pci->error_detected(IO frozen) > EEH: PE#b (PCI 0006:00:02.0): vfio-pci driver reports: 'can recover' > EEH: PE#b (PCI 0006:00:02.1): Invoking vfio-pci->error_detected(IO frozen) > EEH: PE#b (PCI 0006:00:02.1): vfio-pci driver reports: 'can recover' > EEH: Finished:'error_detected(IO frozen)' with aggregate recovery state:'can > recover' > EEH: Collect temporary log > pnv_pci_dump_phb_diag_data: Unrecognized ioType 0 > EEH: Reset without hotplug activity > iommu: Removing device 0006:00:01.0 from group 4 > iommu: Removing device 0006:00:01.1 from group 4 > iommu: Removing device 0006:00:02.0 from group 4 > iommu: Removing device 0006:00:02.1 from group 4 > pnv_ioda_freeze_pe: Failure -7 freezing PHB#6-PE#0 > pnv_eeh_restore_config: Can't reinit PCI dev 0x0 (-7) > pnv_eeh_restore_config: Can't reinit PCI dev 0x1 (-7) > pnv_eeh_restore_config: Can't reinit PCI dev 0x8 (-7) > pnv_eeh_restore_config: Can't reinit PCI dev 0x9 (-7) > pnv_eeh_restore_config: Can't reinit PCI dev 0x10 (-7) > pnv_eeh_restore_config: Can't reinit PCI dev 0x11 (-7) > pnv_eeh_restore_config: Can't reinit PCI dev 0x0 (-7) > pnv_eeh_restore_config: Can't reinit PCI dev 0x1 (-7) > pnv_eeh_restore_config: Can't reinit PCI dev 0x8 (-7) > pnv_eeh_restore_config: Can't reinit PCI dev 0x9 (-7) > pnv_eeh_restore_config: Can't reinit PCI dev 0x10 (-7) > pnv_eeh_restore_config: Can't reinit PCI dev 0x11 (-7) > EEH: Sleep 5s ahead of partial hotplug > pci 0004:04 : [PE# 00] Setting up window#0 0..3fffffff pg=1000 > pci 0004:05 : [PE# 18] Setting up window#0 0..3fffffff pg=1000 > pci 0004:06 : [PE# 30] Setting up window#0 0..3fffffff pg=1000 > pci 0006:00:00.0: [PE# 0d] Setting up window 0..3fffffff pg=1000 > pci 0006:00:01.0: [PE# 0c] Setting up window 0..3fffffff pg=1000 > pci 0006:00:02.0: [PE# 0b] Setting up window 0..3fffffff pg=1000 > EEH: Beginning: 'slot_reset' > EEH: PE#d (PCI 0006:00:00.0): not actionable (1,1,0) > EEH: PE#d (PCI 0006:00:00.1): not actionable (1,1,0) > EEH: Finished:'slot_reset' with aggregate recovery state:'none' > EEH: Notify device driver to resume > EEH: Beginning: 'resume' > EEH: PE#d (PCI 0006:00:00.0): not actionable (1,1,0) > EEH: PE#d (PCI 0006:00:00.1): not actionable (1,1,0) > EEH: Finished:'resume' > EEH: Recovery successful. > --- > arch/powerpc/platforms/powernv/eeh-powernv.c | 8 ++++---- > arch/powerpc/platforms/powernv/pci-ioda.c | 4 ++-- > arch/powerpc/platforms/powernv/pci.c | 4 ++-- > 3 files changed, 8 insertions(+), 8 deletions(-) > > diff --git a/arch/powerpc/platforms/powernv/eeh-powernv.c > b/arch/powerpc/platforms/powernv/eeh-powernv.c > index abc0be7..f380789 100644 > --- a/arch/powerpc/platforms/powernv/eeh-powernv.c > +++ b/arch/powerpc/platforms/powernv/eeh-powernv.c > @@ -564,8 +564,8 @@ static void pnv_eeh_get_phb_diag(struct eeh_pe *pe) > static int pnv_eeh_get_phb_state(struct eeh_pe *pe) > { > struct pnv_phb *phb = pe->phb->private_data; > - u8 fstate; > - __be16 pcierr; > + u8 fstate = 0; > + __be16 pcierr = 0; > s64 rc; > int result = 0; > > @@ -603,8 +603,8 @@ static int pnv_eeh_get_phb_state(struct eeh_pe *pe) > static int pnv_eeh_get_pe_state(struct eeh_pe *pe) > { > struct pnv_phb *phb = pe->phb->private_data; > - u8 fstate; > - __be16 pcierr; > + u8 fstate = 0; > + __be16 pcierr = 0; > s64 rc; > int result; > > diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c > b/arch/powerpc/platforms/powernv/pci-ioda.c > index dd80744..72b5cc0 100644 > --- a/arch/powerpc/platforms/powernv/pci-ioda.c > +++ b/arch/powerpc/platforms/powernv/pci-ioda.c > @@ -604,8 +604,8 @@ static int pnv_ioda_unfreeze_pe(struct pnv_phb *phb, int > pe_no, int opt) > static int pnv_ioda_get_pe_state(struct pnv_phb *phb, int pe_no) > { > struct pnv_ioda_pe *slave, *pe; > - u8 fstate, state; > - __be16 pcierr; > + u8 fstate = 0, state; > + __be16 pcierr = 0; > s64 rc; > > /* Sanity check on PE number */ > diff --git a/arch/powerpc/platforms/powernv/pci.c > b/arch/powerpc/platforms/powernv/pci.c > index 13aef23..db230a35 100644 > --- a/arch/powerpc/platforms/powernv/pci.c > +++ b/arch/powerpc/platforms/powernv/pci.c > @@ -602,8 +602,8 @@ static void pnv_pci_handle_eeh_config(struct pnv_phb > *phb, u32 pe_no) > static void pnv_pci_config_check_eeh(struct pci_dn *pdn) > { > struct pnv_phb *phb = pdn->phb->private_data; > - u8 fstate; > - __be16 pcierr; > + u8 fstate = 0; > + __be16 pcierr = 0; > unsigned int pe_no; > s64 rc; > > -- > 2.17.1