On Tue, Aug 01, 2017 at 03:11:52AM -0400, Keith Busch wrote: > We've encountered a particular platform that under some circumstances > always has the power fault detected status raised. The pciehp irq handler > would loop forever because it thinks it is handling new events when in > fact the power fault is not new. This patch fixes that by masking off > the power fault status from new events if the driver hasn't seen the > power fault clear from the previous handling attempt.
Can you say which platform this is? If this is a hardware defect, it'd be interesting to know where it happens. But I'm not sure we handle PCI_EXP_SLTSTA correctly. We basically have this: pciehp_isr() { pcie_capability_read_word(pdev, PCI_EXP_SLTSTA, &status); events = status & (<events we care about>); pcie_capability_write_word(pdev, PCI_EXP_SLTSTA, events); <queue event handling> } The write to PCI_EXP_SLTSTA clears PCI_EXP_SLTSTA_PFD because it's RW1C. But we haven't done anything that would actually change the situation that caused a power fault, so I don't think it would be surprising if the hardware immediately reasserted it. So maybe this continual assertion of power fault is really a software bug, not a hardware problem? > Fixes: fad214b0aa72 ("PCI: pciehp: Process all hotplug events before looking > for new ones") > > Cc: <sta...@vger.kernel.org> # 4.9+ > Cc: Mayurkumar Patel <mayurkumar.pa...@intel.com> > Signed-off-by: Keith Busch <keith.bu...@intel.com> > --- > Resending due to send-email setup error; this patch may appear twice > for some. > > drivers/pci/hotplug/pciehp_hpc.c | 4 +++- > 1 file changed, 3 insertions(+), 1 deletion(-) > > diff --git a/drivers/pci/hotplug/pciehp_hpc.c > b/drivers/pci/hotplug/pciehp_hpc.c > index 026830a..8ecbc13 100644 > --- a/drivers/pci/hotplug/pciehp_hpc.c > +++ b/drivers/pci/hotplug/pciehp_hpc.c > @@ -583,7 +583,9 @@ static irqreturn_t pciehp_isr(int irq, void *dev_id) > * Slot Status contains plain status bits as well as event > * notification bits; right now we only want the event bits. > */ > - events = status & (PCI_EXP_SLTSTA_ABP | PCI_EXP_SLTSTA_PFD | > + events = status & (PCI_EXP_SLTSTA_ABP | > + (ctrl->power_fault_detected ? > + 0 : PCI_EXP_SLTSTA_PFD) | > PCI_EXP_SLTSTA_PDC | PCI_EXP_SLTSTA_CC | > PCI_EXP_SLTSTA_DLLSC); > if (!events) > -- > 2.5.5 >