On Fri, May 30, 2025 at 09:39:28PM +0530, Manivannan Sadhasivam wrote: > On Fri, May 30, 2025 at 06:34:04AM -0500, Bjorn Helgaas wrote: > > On Fri, May 30, 2025 at 09:16:59AM +0530, Manivannan Sadhasivam wrote: > > > On Wed, May 28, 2025 at 05:35:00PM -0500, Bjorn Helgaas wrote: > > > > On Thu, May 08, 2025 at 12:40:33PM +0530, Manivannan Sadhasivam wrote: > > > > > The PCI link, when down, needs to be recovered to bring it back. But > > > > > that > > > > > cannot be done in a generic way as link recovery procedure is > > > > > specific to > > > > > host bridges. So add a new API pci_host_handle_link_down() that could > > > > > be > > > > > called by the host bridge drivers when the link goes down. > > > > > > > > > > The API will iterate through all the slots and calls the > > > > > pcie_do_recovery() > > > > > function with 'pci_channel_io_frozen' as the state. This will result > > > > > in the > > > > > execution of the AER Fatal error handling code. Since the link down > > > > > recovery is pretty much the same as AER Fatal error handling, > > > > > pcie_do_recovery() helper is reused here. First the AER error_detected > > > > > callback will be triggered for the bridge and the downstream devices. > > > > > Then, > > > > > pci_host_reset_slot() will be called for the slot, which will reset > > > > > the > > > > > slot using 'reset_slot' callback to recover the link. Once that's > > > > > done, > > > > > resume message will be broadcasted to the bridge and the downstream > > > > > devices > > > > > indicating successful link recovery. > > > > > > > > Link down is an event for a single Root Port. Why would we iterate > > > > through all the Root Ports if the link went down for one of them? > > > > > > Because on the reference platform (Qcom), link down notification is > > > not per-port, but per controller. So that's why we are iterating > > > through all ports. The callback is supposed to identify the ports > > > that triggered the link down event and recover them. > > > > Maybe I'm missing something. Which callback identifies the port(s) > > that triggered the link down event? > > I was referring to the host_bridge::reset_root_port() callback that resets the > root ports. > > > I see that > > pci_host_handle_link_down() is called by > > rockchip_pcie_rc_sys_irq_thread() and qcom_pcie_global_irq_thread(), > > but I don't see the logic that identifies a particular Root Port. > > > > Per-controller notification of per-port events is a controller > > deficiency, not something inherent to PCIe. I don't think we should > > build common infrastructure that resets all the Root Ports just > > because one of them had an issue. > > Hmm, fair enough. > > > I think pci_host_handle_link_down() should take a Root Port, not a > > host bridge, and the controller driver should figure out which port > > needs to be recovered, or the controller driver can have its own loop > > to recover all of them if it can't figure out which one needs it. > > This should also work. Feel free to drop the relevant commits for > v6.16, I can resubmit them (including dw-rockchip after -rc1).
OK, I kept "PCI: host-common: Make the driver as a common library for host controller drivers" (renamed to "PCI: host-common: Convert to library for host controller drivers") on pci/controller/dw-rockchip for v6.16 and deferred the rest until later.