On Mon, Mar 6, 2017 at 7:52 AM, Koen Vandeputte
<koen.vandepu...@ncentric.com> wrote:
>
>
> On 2017-02-17 17:19, Koen Vandeputte wrote:
>>
>>
>>
>>> Koen,
>>>
>>> Can you try to disable MSI? I've seen issues with it in the past for
>>> IMX6 and I typically leave it disabled as it doesn't buy us anything
>>> and can instead hurt performance. If I recall, I think its now
>>> 'required' by the IMX6 PCIe driver so it may take a kernel change to
>>> disable it. Other than that, how does mainline 4.9 behave and what
>>> card/chipset are you using?
>>>
>>> Tim
>>
>>
>> Hi Tim,
>>
>> I will try with disabled MSI and let you know.
>> The earliest time I see in my planning is next week Friday.
>>
>> fyi, I'm testing on 3 different Ventana boards:
>>
>> - GW5100 (dualcore - single PCIe)
>> - GW5200 (dualcore - Dual PCIe)
>> - GW5410 (quadcore - 6x PCIe)
>>
>> All 3 boards utilize a single MiktroTik R11e-5HnD radio (AR 9300 based)
>>

Koen,

Sorry for the late reply - I keep getting diverted elsewhere.

When the IMX6 PCIe host controller uses MSI, legacy interrupts stop
working, and thus any card/driver that relies on legacy interrupts will
not have functioning interrupts. I'm not sure which cards/drivers
require legacy interrupts, but I know ath9k is one of them, and I just
verified that it currently doesn't get any interrupts on LEDE master
with 4.9.

The Linux 4.5 kernel enables PCI_MSI by default for imx_v6_v7_defconfig
(31e98e0d24cd2537a63e06e235e050a06b175df7) and the Linux 4.8 kernel
additionally requires PCI_MSI to be enabled for IMX6
(3ee803641e76bea76ec730c80dcc64739a9919ff). I'm discussing this
upstream as I don't think MSI should be enabled on IMX6.

You can check the ath9k interrupts (grep ath9k /proc/interrupts) to see
this - if you've got 0 interrupts after your radio is up and running
you've hit this issue.

You can do the following to hack out the requirement of MSI for the
IMX6 PCIe host controller, then disable CONFIG_PCI_MSI in the kernel
config:

diff --git a/drivers/pci/dwc/Kconfig b/drivers/pci/dwc/Kconfig
index dfb8a69..31cf8ad 100644
--- a/drivers/pci/dwc/Kconfig
+++ b/drivers/pci/dwc/Kconfig
@@ -6,7 +6,6 @@ config PCIE_DW
 config PCIE_DW_HOST
 	bool
 	depends on PCI
-	depends on PCI_MSI_IRQ_DOMAIN
 	select PCIE_DW
 
 config PCI_DRA7XX
@@ -45,7 +44,6 @@ config PCI_IMX6
 	bool "Freescale i.MX6 PCIe controller"
 	depends on PCI
 	depends on SOC_IMX6Q
-	depends on PCI_MSI_IRQ_DOMAIN
 	select PCIEPORTBUS
 	select PCIE_DW_HOST

>>
>> Other issues seen so far compared to kernel 4.4:
>> - A simple "reboot" doesn't work. UART output shows "Reboot failed" and
>> the board stalls. Powercycle is needed

This can occur on older revision boards, where the PMIC is not reset on
an IMX6 watchdog reset, when a watchdog reset (which is what a soft
reboot uses) occurs while the CPU is running above 800MHz.
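
One way to test whether CPU frequency is the trigger is to pin the CPU
with the userspace cpufreq governor before rebooting. The following is
only a rough sketch: pin_cpu_khz is a hypothetical helper name, it
assumes the kernel was built with CONFIG_CPU_FREQ_GOV_USERSPACE, and it
assumes cpu0's cpufreq directory is the one that matters on the board.

```shell
#!/bin/sh
# Sketch only: pin the CPU to a fixed frequency through the cpufreq
# sysfs interface. Needs root and the userspace governor built in.

# Directory holding the cpufreq policy files. The default path is an
# assumption about the board; it can be overridden from the environment.
CPUFREQ_DIR="${CPUFREQ_DIR:-/sys/devices/system/cpu/cpu0/cpufreq}"

# pin_cpu_khz KHZ (hypothetical helper, not an existing tool):
# select the userspace governor, then request KHZ as the fixed speed.
pin_cpu_khz() {
    khz="$1"
    [ -n "$khz" ] || { echo "usage: pin_cpu_khz KHZ" >&2; return 2; }
    echo userspace > "$CPUFREQ_DIR/scaling_governor" || return 1
    echo "$khz" > "$CPUFREQ_DIR/scaling_setspeed" || return 1
}
```

Usage would be something like "pin_cpu_khz 792000 && reboot". The
792000 there is an assumed value for the operating point nearest
800MHz; take the real value from scaling_available_frequencies in the
same directory.
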
Can you provide the serial number of the board you are seeing this on,
and verify that the issue does not occur if you force the CPU to 800MHz
(i.e. with the userspace cpufreq governor) prior to reset?

The work-around for this is to use the Gateworks System Controller
watchdog to restart the board, which does a full board power cycle, but
I haven't had time to get that driver mainlined yet (and thus have also
not submitted it to LEDE/OpenWrt).

>> - UART DMA disabled is required to avoid some boot errors (I've made a
>> custom backport from your upstream patch fixing this, but not submitted here
>> yet)

Which boot error specifically? I don't know that I've seen it, but I
can confirm that UART DMA needs to be disabled for RS485 to work (which
is a more obscure case), which is why I've done it on our kernels.
AFAIK there are still some issues upstream with IMX UART flow-control
and mctrl_gpio.

>>
>> General issues in kernels 4.4 & 4.9
>> - Even using the latest UBI FS sources + using the Sync option in bootarg,
>> files can get corrupted on a power cut. If the corrupted file is a boot
>> file .. :)

Can you point me to documentation on this bootarg? I'm not familiar
with it.

>>
>>
>>
>> Other than this it runs pretty stable :)
>>
> Tim,
>
> I found 1 more issue on 4.4 & 4.9 kernels:
>
> https://lists.debian.org/debian-arm/2016/02/msg00000.html
>
> I'm also seeing this on 4.4 kernel.
> It can take up to a few days before it triggers normally, but I have a setup
> running which reproduces this within a few hours.
>
> I've made a patch which increases the timeout in the FEC driver just for
> testing .. but it still occurs causing the port to be disabled suddenly.
>

I've seen reports of this as well, but usually it takes days of
activity if/before it happens. The MDIO timeout in FEC is currently
3ms - what did you increase it to, and are you certain it makes these
issues go away? Perhaps we need to start a discussion about this on
linux-net.
I'm not clear whether an MDIO read timeout should cause an interface to
go down (or whether some layer should retry). I'm also not clear why an
MDIO read would not complete in 3ms.

Tim

_______________________________________________
Lede-dev mailing list
Lede-dev@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/lede-dev