On Thu, Nov 9, 2017 at 5:21 PM, Kristian Evensen <kristian.even...@gmail.com> wrote:
> I have been hammering away on this issue during the day, and it seems
> that the DMA engine, TX, etc. works just fine. However, for some
> reason, the port with the router that has hung is able to stop the
> whole switch. If I disable the port (or disconnect the cable), then TX
> works again and I can for example reach 192.168.1.1 from 192.168.1.2
> in my testbed. When running ping (from 192.168.1.2 to 192.168.1.1)
> while disconnecting the cable, the first packets had a very high RTT
> (~20ms). Running tcpdump showed that the reply arrived immediately, so
> it seems the packets are stuck in a TX buffer for a really long time.
> Could it be that there is a cache or something internally on the
> switch that is causing packets to be held back, and that this cache is
> invalidated and buffers flushed when I disable the port? I cleared
> switch, DIP and SIP tables without any effect.
>
> If I enable the port, then the problem appears again after a little
> while (~30 seconds).
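For anyone who wants to compare stall timing across devices, here is a
minimal sketch of how the stall windows could be picked out of a ping
log. It is only an illustration: the function name find_stalls, the
10 ms threshold and the three-sample minimum are assumptions of mine,
not something taken from the driver or from any existing tool. Each
sample is a (timestamp in seconds, RTT in ms) pair, with None as the
RTT for a lost reply.

```python
from typing import List, Optional, Tuple

Sample = Tuple[float, Optional[float]]  # (timestamp_s, rtt_ms or None if lost)


def find_stalls(samples: List[Sample],
                rtt_threshold_ms: float = 10.0,
                min_len: int = 3) -> List[Tuple[float, float]]:
    """Return (start, end) timestamps of runs of bad ping samples.

    A sample is "bad" if the reply was lost (rtt is None) or its RTT
    exceeds rtt_threshold_ms. Only runs of at least min_len consecutive
    bad samples are reported. Threshold and length are hypothetical
    defaults chosen for illustration.
    """
    stalls = []
    run = []  # timestamps of the current run of bad samples
    for ts, rtt in samples:
        if rtt is None or rtt > rtt_threshold_ms:
            run.append(ts)
            continue
        # A good sample ends the current run, if there is one.
        if len(run) >= min_len:
            stalls.append((run[0], run[-1]))
        run = []
    # Handle a run that is still open at the end of the log.
    if len(run) >= min_len:
        stalls.append((run[0], run[-1]))
    return stalls
```

Feeding it one sample per second makes it easy to see whether the
interval between re-enabling the port and the next stall is stable from
one device to the next.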
I replaced the 3526 with other devices containing the mt7530 switch
(both mt7621 and mt7623-based boards), and the issue seems to be
related to the switch rather than the SoC. I am able to reliably
trigger the timeout on all devices I have tested, running both
proprietary drivers/firmware and LEDE. This suggests that some traffic
pattern or network behavior triggers an error in the MT7530 and causes
TX to freeze. Restarting the ports makes the switch work again, but as
long as the "bad" device is connected to the mt7530, it is just a
matter of time before the timeout returns.

-Kristian

_______________________________________________
Lede-dev mailing list
Lede-dev@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/lede-dev