On Sun, 6 Dec 2020 10:20:41 +0100 Willy Tarreau wrote: > Hi Jakub, > > Two months ago I implemented a small change in the macb driver to > support the two Tx descriptors that AT91RM9200 supports. I implemented > this using the only compatible device I had which is the MSC313E-based > Breadbee board. Since then I've met situations where the chip would stop > sending, mostly when doing bidirectional traffic. I've spent a few week- > ends on this already trying various things including blocking interrupts > on a larger place, switching to the 32-bit register access on the MSC313E > (previous code was using two 16-bit accesses and I thought it was the > cause), and tracing status registers along the various operations. Each > time the pattern looks similar, a write when the chips declares having > room results in an overrun, but the preceeding condition doesn't leave > any info suggesting this may happen. > > Sadly we don't have the datasheet for this SoC, what is visible is that it > supports AT91RM9200's tx mode and that it works well when there's a single > descriptor in flight. In this case it complies with AT91RM9200's datasheet. > The chip reports other undocumented bits in its status registers, that > cannot even be correlated to the issue by the way. I couldn't spot anything > wrong there in my changes, even after doing heavy changes to progress on > debugging, and the SoC's behavior reporting an overrun after a single write > when there's room contradicts the RM9200's datasheet. In addition we know > the chip also supports other descriptor-based modes, so it's very possible > it doesn't perfectly implement the RM9200's 2-desc mode and that my change > is still fine. > > Yesterday I hope to get my old AT91SAM9G20 board to execute this code and > test it, but it clearly does not work when I remap the init and tx functions, > which indicates that it really does not implement a hidden compatibility > mode with the old chip. > > Thus at this point I have no way to make sure that the original SoC for > which I changed the code still works fine or if I broke it. As such, I'd > feel more comfortable if we'd revert my patch until someone has the > opportunity to test it on the original hardware (Alexandre suggested he > might, but later). > > The commit in question is the following one: > > 0a4e9ce17ba7 ("macb: support the two tx descriptors on at91rm9200") > > If the maintainers prefer that we wait for an issue to be reported before > reverting it, that's fine for me as well. What's important to me is that > this potential issue is identified so as not to waste someone else's time > on it.
Thanks for the report, I remember that one. In hindsight maybe we should have punted it to 5.11... Let's revert ASAP, 5.10 is going to be LTS, people will definitely notice. Would you mind sending a revert patch with the explanation in the commit message?