I'm stuck. I did find and fix a bug (a typo) in the GMAC where the
txtimeout would fire every 60 seconds, reinitializing the GMAC driver no
matter what; it was never reset properly. But I am having trouble making
further progress. Here's a summary of what I know:

   - I'm using the tcpecho example and a custom Python client. I also have
   udpecho and a custom Python client for it.
   - With net debug logging turned on, there are no problems; I can send as
   much data for as long as I want, though very slowly due to the logging.
   - With the logging turned off, or minimal custom logging, if I keep send
   sizes smaller than one ethernet packet, I can send as much data as I want,
   as long as I don't go too fast.
   - With the logging turned off, or minimal custom logging, if I send data
   quickly, or send data that's larger than one ethernet packet, the following
   will happen:
      - every so often, the GMAC DMA will send two consecutive packets at
      once, instead of one at a time. It shouldn't do that, but it does.
      Without a fix, the driver leaks txbuffers because the second packet
      never gets given back to the GMAC hardware. I wrote some code to
      catch that case and handle it correctly so it doesn't leak
      txbuffers. But then...
      - eventually, TCP receive and transmit slow to a crawl, and then the
      NuttX box stops responding to requests (including ICMP pings)
      - in this state, when I print out debugging info on the rxbuffers,
      they are all owned by the software, and there are no buffers free to
      receive data from the GMAC hardware. Things seem fine until very
      close to the end, when suddenly there are no free rxbuffers.
      - udpecho will fail the same way if given enough data very quickly
   - tcpblaster will immediately hang the NuttX box the same way
   - The SAMA5D36 EMAC driver appears to have the same problems; I tested
   this with the 10/100 EMAC too.
   - tcpblaster and tcpecho perform flawlessly in the NuttX simulator
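
For anyone wanting to reproduce the large-send case, here is a minimal
sketch of a Python echo client along those lines. It is not the actual
client used above: the names recv_exact and echo_stress, the default
payload size, and the timeout are all illustrative assumptions. It sends
payloads larger than one Ethernet packet back to back and verifies each
echo, so a stall should show up as a socket timeout rather than a silent
hang.

```python
import socket


def recv_exact(sock, nbytes):
    """Read exactly nbytes from sock, or raise if the peer closes early."""
    chunks = []
    remaining = nbytes
    while remaining > 0:
        chunk = sock.recv(min(remaining, 65536))
        if not chunk:
            raise ConnectionError(
                "peer closed with %d bytes outstanding" % remaining)
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def echo_stress(host, port, payload_size=3000, count=1000, timeout=5.0):
    """Send `count` payloads and verify each echo.

    payload_size defaults to 3000 bytes (an assumed value) so that every
    send spans more than one Ethernet packet. Returns the number of
    verified round trips. A stalled peer raises a socket timeout.
    """
    # Non-repeating-per-byte pattern so shifted or dropped data is detected.
    payload = bytes(i % 251 for i in range(payload_size))
    verified = 0
    with socket.create_connection((host, port), timeout=timeout) as sock:
        for _ in range(count):
            sock.sendall(payload)
            echoed = recv_exact(sock, payload_size)
            if echoed != payload:
                raise ValueError(
                    "echo mismatch after %d good round trips" % verified)
            verified += 1
    return verified
```

Calling echo_stress(host, port) with the address and port of the board
running tcpecho (both depend on the local setup) returns the number of
verified round trips. Lowering payload_size below one packet gives the
small-send case for comparison.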

Does anyone have ideas on how to debug this problem? Or have ideas about
any questions I should be asking?

cheers
adam

On Thu, Feb 13, 2020 at 6:37 PM Adam Feuer <a...@starcat.io> wrote:

> I tried using the EMAC 10/100 ethernet just now with the KSZ8081 PHY using
> the latest nuttx master, with an updated config. It has the same problem
> the GMAC does: it hangs when I try to do TCP sends of more than 1446 bytes.
>
> I'm going to look into the DMA, MAC, and PHY stuff tomorrow. I am hoping
> that will be easier now that I can actually debug the tcpecho thread with
> OpenOCD's thread-awareness.
>
> -adam
>
> On Wed, Feb 12, 2020 at 7:29 PM Adam Feuer <a...@starcat.io> wrote:
>
>> Greg,
>>
>> Yeah I get what you are saying about the driver being quite well-used.
>> This is a strange problem. Thank you for the ideas about the MAC
>> configuration, the DMA configuration, and the PHY setup. I will look into
>> those. I didn't even consider that the PHY could be part of the problem,
>> but it is a complex little chip...
>>
>> I'll look into it more tomorrow and see what I can find out.
>>
>> cheers
>> adam
>>
>>
>> On Wed, Feb 12, 2020 at 7:03 PM Gregory Nutt <spudan...@gmail.com> wrote:
>>
>>>
>>> > The nuttx simulator running tcpecho doesn't appear to have any problem
>>> > with large TCP sends or a large number of large TCP sends. So the
>>> > problem with tcpecho / tcpblaster on the SAMA5D36 seems to be in the
>>> > SAMA5 code somewhere.
>>> >
>>> > The GMAC driver fix that I have only works when net logging is turned
>>> > on, and I tested it more thoroughly today. It definitely works. But I
>>> > don't understand how it works, or why it doesn't work at high speeds.
>>>
>>> If the TX packets are going into DMA memory but are never sent, then
>>> there could only be a few things wrong:  The MAC configuration, the DMA
>>> configuration, or the PHY setup.  I would tend to think that the PHY
>>> setup would be the most suspicious. That driver (in its various forms on
>>> different Atmel parts) has been very well exercised over many years.
>>>
>>> That doesn't mean it is error free, just not the first place I would
>>> look.
>>>
>>>
>>
>> --
>> Adam Feuer <a...@starcat.io>
>>
>
>
> --
> Adam Feuer <a...@starcat.io>
>


-- 
Adam Feuer <a...@starcat.io>
