I turned the debugging output on – the stuff Greg showed me in the email
thread on simulator networking. So I have more data.

What appears to be happening is that the GMAC driver's tx timeout fires,
and the GMAC is restarted:

[  528.350000] sam_txtimeout_work: ERROR: Timeout!
[  528.350000] sam_ifdown: Taking the network down
[  528.350000] sam_ifup: Bringing up: 10.0.0.2
[  528.360000] sam_ifup: Initialize the GMAC

Then once that happens, all the remaining packets are transmitted. My
tcpecho client program receives them ok. This takes a few minutes. I guess
when I said "it hangs" I didn't wait long enough.
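
To tell a slow echo from a true hang, the test client needs an explicit deadline. Below is a minimal sketch of such a probe, not the actual custom client used in these tests. The names probe_echo and largest_working_size are hypothetical. The search also assumes that every size below the boundary works and every size above it fails.

```python
import socket
import time


def probe_echo(host, port, size, timeout_s=10.0):
    """Send `size` bytes with one sendall() and wait for the full echo.

    Returns the elapsed time in seconds, or None if the echo did not
    complete (timeout, connection error, early close, or data mismatch).
    """
    payload = bytes(i % 251 for i in range(size))
    start = time.monotonic()
    deadline = start + timeout_s
    received = bytearray()
    try:
        with socket.create_connection((host, port), timeout=timeout_s) as sock:
            sock.sendall(payload)
            while len(received) < size:
                remaining = deadline - time.monotonic()
                if remaining <= 0:
                    return None  # deadline passed: treat as a hang
                sock.settimeout(remaining)
                chunk = sock.recv(4096)
                if not chunk:
                    return None  # peer closed before echoing everything
                received.extend(chunk)
    except OSError:
        # Covers socket timeouts and connection failures.
        return None
    if bytes(received) != payload:
        return None
    return time.monotonic() - start


def largest_working_size(host, port, lo, hi, timeout_s=10.0):
    """Binary search for the largest size in [lo, hi] whose echo completes.

    Assumption: size `lo` works, and failures are monotonic in size.
    """
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if probe_echo(host, port, mid, timeout_s) is not None:
            lo = mid
        else:
            hi = mid - 1
    return lo
```

Calling probe_echo with the board's address (10.0.0.2 in the log above), whatever port tcpecho listens on, and a timeout of several minutes would show whether a large send eventually completes after the GMAC restart or never does.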

So now to figure out why the tx timeout is firing...

-adam

On Sun, Feb 9, 2020 at 10:26 PM Adam Feuer <a...@starcat.io> wrote:

> Just to follow up with some more precise data, I just did a few more tests.
>
> On commit c1f75af084010bd8a13b2481abc38d848cd545f2 I can do any number of
> TCP sends of 1446 bytes or fewer. Sends of 1447 bytes or more hang.
>
> -adam
>
> On Sun, Feb 9, 2020 at 9:14 PM Adam Feuer <a...@starcat.io> wrote:
>
>> Greg,
>>
>> I have write buffering enabled:
>> CONFIG_NET_WRITE_BUFFERS=y
>>
>> Here are my IOB settings:
>> CONFIG_MM_IOB=y
>> CONFIG_IOB_NBUFFERS=24
>> CONFIG_IOB_BUFSIZE=196
>> CONFIG_IOB_NCHAINS=24
>> CONFIG_IOB_THROTTLE=0
>>
>> I do have some more data. I did a manual git bisect to find out where
>> things started getting worse. I went back to Jan 1 and tested various
>> commits with a custom program that talks to tcpecho (binary search to zero
>> in on where the problems are).
>>
>> Commits before and after, up to the current tip, hang when I try to do
>> consecutive TCP sends. Since Jan 1, most of the commits can't do more than
>> 10 consecutive TCP sends of 1000 characters, which is a lot less than the
>> MSS of 1447.
>>
>> However, commit c1f75af084010bd8a13b2481abc38d848cd545f2 seems to work
>> the best. With that commit, as long as CONFIG_TIME_EXTENDED is not set, I
>> can send as many 1400 character TCP sends as I want... thousands in a row.
>> However, if I set CONFIG_TIME_EXTENDED=y, things stop working: I can only
>> do a few TCP sends before tcpecho hangs.
>>
>> Note that through all this, if I try to send more than about 1440
>> characters in a single TCP send, tcpecho will hang. That isn't right but I
>> haven't tracked that problem down either.
>>
>> I tried using a debugger and syslog to trace what difference
>> CONFIG_TIME_EXTENDED makes, that is, why turning it on causes the hang,
>> but I haven't figured that out yet. Do you have any ideas?
>>
>> Re: Wireshark, I have been using tcpdump extensively. The linux side
>> seems to send the first TCP packet ok. tcpecho can echo back if
>> CONFIG_TIME_EXTENDED is not set and the MSS is 1400 bytes or less. If MSS
>> is larger than 1447 tcpecho will send a few packets and then hang. I
>> haven't characterized the boundary (1400-1448 bytes). I also just figured
>> out the CONFIG_TIME_EXTENDED thing this afternoon so haven't looked at
>> tcpdump traces of it yet, but I will and report back.
>>
>> I can send some tcpdump traces if that would help. Would it?
>>
>> -adam
>>
>> On Sun, Feb 9, 2020 at 5:21 AM Gregory Nutt <spudan...@gmail.com> wrote:
>>
>>>
>>> > My question about MSS wasn't clear. What I mean is, to an application
>>> > (like tcpecho or tcpblaster), the size of MSS shouldn't matter. Shouldn't
>>> > the application be able to send as many bytes as it wants? The TCP stack
>>> > divides the stream up into chunks (of MSS length) and transmits them...
>>> > sending more than MSS bytes should not hang the application...?
>>> Yes, that is correct.
>>> > That's not what I'm seeing with the NuttX that I'm using
>>> > (SAMA5D36-Xplained using the gigabit ethernet port, built from the
>>> > latest master).
>>> Do you have write buffering enabled?  Do you have enough IOBs
>>> pre-allocated?  Have you tried looking at the traffic with WireShark?
>>>
>>>
>>>
>>
>> --
>> Adam Feuer <a...@starcat.io>
>>
>
>
> --
> Adam Feuer <a...@starcat.io>
>


-- 
Adam Feuer <a...@starcat.io>
