Ok, thanks. I did that yesterday with tcpdump while trying to find the closing-early problem with the header lengths. Now, with the header length calculation fixed, things look good: at low speeds, the packet exchange matches what I am getting from a Linux web server. No retransmissions, resets, or missed ACKs. I will try a high-speed capture and examine it. I did try tcpblaster last night, and can only get about 100 KB/s maximum (about 800 kbps), which is much less than the 480 Mbps that's possible over USB. So that is probably part of the constraint. The SAMA5D36 has USB DMA, so it should be possible to get 100 Mbps or higher... I will try with the Gigabit Ethernet and see if I can find the bottleneck.
This might be useful to you: https://cwiki.apache.org/confluence/display/NUTTX/TCP+Network+Performance
TCP blaster can be set up for sending to or receiving from the device. Which are you doing?
I have verified NuttX TCP Tx performance with tcpblaster and found that it runs at about the maximum for the network I was using (I was using a slow, half-duplex network). You are probably not being limited by the USB transfer rate, but by the network transfer rate. Well, both should have an effect, but the network transfer rate (probably 100Mbit/sec) is a lot lower than the USB transfer rate.
100KB/s is 800Kbit/sec, which is considerably lower than the network transfer rate.
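To put those numbers side by side, here is a rough sketch of the arithmetic in Python. It assumes 1 KB = 1000 bytes (tcpblaster's reporting units may differ) and uses the nominal link rates mentioned in this thread, not achievable TCP payload rates. The function names are made up for illustration.

```python
# Illustrative arithmetic only. Assumes 1 KB = 1000 bytes; link rates are
# nominal signalling rates, not achievable TCP payload throughput.

def kbytes_per_sec_to_mbit(kb_per_s):
    """Convert a rate in KB/s to Mbit/s."""
    return kb_per_s * 1000 * 8 / 1_000_000


def link_utilisation(measured_mbit, link_mbit):
    """Fraction of the nominal link rate that the measured rate represents."""
    return measured_mbit / link_mbit


measured = kbytes_per_sec_to_mbit(100)  # the tcpblaster figure reported above

for name, link_mbit in (("100 Mbit/s network", 100.0),
                        ("USB high speed (480 Mbit/s)", 480.0)):
    share = link_utilisation(measured, link_mbit)
    print(f"{name}: {measured} Mbit/s measured, {share:.2%} of nominal")
```

Either way the measured rate is under 1% of the nominal link rate, so neither link rate on its own explains the 100 KB/s ceiling.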
Rx performance is more difficult to characterize.