On Fri, 2017-11-03 at 11:13 -0400, Vitaly Davidovich wrote:
> Ok, an interesting finding. The client was originally running with
> SO_RCVBUF of 75K (apparently someone decided to set that for some
> unknown reason). I tried the test with a 1MB recv buffer and
> everything works perfectly! The client responds with 0 window alerts,
> the server just hits the persist condition and sends keep-alive
> probes; the client continues answering with a 0 window up until it
> wakes up and starts processing data in its receive buffer. At that
> point, the window opens up and the server sends more data. Basically,
> things look as one would expect in this situation :).
>
> /proc/sys/net/ipv4/tcp_rmem is 131072 1048576 20971520. The
> conversation flows normally, as described above, when I change the
> client's recv buf size to 1048576. I also tried 131072, but that
> doesn't work - same retrans/no ACKs situation.
>
> I think this eliminates (right?) any middleware from the equation.
> Instead, perhaps it's some bad interaction between a low recv buf size
> and either some other TCP setting or TSO mechanics (LRO specifically).
> Still investigating further.

Just in case, have you tried a more recent Linux kernel? I would rather
not spend time on a problem that might already be fixed.
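
When comparing the receive buffer sizes quoted above, it may help to check what the kernel actually applies for each request. On Linux, setsockopt(SO_RCVBUF) doubles the requested value to cover bookkeeping overhead and caps the request at net.core.rmem_max (not at the tcp_rmem maximum), so the size in effect can differ from the size asked for. A minimal sketch in Python, assuming "75K" means 75 * 1024 bytes; the helper name effective_rcvbuf is made up for illustration:

```python
import socket


def effective_rcvbuf(requested):
    """Request SO_RCVBUF on a fresh TCP socket and return the size the kernel reports.

    Hypothetical helper for illustration only. No connection is made.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, requested)
        return s.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)


if __name__ == "__main__":
    # Sizes mentioned in the thread; 75 * 1024 is an assumed reading of "75K".
    for requested in (75 * 1024, 131072, 1048576):
        print(requested, "->", effective_rcvbuf(requested))
```

If the reported value for the 1048576 request is well below twice that number, the request was probably clamped by net.core.rmem_max, which would be worth knowing before comparing the runs.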