Re: TCP fast retransmit issues

2017-08-17 Thread Jeremy Harris
On 28/07/17 08:27, Willy Tarreau wrote: > I didn't want to suggest names but since you did it first ;-) Indeed it's > mostly on the same device that I've been bothered a lot by their annoying > randomization. I used to know by memory the exact command to type to disable > it, but I don't anymore (s

Re: TCP fast retransmit issues

2017-07-31 Thread Neal Cardwell
On Fri, Jul 28, 2017 at 6:54 PM, Neal Cardwell wrote: > On Wed, Jul 26, 2017 at 3:02 PM, Neal Cardwell wrote: >> On Wed, Jul 26, 2017 at 2:38 PM, Neal Cardwell wrote: >>> Yeah, it looks like I can reproduce this issue with (1) bad sacks >>> causing repeated TLPs, and (2) TLP timers being pushed

Re: TCP fast retransmit issues

2017-07-28 Thread Neal Cardwell
On Wed, Jul 26, 2017 at 3:02 PM, Neal Cardwell wrote: > On Wed, Jul 26, 2017 at 2:38 PM, Neal Cardwell wrote: >> Yeah, it looks like I can reproduce this issue with (1) bad sacks >> causing repeated TLPs, and (2) TLP timers being pushed out to later >> times due to incoming data. Scripts are att

Re: TCP fast retransmit issues

2017-07-28 Thread Willy Tarreau
On Fri, Jul 28, 2017 at 08:36:49AM +0200, Klavs Klavsen wrote: > The network guys know what caused it. > > Apparently on (at least some) Cisco equipment the feature: > > TCP Sequence Number Randomization > > is enabled by default. I didn't want to suggest names but since you did it first ;-) In

Re: TCP fast retransmit issues

2017-07-27 Thread Christoph Paasch
Hello, On Wed, Jul 26, 2017 at 7:32 AM, Eric Dumazet wrote: > On Wed, 2017-07-26 at 15:42 +0200, Willy Tarreau wrote: >> On Wed, Jul 26, 2017 at 06:31:21AM -0700, Eric Dumazet wrote: >> > On Wed, 2017-07-26 at 14:18 +0200, Klavs Klavsen wrote: >> > > the 192.168.32.44 is a Centos 7 box. >> > >> >

Re: TCP fast retransmit issues

2017-07-27 Thread Klavs Klavsen
The network guys know what caused it. Apparently on (at least some) Cisco equipment the feature: TCP Sequence Number Randomization is enabled by default. It would most definitely be beneficial if Linux handled SACK "not working" better than it does - but then I might never have found the cu
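As a rough illustration of why sequence number randomization can make SACK "not work": a middlebox that shifts the sequence and acknowledgment numbers in the TCP header but leaves the SACK option untouched hands the sender SACK blocks from a different sequence space. The Python sketch below is a toy model under stated assumptions, not the behaviour of any specific device or of the Linux stack; the offset value, the function names, and the plausibility check are all hypothetical simplifications.

```python
MOD = 2 ** 32  # TCP sequence numbers wrap at 32 bits


def seq_between(low, value, high):
    """True if value lies in [low, high] in modulo-2^32 sequence space."""
    return (value - low) % MOD <= (high - low) % MOD


def rewrite_ack(ack, sack_blocks, offset, fix_sack):
    """Toy middlebox on the return path (hypothetical).

    Maps the receiver-side sequence space back to the sender's by
    subtracting the randomization offset. With fix_sack=False only the
    cumulative ACK is rewritten and the SACK blocks are left as-is.
    """
    new_ack = (ack - offset) % MOD
    if fix_sack:
        sack_blocks = [((left - offset) % MOD, (right - offset) % MOD)
                       for left, right in sack_blocks]
    return new_ack, list(sack_blocks)


def sack_plausible(block, snd_una, snd_nxt):
    """Simplified check: both SACK edges must fall inside the sender's
    outstanding window [snd_una, snd_nxt]."""
    left, right = block
    return (seq_between(snd_una, left, snd_nxt)
            and seq_between(snd_una, right, snd_nxt))


# Sender has bytes 1000..11000 outstanding; the offset is an arbitrary example.
OFFSET = 0x12345678
SND_UNA, SND_NXT = 1000, 11000

# Receiver (in the shifted space) has everything up to 2000 and also 3000..5000.
ack_on_wire = (2000 + OFFSET) % MOD
sack_on_wire = [((3000 + OFFSET) % MOD, (5000 + OFFSET) % MOD)]

good_ack, good_sack = rewrite_ack(ack_on_wire, sack_on_wire, OFFSET, fix_sack=True)
bad_ack, bad_sack = rewrite_ack(ack_on_wire, sack_on_wire, OFFSET, fix_sack=False)

print(good_ack, good_sack, sack_plausible(good_sack[0], SND_UNA, SND_NXT))
print(bad_ack, bad_sack, sack_plausible(bad_sack[0], SND_UNA, SND_NXT))
```

In both cases the cumulative ACK looks normal to the sender, but in the second case the SACK block points far outside the data it actually sent, so it carries no usable loss information.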

Re: TCP fast retransmit issues

2017-07-26 Thread Neal Cardwell
On Wed, Jul 26, 2017 at 2:38 PM, Neal Cardwell wrote: > Yeah, it looks like I can reproduce this issue with (1) bad sacks > causing repeated TLPs, and (2) TLP timers being pushed out to later > times due to incoming data. Scripts are attached. I'm testing a fix of only scheduling a TLP if (flag

Re: TCP fast retransmit issues

2017-07-26 Thread Neal Cardwell
On Wed, Jul 26, 2017 at 1:06 PM, Neal Cardwell wrote: > On Wed, Jul 26, 2017 at 12:43 PM, Neal Cardwell wrote: >> (2) It looks like there is a bug in the sender code where it seems to >> be repeatedly using a TLP timer firing 211ms after every ACK is >> received to transmit another TLP probe (a n

Re: TCP fast retransmit issues

2017-07-26 Thread Neal Cardwell
On Wed, Jul 26, 2017 at 12:43 PM, Neal Cardwell wrote: > (1) Because the connection negotiated SACK, the Linux TCP sender does > not get to its tcp_add_reno_sack() code to count dupacks and enter > fast recovery on the 3rd dupack. The sender keeps waiting for specific > packets to be SACKed that w
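For comparison, the dupack-counting path that is skipped when SACK is negotiated can be sketched in a few lines. This is a simplified Python model of classic three-duplicate-ACK fast retransmit, not the kernel's tcp_add_reno_sack() code; the function name and the threshold constant are illustrative assumptions, and sequence-number wraparound is ignored.

```python
DUPACK_THRESHOLD = 3  # classic fast-retransmit threshold


def fast_retransmit_index(acks):
    """Return the index of the ACK that would trigger fast retransmit
    under plain duplicate-ACK counting, or None if none does.

    `acks` is a list of cumulative ACK numbers in arrival order.
    """
    last_ack = None
    dupacks = 0
    for i, ack in enumerate(acks):
        if ack == last_ack:
            dupacks += 1
            if dupacks == DUPACK_THRESHOLD:
                return i
        else:
            # A new cumulative ACK resets the duplicate counter.
            last_ack = ack
            dupacks = 0
    return None


# One original ACK for 2000 followed by three duplicates.
trigger = fast_retransmit_index([1000, 2000, 2000, 2000, 2000])
print(trigger)
```

A sender that relies on SACK blocks instead of this counter has nothing to fall back on when those blocks are unusable, which matches the stall described in the message above.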

Re: TCP fast retransmit issues

2017-07-26 Thread Willy Tarreau
On Wed, Jul 26, 2017 at 07:32:12AM -0700, Eric Dumazet wrote: > On Wed, 2017-07-26 at 15:42 +0200, Willy Tarreau wrote: > > On Wed, Jul 26, 2017 at 06:31:21AM -0700, Eric Dumazet wrote: > > > On Wed, 2017-07-26 at 14:18 +0200, Klavs Klavsen wrote: > > > > the 192.168.32.44 is a Centos 7 box. > > >

Re: TCP fast retransmit issues

2017-07-26 Thread Willy Tarreau
On Wed, Jul 26, 2017 at 04:25:29PM +0200, Klavs Klavsen wrote: > Thank you very much guys for your insight.. it's highly appreciated. > > Next up for me is waiting till the network guys come back from summer > vacation, and convincing them to sniff on the devices in between to pinpoint > the culprit

Re: TCP fast retransmit issues

2017-07-26 Thread Eric Dumazet
On Wed, 2017-07-26 at 15:42 +0200, Willy Tarreau wrote: > On Wed, Jul 26, 2017 at 06:31:21AM -0700, Eric Dumazet wrote: > > On Wed, 2017-07-26 at 14:18 +0200, Klavs Klavsen wrote: > > > the 192.168.32.44 is a Centos 7 box. > > > > Could you grab a capture on this box, to see if the bogus packets a

Re: TCP fast retransmit issues

2017-07-26 Thread Willy Tarreau
On Wed, Jul 26, 2017 at 04:08:19PM +0200, Klavs Klavsen wrote: > Grabbed on both ends. > > http://blog.klavsen.info/fast-retransmit-problem-junos-linux (updated to new > dump - from client scp'ing) > http://blog.klavsen.info/fast-retransmit-problem-junos-linux-receiving-side > (receiving host) So

Re: TCP fast retransmit issues

2017-07-26 Thread Klavs Klavsen
Thank you very much guys for your insight.. it's highly appreciated. Next up for me is waiting till the network guys come back from summer vacation, and convincing them to sniff on the devices in between to pinpoint the culprit :) Willy Tarreau wrote on 2017-07-26 16:18: On Wed, Jul 26, 2017

Re: TCP fast retransmit issues

2017-07-26 Thread Klavs Klavsen
Grabbed on both ends. http://blog.klavsen.info/fast-retransmit-problem-junos-linux (updated to new dump - from client scp'ing) http://blog.klavsen.info/fast-retransmit-problem-junos-linux-receiving-side (receiving host) Eric Dumazet wrote on 2017-07-26 15:31: On Wed, 2017-07-26 at 14:18 +0

Re: TCP fast retransmit issues

2017-07-26 Thread Willy Tarreau
On Wed, Jul 26, 2017 at 06:31:21AM -0700, Eric Dumazet wrote: > On Wed, 2017-07-26 at 14:18 +0200, Klavs Klavsen wrote: > > the 192.168.32.44 is a Centos 7 box. > > Could you grab a capture on this box, to see if the bogus packets are > sent by it, or later mangled by a middle box ? Given the hug

Re: TCP fast retransmit issues

2017-07-26 Thread Eric Dumazet
On Wed, 2017-07-26 at 14:18 +0200, Klavs Klavsen wrote: > the 192.168.32.44 is a Centos 7 box. Could you grab a capture on this box, to see if the bogus packets are sent by it, or later mangled by a middle box ? > > Could you help me by elaborating on how to see why the "dup ack" (sack > blocks

Re: TCP fast retransmit issues

2017-07-26 Thread Klavs Klavsen
the 192.168.32.44 is a Centos 7 box. Could you help me by elaborating on how to see why the "dup ack" (sack blocks) are bogus? Thank you very much. I'll try to capture the same scp done on mac - and see if it also gets DUP ACKs - and how they look in comparison (since it works on Mac client

Re: TCP fast retransmit issues

2017-07-26 Thread Eric Dumazet
On Wed, 2017-07-26 at 13:07 +0200, Klavs Klavsen wrote: > Hi guys, > > Me and my colleagues have an annoying issue with our Linux desktops and > the company's Junos VPN. > > We connect with openconnect (some use the official Pulse client) - which > then opens up a tun0 device - and traffic runs