On Tue, 2015-05-26 at 23:23 -0700, Eric Dumazet wrote: > On Tue, 2015-05-26 at 23:12 -0700, Eric Dumazet wrote: > > On Tue, 2015-05-26 at 22:47 -0700, Gopakumar Choorakkot Edakkunni wrote: > > > All, > > > > > > The original query I had posted is here : > > > http://stackoverflow.com/questions/30414350/tcp-syn-ack-tsecr-not-matching-tsval-in-syn > > > .. The summary is that once in a while, the TSval in SYN is not what > > > is getting echoed in TSecr, and looks like something on amazon aws > > > side is very strict about that and drops those packets. Any clues on > > > this - whether its a known issue/fixed elsewhere etc. would be of > > > great help. > > > > I guess that if you send SYN packets 3 times as your email did on > > netdev, that might cause some issues... > > > > More seriously, server has a SYN_RECV socket with same tuple, because of > > a SYN sent earlier : > > > > 8:36:00.593136 IP XX.YY.ZZ.VV.24548 > AA.BB.CC.DD.443: Flags [S], seq > > 1204544933, win 29200, options [mss 1320,sackOK,TS val 6032576 ecr > > 0,nop,wscale 7], length 0 > > > > 18:36:00.593171 IP AA.BB.CC.DD.443 > XX.YY.ZZ.VV.24548: Flags [S.], seq > > 986069863, ack 1204544934, win 14480, options [mss 1460,sackOK,TS val > > 180940028 ecr 6001497,nop,wscale 5], length 0 > > > > 18:36:00.992699 IP AA.BB.CC.DD.443 > XX.YY.ZZ.VV.24548: Flags [S.], seq > > 986069863, ack 1204544934, win 14480, options [mss 1460,sackOK,TS val > > 180940128 ecr 6001497,nop,wscale 5], length 0 > > > > > > From these traces, we can guess a SYN packet was sent about 31 seconds > > earlier. > > > > SYNACK rtx do not update the TSECR : Initial SYN TSval value (6001497) > > is mirrored. > > > > Are you establishing many active sessions per minute to this particular > > target ? > > > > Here is a packetdrill test to demonstrate behavior : > > // Test that SYNACK rtx tsecr is not changed (original SYN tsval) > > `../common/defaults.sh > ` > > // Create a socket. > 0.000 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3 > 0.000 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0 > > 0.000 bind(3, ..., ...) = 0 > 0.000 listen(3, 1) = 0 > > // Establish a connection. > 0.100 < S 0:0(0) win 20000 <mss 1000,sackOK,TS val 100 ecr 0> > +0 > S. 0:0(0) ack 1 <mss 1460,sackOK,TS val 100 ecr 100> > > +0.100 < S 0:0(0) win 20000 <mss 1000,sackOK,TS val 199 ecr 0> > // check rtx tsecr is sill 100, not 199 > +0 > S. 0:0(0) ack 1 <mss 1460,sackOK,TS val 200 ecr 100> > > +0.100 < . 1:1(0) ack 1 win 20000 <nop,nop,TS val 300 ecr 200> > +0 accept(3, ..., ...) = 4 >
Reading again https://tools.ietf.org/html/rfc7323#section-4.3 I am not sure we do the right thing here. :( Yuchung, Neal, what do you think ? If we receive a SYN matching an existing SYN_RECV, should we update TS.recent with latest SYN.tsval ? I see nothing wrong with that. Patch would be : diff --git a/net/ipv4/tcp_minisocks.c b/net/ipv4/tcp_minisocks.c index df7fe3c31162e77b96f81399ef7d893485ab3d91..70db6572d241e132c28c381dfc1155b150c9557b 100644 --- a/net/ipv4/tcp_minisocks.c +++ b/net/ipv4/tcp_minisocks.c @@ -588,6 +588,9 @@ struct sock *tcp_check_req(struct sock *sk, struct sk_buff *skb, if (TCP_SKB_CB(skb)->seq == tcp_rsk(req)->rcv_isn && flg == TCP_FLAG_SYN && !paws_reject) { + if (tmp_opt.saw_tstamp && + after(tmp_opt.rcv_tsval, req->ts_recent)) + req->ts_recent = tmp_opt.rcv_tsval; /* * RFC793 draws (Incorrectly! It was fixed in RFC1122) * this case on figure 6 and figure 8, but formal And updated packetdrill test would be : // Test that SYNACK rtx tsecr is not changed (original SYN tsval) `../common/defaults.sh ` // Create a socket. 0.000 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3 0.000 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0 0.000 bind(3, ..., ...) = 0 0.000 listen(3, 1) = 0 // Establish a connection. 0.100 < S 0:0(0) win 20000 <mss 1000,sackOK,TS val 100 ecr 0> +0 > S. 0:0(0) ack 1 <mss 1460,sackOK,TS val 100 ecr 100> +0.100 < S 0:0(0) win 20000 <mss 1000,sackOK,TS val 199 ecr 0> // check rtx tsecr is 199, not 100 +0 > S. 0:0(0) ack 1 <mss 1460,sackOK,TS val 200 ecr 199> +0.100 < . 1:1(0) ack 1 win 20000 <nop,nop,TS val 300 ecr 200> +0 accept(3, ..., ...) = 4 -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html