On 4/29/19 3:46 PM, Yuchung Cheng wrote:
> Linux implements RFC6298 and use an initial congestion window
> of 1 upon establishing the connection if the SYNACK packet is
> retransmitted 2 or more times. In cellular networks SYNACK timeouts
> are often spurious if the wireless radio was dormant or idle. Also
> some network path is longer than the default SYNACK timeout. In
> both cases falsely starting with a minimal cwnd are detrimental
> to performance.
>
> This patch avoids doing so when the final ACK's TCP timestamp
> indicates the original SYNACK was delivered. It remembers the
> original SYNACK timestamp when SYNACK timeout has occurred and
> re-uses the function to detect spurious SYN timeout conveniently.
>
> Note that a server may receives multiple SYNs from and immediately
> retransmits SYNACKs without any SYNACK timeout. This often happens
> on when the client SYNs have timed out due to wireless delay
> above. In this case since the server will still use the default
> initial congestion (e.g. 10) because tp->undo_marker is reset in
> tcp_init_metrics(). This is an intentional design because packets
> are not lost but delayed.
>
> This patch only covers regular TCP passive open. Fast Open is
> supported in the next patch.
>
> Signed-off-by: Yuchung Cheng <ych...@google.com>
> Signed-off-by: Neal Cardwell <ncardw...@google.com>
> Signed-off-by: Eric Dumazet <eduma...@google.com>
> ---
>  net/ipv4/tcp_input.c     | 2 ++
>  net/ipv4/tcp_minisocks.c | 5 +++++
>  2 files changed, 7 insertions(+)
>
> diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
> index 30c6a42b1f5b..53b4c5a3113b 100644
> --- a/net/ipv4/tcp_input.c
> +++ b/net/ipv4/tcp_input.c
> @@ -6101,6 +6101,8 @@ int tcp_rcv_state_process(struct sock *sk, struct sk_buff *skb)
>  			 */
>  			tcp_rearm_rto(sk);
>  		} else {
> +			tcp_try_undo_spurious_syn(sk);
> +			tp->retrans_stamp = 0;
>  			tcp_init_transfer(sk, BPF_SOCK_OPS_PASSIVE_ESTABLISHED_CB);
>  			tp->copied_seq = tp->rcv_nxt;
>  		}
> diff --git a/net/ipv4/tcp_minisocks.c b/net/ipv4/tcp_minisocks.c
> index 79900f783e0d..9c2a0d36fb20 100644
> --- a/net/ipv4/tcp_minisocks.c
> +++ b/net/ipv4/tcp_minisocks.c
> @@ -522,6 +522,11 @@ struct sock *tcp_create_openreq_child(const struct sock *sk,
>  		newtp->rx_opt.ts_recent_stamp = 0;
>  		newtp->tcp_header_len = sizeof(struct tcphdr);
>  	}
> +	if (req->num_timeout) {

It seems that req->num_timeout could contain a garbage value at this point,
because we clear req->num_timeout late (in reqsk_queue_hash_req()).

I will send a fix.

> +		newtp->undo_marker = treq->snt_isn;
> +		newtp->retrans_stamp = div_u64(treq->snt_synack,
> +					       USEC_PER_SEC / TCP_TS_HZ);
> +	}
>  	newtp->tsoffset = treq->ts_off;
>  #ifdef CONFIG_TCP_MD5SIG
>  	newtp->md5sig_info = NULL;	/*XXX*/
>
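
For readers following along, the detection described in the commit message
can be modelled in userspace roughly as below. This is only a sketch under
stated assumptions, not the kernel code: struct conn_model, child_init() and
try_undo_spurious_syn_model() are hypothetical names, TCP_TS_HZ is assumed
to be 1000, and the per-connection timestamp offset (ts_off) is ignored.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define USEC_PER_SEC 1000000ULL
#define TCP_TS_HZ    1000ULL	/* assumption: 1 ms timestamp clock */

/* Hypothetical stand-in for the few fields the patch touches. */
struct conn_model {
	uint32_t undo_marker;	/* non-zero: a SYNACK timeout was recorded */
	uint32_t retrans_stamp;	/* timestamp of the original SYNACK, in ticks */
	bool     saw_tstamp;	/* final ACK carried a timestamp option */
	uint32_t rcv_tsecr;	/* timestamp value echoed by the final ACK */
};

/*
 * Models the tcp_create_openreq_child() hunk: only when the request
 * recorded at least one SYNACK timeout do we remember the original
 * SYNACK send time, converted from microseconds to timestamp ticks.
 * num_timeout must be a properly initialised counter for this to be sound.
 */
static void child_init(struct conn_model *c, unsigned int num_timeout,
		       uint32_t snt_isn, uint64_t snt_synack_usec)
{
	assert(c != NULL);
	*c = (struct conn_model){0};
	if (num_timeout) {
		c->undo_marker = snt_isn;
		c->retrans_stamp =
			(uint32_t)(snt_synack_usec / (USEC_PER_SEC / TCP_TS_HZ));
	}
}

/*
 * Assumed shape of the spurious-timeout check: if the final ACK echoes
 * the timestamp of the original SYNACK, that SYNACK was delivered, so
 * the timeout was spurious and the undo marker is cleared.
 * Returns true when the timeout is judged spurious.
 */
static bool try_undo_spurious_syn_model(struct conn_model *c)
{
	if (c->undo_marker && c->retrans_stamp && c->saw_tstamp &&
	    c->retrans_stamp == c->rcv_tsecr) {
		c->undo_marker = 0;
		return true;
	}
	return false;
}
```

With num_timeout == 0 nothing is remembered and the check is a no-op, which
is why a stale (uninitialised) num_timeout matters: it could arm the undo
state for a request that never timed out.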