The pkts_acked is clearly not the correct value for slow start after RTO as it may include segments that were not lost and therefore did not need retransmissions in the slow start following the RTO. Then tcp_slow_start will add the excess into cwnd bloating it and triggering a burst.
Instead, we want to pass only the number of retransmitted segments that were covered by the cumulative ACK (and potentially newly sent data segments too if the cumulative ACK covers that far). Signed-off-by: Ilpo Järvinen <ilpo.jarvi...@helsinki.fi> --- net/ipv4/tcp_input.c | 19 +++++++++++++++++-- 1 file changed, 17 insertions(+), 2 deletions(-) diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c index 9a1b3c1..0305f6d 100644 --- a/net/ipv4/tcp_input.c +++ b/net/ipv4/tcp_input.c @@ -3027,6 +3027,8 @@ static int tcp_clean_rtx_queue(struct sock *sk, u32 prior_fack, long seq_rtt_us = -1L; long ca_rtt_us = -1L; u32 pkts_acked = 0; + u32 rexmit_acked = 0; + u32 newdata_acked = 0; u32 last_in_flight = 0; bool rtt_update; int flag = 0; @@ -3056,8 +3058,10 @@ static int tcp_clean_rtx_queue(struct sock *sk, u32 prior_fack, } if (unlikely(sacked & TCPCB_RETRANS)) { - if (sacked & TCPCB_SACKED_RETRANS) + if (sacked & TCPCB_SACKED_RETRANS) { tp->retrans_out -= acked_pcount; + rexmit_acked += acked_pcount; + } flag |= FLAG_RETRANS_DATA_ACKED; } else if (!(sacked & TCPCB_SACKED_ACKED)) { last_ackt = skb->skb_mstamp; @@ -3068,8 +3072,11 @@ static int tcp_clean_rtx_queue(struct sock *sk, u32 prior_fack, last_in_flight = TCP_SKB_CB(skb)->tx.in_flight; if (before(start_seq, reord)) reord = start_seq; - if (!after(scb->end_seq, tp->high_seq)) + if (!after(scb->end_seq, tp->high_seq)) { flag |= FLAG_ORIG_SACK_ACKED; + } else { + newdata_acked += acked_pcount; + } } if (sacked & TCPCB_SACKED_ACKED) { @@ -3151,6 +3158,14 @@ static int tcp_clean_rtx_queue(struct sock *sk, u32 prior_fack, } if (tcp_is_reno(tp)) { + /* Due to discontinuity on RTO in the artificial + * sacked_out calculations, TCP must restrict + * pkts_acked without SACK to rexmits and new data + * segments + */ + if (icsk->icsk_ca_state == TCP_CA_Loss) + pkts_acked = rexmit_acked + newdata_acked; + tcp_remove_reno_sacks(sk, pkts_acked); } else { int delta; -- 2.7.4