From: Neal Cardwell <ncardw...@google.com> Date: Fri, 17 Nov 2017 21:06:14 -0500
> Fix the TLP scheduling logic so that when scheduling a TLP probe, we > ensure that the estimated time at which an RTO would fire accounts for > the fact that ACKs indicating forward progress should push back RTO > times. > > After the following fix: > > df92c8394e6e ("tcp: fix xmit timer to only be reset if data ACKed/SACKed") > > we had an unintentional behavior change in the following kind of > scenario: suppose the RTT variance has been very low recently. Then > suppose we send out a flight of N packets and our RTT is 100ms: > > t=0: send a flight of N packets > t=100ms: receive an ACK for N-1 packets > > The response before df92c8394e6e that was: > -> schedule a TLP for now + RTO_interval > > The response after df92c8394e6e is: > -> schedule a TLP for t=0 + RTO_interval > > Since RTO_interval = srtt + RTT_variance, this means that we have > scheduled a TLP timer at a point in the future that only accounts for > RTT_variance. If the RTT_variance term is small, this means that the > timer fires soon. > > Before df92c8394e6e this would not happen, because in that code, when > we receive an ACK for a prefix of flight, we did: > > 1) Near the top of tcp_ack(), switch from TLP timer to RTO > at write_queue_head->paket_tx_time + RTO_interval: > if (icsk->icsk_pending == ICSK_TIME_LOSS_PROBE) > tcp_rearm_rto(sk); > > 2) In tcp_clean_rtx_queue(), update the RTO to now + RTO_interval: > if (flag & FLAG_ACKED) { > tcp_rearm_rto(sk); > > 3) In tcp_ack() after tcp_fastretrans_alert() switch from RTO > to TLP at now + RTO_interval: > if (icsk->icsk_pending == ICSK_TIME_RETRANS) > tcp_schedule_loss_probe(sk); > > In df92c8394e6e we removed that 3-phase dance, and instead directly > set the TLP timer once: we set the TLP timer in cases like this to > write_queue_head->packet_tx_time + RTO_interval. So if the RTT > variance is small, then this means that this is setting the TLP timer > to fire quite soon. This means if the ACK for the tail of the flight > takes longer than an RTT to arrive (often due to delayed ACKs), then > the TLP timer fires too quickly. > > Fixes: df92c8394e6e ("tcp: fix xmit timer to only be reset if data > ACKed/SACKed") > Signed-off-by: Neal Cardwell <ncardw...@google.com> > Signed-off-by: Yuchung Cheng <ych...@google.com> > Signed-off-by: Eric Dumazet <eduma...@google.com> Applied and queued up for -stable, thanks.