From: Yuchung Cheng <ych...@google.com>
Date: Wed, 16 Jan 2019 15:05:27 -0800

> This patch set aims to improve how TCP handles local qdisc congestion
> by simplifying the previous implementation. Previously when an
> skb fails to (re)transmit due to local qdisc congestion or other
> resource issue, TCP refrains from setting the skb timestamp or the
> recovery starting time.
>
> This design makes determining when to abort a stalling socket more
> complicated, as the timestamps of these transmission attempts were
> missing. The stack needs to sort of infer when the original attempt
> happens. A by-product is a socket may disregard the system timeout
> limit (i.e. sysctl net.ipv4.tcp_retries2 or USER_TIMEOUT option),
> and continue to retry until the transmission is successful.
>
> In a data-center environment where the TCP RTO is small, this could
> cause the socket to retry frequently for a long time during qdisc
> congestion.
>
> The solution is to first unconditionally timestamp skb and recovery
> attempt. Then retry more conservatively (twice a second) on local
> qdisc congestion but abort the sockets according to the system limit.

Series applied, thanks.
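
For readers following along, the behaviour described in the last quoted
paragraph can be modelled roughly as below. This is only a toy sketch in
Python, not the kernel code: every name in it (Recovery, next_retry_delay,
should_abort, simulate) is hypothetical, the 0.5 s interval is taken from
"twice a second" in the cover letter, and falling back to the RTO when
there is no local congestion is an assumption made for illustration.

```python
# Toy model of the policy in the cover letter; all names are hypothetical.
from dataclasses import dataclass
from typing import Optional, Tuple

# "twice a second" from the cover letter, expressed in seconds.
LOCAL_CONGESTION_RETRY_S = 0.5


@dataclass
class Recovery:
    # Time of the first (re)transmit attempt, or None if none was made yet.
    start_ts: Optional[float] = None

    def on_attempt(self, now: float) -> None:
        # Timestamp unconditionally, even if the attempt fails locally.
        if self.start_ts is None:
            self.start_ts = now


def next_retry_delay(rto_s: float, local_congestion: bool) -> float:
    # Under local congestion, retry at a fixed conservative interval
    # instead of at a (possibly very small) RTO. The RTO branch is an
    # assumption of this sketch.
    return LOCAL_CONGESTION_RETRY_S if local_congestion else rto_s


def should_abort(rec: Recovery, now: float, timeout_s: float) -> bool:
    # With the start time always recorded, the timeout check is a plain
    # comparison against the configured limit.
    return rec.start_ts is not None and now - rec.start_ts >= timeout_s


def simulate(rto_s: float, timeout_s: float,
             congested_for_s: float) -> Tuple[int, bool]:
    """Return (attempts, aborted) for a qdisc congested until congested_for_s."""
    rec = Recovery()
    now = 0.0
    attempts = 0
    while True:
        if should_abort(rec, now, timeout_s):
            return attempts, True
        rec.on_attempt(now)
        attempts += 1
        congested = now < congested_for_s
        if not congested:
            return attempts, False  # transmission went out
        now += next_retry_delay(rto_s, congested)
```

In this model a socket with a 5 ms RTO and a 2 s limit makes a handful of
attempts and then aborts if the congestion persists, rather than retrying
every few milliseconds with no bound.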