Re: [PATCH net-next v3] tcp: use RFC6298 compliant TCP RTO calculation

Yuchung Cheng Wed, 22 Jun 2016 13:51:56 -0700

On Wed, Jun 22, 2016 at 4:21 AM, Hagen Paul Pfeifer <ha...@jauu.net> wrote:
>
> > On June 22, 2016 at 7:53 AM Yuchung Cheng <ych...@google.com> wrote:
> >
> > Thanks for the patience. I've collected data from some Google Web
> > servers. They serve a mix of US and South American users using
> > HTTP1 and HTTP2. The traffic is Web browsing (e.g., search, maps,
> > Gmail, etc., but not YouTube videos). The mean RTT is about 100ms.
> >
> > The user connections were split into 4 groups of different TCP RTO
> > configs. Each group has many millions of connections but the
> > size variation among groups is well under 1%.
> >
> > B: baseline Linux
> > D: this patch
> > R: change RTTVAR averaging as in D, but bound RTO to 1sec per RFC6298
> > Y: change RTTVAR averaging as in D, but bound RTTVAR to 200ms instead
> >    (like B)
> >
> > For mean TCP latency of HTTP responses (first byte sent to last byte
> > acked), B < R < Y < D, but the differences are insignificant (<1%).
> > The median, 95pctl, and 99pctl show similarly small differences. In
> > summary there's hardly any visible impact on latency. I also looked at
> > only responses smaller than 4KB but did not see a different picture.
> >
> > The main difference is the retransmission rate, where R =~ Y < B =~ D.
> > R and Y are ~20% lower than B and D. Parsing the SNMP stats reveals
> > more interesting details. The table shows the percentage deltas
> > relative to the baseline B.
> >
> >                 D      R     Y
> > ------------------------------
> > Timeout      +12%   -16%  -16%
> > TailLossProb +28%    -7%   -7%
> > DSACK_rcvd   +37%    -7%   -7%
> > Cwnd-undo    +16%   -29%  -29%
> >
> > RTO change affects TLP because TLP will use the min of RTO and TLP
> > timer value to arm the probe timer.
> >
> > The stats indicate that the main culprit of spurious timeouts / rtx is
> > the RTO lower-bound. But they also show the RFC RTTVAR averaging is as
> > good as the current Linux approach.
> >
> > Given that, I would recommend we revise this patch to use the RFC
> > averaging but keep the existing lower-bound (of RTTVAR to 200ms). We can
> > experiment further with the lower-bound and change that in a separate
> > patch.
>
> Great news Yuchung!
>
> Then Daniel will prepare v4 with a min-rto lower bound:
>
>     max(RTTVAR, tcp_rto_min_us(struct sock))
>
> Any further suggestions Yuchung, Eric? We will also feed this v4 into our
> test environment to check the behavior for sender-limited, non-continuous
> flows.
Yes, a small one: I think the patch should change __tcp_set_rto()
instead of tcp_set_rto() so it applies to recurring timeouts as well.
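
For readers following the thread, here is a minimal userspace sketch of
variant Y as described above: RFC 6298 averaging for SRTT and RTTVAR, with
the variance term floored at 200ms. This is not the kernel code. The names
rto_state, rto_sample(), rto_us() and RTO_VAR_FLOOR_US are hypothetical,
and applying the floor to 4*RTTVAR (rather than to RTTVAR itself) is an
assumption about how the bound is meant to apply.

```c
#include <stdint.h>

/* Hypothetical floor on the variance term, mirroring the 200ms bound
 * discussed in this thread (an assumption, not a kernel constant). */
#define RTO_VAR_FLOOR_US 200000

/* Illustrative estimator state; all values are in microseconds. */
struct rto_state {
	int64_t srtt_us;    /* smoothed RTT */
	int64_t rttvar_us;  /* RTT variation */
	int have_sample;    /* 0 until the first measurement arrives */
};

/* Feed one RTT measurement, following RFC 6298 section 2. */
static void rto_sample(struct rto_state *s, int64_t rtt_us)
{
	int64_t err;

	if (!s->have_sample) {
		/* First measurement: SRTT = R, RTTVAR = R/2. */
		s->srtt_us = rtt_us;
		s->rttvar_us = rtt_us / 2;
		s->have_sample = 1;
		return;
	}

	/* RTTVAR is updated first, using the old SRTT (beta = 1/4). */
	err = s->srtt_us - rtt_us;
	if (err < 0)
		err = -err;
	s->rttvar_us += (err - s->rttvar_us) / 4;

	/* Then SRTT (alpha = 1/8). */
	s->srtt_us += (rtt_us - s->srtt_us) / 8;
}

/* RTO = SRTT + max(4 * RTTVAR, floor). */
static int64_t rto_us(const struct rto_state *s)
{
	int64_t var = 4 * s->rttvar_us;

	if (var < RTO_VAR_FLOOR_US)
		var = RTO_VAR_FLOOR_US;
	return s->srtt_us + var;
}
```

With steady 100ms samples the floor dominates and the RTO stays at 300ms.
A single 500ms sample raises RTTVAR enough that 4*RTTVAR exceeds the
floor, so the RTO grows past it.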



>
> Hagen
