On Tue, Nov 21, 2017 at 7:01 AM, Neal Cardwell <ncardw...@google.com> wrote:
>
> On Tue, Nov 21, 2017 at 12:58 AM, Steve Ibanez <siba...@stanford.edu> wrote:
> > Hi Neal,
> >
> > I tried your suggestion to disable tcp_tso_should_defer() and it does
> > indeed look like it is preventing the host from entering timeouts.
> > I'll have to do a bit more digging to try and find where the packets
> > are being dropped. I've verified that the bottleneck link queue
> > capacity is at about the configured marking threshold when the timeout
> > occurs, so the drops may be happening at the NIC interfaces or perhaps
> > somewhere unexpected in the switch.
>
> Great! Thanks for running that test.
>
> > I wonder if you can explain why the TLP doesn't fire when in the CWR
> > state? It seems like that might be worth having for cases like this.
>
> The original motivation for only allowing TLP in the CA_Open state was
> to be conservative and avoid having the TLP impose extra load on the
> bottleneck when it may be congested. Plus if there are any SACKed
> packets in the SACK scoreboard then there are other existing
> mechanisms to do speedy loss recovery.

Neal, I like your idea of covering more states in TLP. But shouldn't we
also fix the TSO deferral logic (tcp_tso_should_defer()) to work better
with PRR in the CWR state, since it's a general transmission issue?
>
> But at various times we have talked about expanding the set of
> scenarios where TLP is used. And I think this example demonstrates
> that there is a class of real-world cases where it probably makes
> sense to allow TLP in the CWR state.
>
> If you have time, would you be able to check if leaving
> tcp_tso_should_defer() as-is but enabling TLP probes in CWR state
> also fixes your performance issue? Perhaps something like
> (uncompiled/untested):
>
> diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
> index 4ea79b2ad82e..deccf8070f84 100644
> --- a/net/ipv4/tcp_output.c
> +++ b/net/ipv4/tcp_output.c
> @@ -2536,11 +2536,11 @@ bool tcp_schedule_loss_probe(struct sock *sk, bool advancing_rto)
>
>  	early_retrans = sock_net(sk)->ipv4.sysctl_tcp_early_retrans;
>  	/* Schedule a loss probe in 2*RTT for SACK capable connections
> -	 * in Open state, that are either limited by cwnd or application.
> +	 * not in loss recovery, that are either limited by cwnd or application.
>  	 */
>  	if ((early_retrans != 3 && early_retrans != 4) ||
>  	    !tp->packets_out || !tcp_is_sack(tp) ||
> -	    icsk->icsk_ca_state != TCP_CA_Open)
> +	    icsk->icsk_ca_state >= TCP_CA_Recovery)
>  		return false;
>
>  	if ((tp->snd_cwnd > tcp_packets_in_flight(tp)) &&
>
> > Btw, thank you very much for all the help! It is greatly appreciated :)
>
> You are very welcome! :-)
>
> cheers,
> neal