On 5/28/19 11:28 AM, Sergej Benilov wrote:
> Since commit 605ad7f184b60cfaacbc038aa6c55ee68dee3c89 "tcp: refine TSO
> autosizing",
> the TSQ limit is computed as the smaller of sysctl_tcp_limit_output_bytes
> and max(2 * skb->truesize, sk->sk_pacing_rate >> 10).
> For low pacing rates, this approach sets a low limit, reducing throughput
> dramatically.
>
> Compute the limit as the greater of sysctl_tcp_limit_output_bytes and max(2 *
> skb->truesize, sk->sk_pacing_rate >> 10).
>
> Test:
> netperf -H remote -l -2000000 -- -s 1000000
>
> before patch:
>
> MIGRATED TCP STREAM TEST from 0.0.0.0 () port 0 AF_INET to remote () port 0
> AF_INET : demo
> Recv Send Send
> Socket Socket Message Elapsed
> Size Size Size Time Throughput
> bytes bytes bytes secs. 10^6bits/sec
>
> 87380 327680 327680 250.17 0.06
>
> after patch:
>
> MIGRATED TCP STREAM TEST from 0.0.0.0 () port 0 AF_INET to remote () port 0
> AF_INET : demo
> Recv Send Send
> Socket Socket Message Elapsed
> Size Size Size Time Throughput
> bytes bytes bytes secs. 10^6bits/sec
>
> 87380 327680 327680 1.29 12.54
>
> Signed-off-by: Sergej Benilov <sergej.beni...@googlemail.com>
> ---
> net/ipv4/tcp_output.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
> index e625be56..71efca72 100644
> --- a/net/ipv4/tcp_output.c
> +++ b/net/ipv4/tcp_output.c
> @@ -2054,7 +2054,7 @@ static bool tcp_write_xmit(struct sock *sk, unsigned int mss_now, int nonagle,
> * One example is wifi aggregation (802.11 AMPDU)
> */
> limit = max(2 * skb->truesize, sk->sk_pacing_rate >> 10);
> - limit = min_t(u32, limit, sysctl_tcp_limit_output_bytes);
> + limit = max_t(u32, limit, sysctl_tcp_limit_output_bytes);
>
> if (atomic_read(&sk->sk_wmem_alloc) > limit) {
> set_bit(TSQ_THROTTLED, &tp->tsq_flags);
>
NACK. This patch is based on an old Linux kernel version.
The min_t() here is really what was intended.
You might have an issue with the driver you are using.
Some wifi drivers now set a hint; look for sk_pacing_shift_update().
Bufferbloat prevention is hard, so please do not mess with it carelessly.
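
To make the difference concrete, here is a minimal user-space sketch of the
two lines quoted above. It is not kernel code: the function names, the plain
uint32_t parameters, and the sample values in the note below are assumptions
made for illustration only.

```c
#include <stdint.h>

/* Hypothetical user-space model of the quoted TSQ limit computation.
 * truesize stands in for skb->truesize, pacing_rate for sk->sk_pacing_rate
 * (bytes per second), sysctl_limit for sysctl_tcp_limit_output_bytes.
 */
static uint32_t tsq_base_limit(uint32_t truesize, uint32_t pacing_rate)
{
	uint32_t a = 2 * truesize;      /* room for two skbs */
	uint32_t b = pacing_rate >> 10; /* roughly 1 ms worth of bytes at the pacing rate */

	return a > b ? a : b;
}

/* Existing behaviour: the sysctl is an upper bound (min_t). */
static uint32_t tsq_limit_min(uint32_t truesize, uint32_t pacing_rate,
			      uint32_t sysctl_limit)
{
	uint32_t limit = tsq_base_limit(truesize, pacing_rate);

	return limit < sysctl_limit ? limit : sysctl_limit;
}

/* Proposed behaviour: the sysctl becomes a lower bound (max_t). */
static uint32_t tsq_limit_max(uint32_t truesize, uint32_t pacing_rate,
			      uint32_t sysctl_limit)
{
	uint32_t limit = tsq_base_limit(truesize, pacing_rate);

	return limit > sysctl_limit ? limit : sysctl_limit;
}
```

With assumed values truesize = 2304 and sysctl_limit = 131072, a pacing rate
of 1000000 bytes/s gives 4608 with min and 131072 with max. A pacing rate of
1000000000 bytes/s gives 131072 with min and 976562 with max. In the max
variant the sysctl never caps the amount queued below the socket, which is
the bufferbloat concern.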