[PATCH net-next] tcp: shrink inet_connection_sock icsk_mtup enabled and probe_size

2021-01-29 Thread Neal Cardwell
From: Neal Cardwell This commit shrinks inet_connection_sock by 4 bytes, by shrinking icsk_mtup.enabled from 32 bits to 1 bit, and shrinking icsk_mtup.probe_size from s32 to an unsigned 31-bit field. This is to save space to compensate for the recent introduction of a new u32 in
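As a hedged illustration of the space saving this snippet describes, the sketch below compares a layout that spends a full 32-bit member on each value with one that packs both into a single 32-bit word using bitfields. The struct names `mtup_before` and `mtup_after` and the helper `mtup_bytes_saved()` are hypothetical, and only the two fields named in the snippet are modeled; the real `icsk_mtup` has other members that are not shown here.

```c
#include <stddef.h>

/* Hypothetical "before" layout: each value gets its own 32-bit member. */
struct mtup_before {
    int enabled;     /* only ever 0 or 1, yet occupies 32 bits */
    int probe_size;  /* signed 32-bit probe size */
};

/* Hypothetical "after" layout: both values share one 32-bit word. */
struct mtup_after {
    unsigned int probe_size : 31;  /* unsigned 31-bit field */
    unsigned int enabled    : 1;   /* single flag bit */
};

/* Bytes saved by the packed layout (4 on common ABIs with 32-bit int). */
static size_t mtup_bytes_saved(void)
{
    return sizeof(struct mtup_before) - sizeof(struct mtup_after);
}
```

The trade-off in a layout like this is that the packed size field can no longer hold negative values and tops out at 2^31 - 1, which is presumably acceptable for a probe size.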

[PATCH net] tcp: fix cwnd-limited bug for TSO deferral where we send nothing

2020-12-08 Thread Neal Cardwell
From: Neal Cardwell When cwnd is not a multiple of the TSO skb size of N*MSS, we can get into persistent scenarios where we have the following sequence: (1) ACK for full-sized skb of N*MSS arrives -> tcp_write_xmit() transmit full-sized skb with N*MSS -> move pacing release time f

[PATCH net] tcp: fix to update snd_wl1 in bulk receiver fast path

2020-10-22 Thread Neal Cardwell
From: Neal Cardwell In the header prediction fast path for a bulk data receiver, if no data is newly acknowledged then we do not call tcp_ack() and do not call tcp_ack_update_window(). This means that a bulk receiver that receives large amounts of data can have the incoming sequence numbers wrap
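To make the wrap hazard in this snippet concrete, here is a minimal sketch of wrap-safe 32-bit sequence comparison, in the style of the kernel's `before()`/`after()` helpers. The names `seq_before`, `seq_after`, and `looks_newer` are illustrative, and `looks_newer` is a simplified stand-in rather than the actual window-update check. The point is that the comparison is only meaningful while the two values are less than 2^31 apart, so a stored reference that is never refreshed eventually makes newer sequence numbers look older.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if seq1 is earlier than seq2 in 32-bit wrapping sequence space. */
static bool seq_before(uint32_t seq1, uint32_t seq2)
{
    /* The unsigned difference is reinterpreted as signed, as the kernel does. */
    return (int32_t)(seq1 - seq2) < 0;
}

/* True if seq1 is later than seq2 in 32-bit wrapping sequence space. */
static bool seq_after(uint32_t seq1, uint32_t seq2)
{
    return seq_before(seq2, seq1);
}

/* Hypothetical check: does a segment look newer than a stored reference? */
static bool looks_newer(uint32_t seg_seq, uint32_t stored_ref)
{
    return seq_after(seg_seq, stored_ref);
}
```

With a fresh reference, `looks_newer(0x10, 0xFFFFFFF0)` is true even though the raw value wrapped past zero. With a reference left at 0 while the peer advances beyond 2^31 bytes, `looks_newer(0x80000001, 0)` is false, which is the kind of misjudgment a stale reference invites.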

[PATCH bpf-next v3 5/5] tcp: simplify tcp_set_congestion_control() load=false case

2020-09-10 Thread Neal Cardwell
From: Neal Cardwell Simplify tcp_set_congestion_control() by removing the initialization code path for the !load case. There are only two call sites for tcp_set_congestion_control(). The EBPF call site is the only one that passes load=false; it also passes cap_net_admin=true. Because of that

[PATCH bpf-next v3 0/5] tcp: increase flexibility of EBPF congestion control initialization

2020-09-10 Thread Neal Cardwell
From: Neal Cardwell This patch series reorganizes TCP congestion control initialization so that if EBPF code called by tcp_init_transfer() sets the congestion control algorithm by calling setsockopt(TCP_CONGESTION) then the TCP stack initializes the congestion control module immediately, instead

[PATCH bpf-next v3 4/5] tcp: simplify _bpf_setsockopt(): remove flags argument

2020-09-10 Thread Neal Cardwell
From: Neal Cardwell Now that the previous patches have removed the code that uses the flags argument to _bpf_setsockopt(), we can remove that argument. Signed-off-by: Neal Cardwell Acked-by: Yuchung Cheng Acked-by: Kevin Yang Signed-off-by: Eric Dumazet Cc: Lawrence Brakmo --- net/core

[PATCH bpf-next v3 2/5] tcp: simplify EBPF TCP_CONGESTION to always init CC

2020-09-10 Thread Neal Cardwell
From: Neal Cardwell Now that the previous patch ensures we don't initialize the congestion control twice, when EBPF sets the congestion control algorithm at connection establishment we can simplify the code by simply initializing the congestion control module at that time. Signed-off-by:

[PATCH bpf-next v3 3/5] tcp: simplify tcp_set_congestion_control(): always reinitialize

2020-09-10 Thread Neal Cardwell
From: Neal Cardwell Now that the previous patches ensure that all call sites for tcp_set_congestion_control() want to initialize congestion control, we can simplify tcp_set_congestion_control() by removing the reinit argument and the code to support it. Signed-off-by: Neal Cardwell Acked-by

[PATCH bpf-next v3 1/5] tcp: only init congestion control if not initialized already

2020-09-10 Thread Neal Cardwell
From: Neal Cardwell Change tcp_init_transfer() to only initialize congestion control if it has not been initialized already. With this new approach, we can arrange things so that if the EBPF code sets the congestion control by calling setsockopt(TCP_CONGESTION) then tcp_init_transfer() will not

[PATCH bpf-next v2 0/5] tcp: increase flexibility of EBPF congestion control initialization

2020-09-10 Thread Neal Cardwell
From: Neal Cardwell This patch series reorganizes TCP congestion control initialization so that if EBPF code called by tcp_init_transfer() sets the congestion control algorithm by calling setsockopt(TCP_CONGESTION) then the TCP stack initializes the congestion control module immediately, instead

[PATCH bpf-next v2 1/5] tcp: only init congestion control if not initialized already

2020-09-10 Thread Neal Cardwell
From: Neal Cardwell Change tcp_init_transfer() to only initialize congestion control if it has not been initialized already. With this new approach, we can arrange things so that if the EBPF code sets the congestion control by calling setsockopt(TCP_CONGESTION) then tcp_init_transfer() will not

[PATCH bpf-next v2 2/5] tcp: simplify EBPF TCP_CONGESTION to always init CC

2020-09-10 Thread Neal Cardwell
From: Neal Cardwell Now that the previous patch ensures we don't initialize the congestion control twice, when EBPF sets the congestion control algorithm at connection establishment we can simplify the code by simply initializing the congestion control module at that time. Signed-off-by:

[PATCH bpf-next v2 5/5] tcp: simplify tcp_set_congestion_control() load=false case

2020-09-10 Thread Neal Cardwell
From: Neal Cardwell Simplify tcp_set_congestion_control() by removing the initialization code path for the !load case. There are only two call sites for tcp_set_congestion_control(). The EBPF call site is the only one that passes load=false; it also passes cap_net_admin=true. Because of that

[PATCH bpf-next v2 4/5] tcp: simplify _bpf_setsockopt(): remove flags argument

2020-09-10 Thread Neal Cardwell
From: Neal Cardwell Now that the previous patches have removed the code that uses the flags argument to _bpf_setsockopt(), we can remove that argument. Signed-off-by: Neal Cardwell Acked-by: Yuchung Cheng Acked-by: Kevin Yang Signed-off-by: Eric Dumazet Cc: Lawrence Brakmo --- net/core

[PATCH bpf-next v2 3/5] tcp: simplify tcp_set_congestion_control(): always reinitialize

2020-09-10 Thread Neal Cardwell
From: Neal Cardwell Now that the previous patches ensure that all call sites for tcp_set_congestion_control() want to initialize congestion control, we can simplify tcp_set_congestion_control() by removing the reinit argument and the code to support it. Signed-off-by: Neal Cardwell Acked-by

Re: [net-next] tcp: add TCP_INFO status for failed client TFO

2019-10-21 Thread Neal Cardwell
On Mon, Oct 21, 2019 at 5:11 PM Jason Baron wrote: > > > > On 10/21/19 4:36 PM, Eric Dumazet wrote: > > On Mon, Oct 21, 2019 at 12:53 PM Christoph Paasch wrote: > >> > > > >> Actually, longterm I hope we would be able to get rid of the > >> blackhole-detection and fallback heuristics. In a far di

Re: Crash when receiving FIN-ACK in TCP_FIN_WAIT1 state

2019-10-21 Thread Neal Cardwell
On Mon, Oct 21, 2019 at 8:04 PM Subash Abhinov Kasiviswanathan wrote: > > > Interesting! As tcp_input.c summarizes, "packets_out is > > SND.NXT-SND.UNA counted in packets". In the normal operation of a > > socket, tp->packets_out should not be 0 if any of those other fields > > are non-zero. > > >

Re: Crash when receiving FIN-ACK in TCP_FIN_WAIT1 state

2019-10-21 Thread Neal Cardwell
On Sun, Oct 20, 2019 at 10:45 PM Subash Abhinov Kasiviswanathan wrote: > > > FIN-WAIT1 just means the local application has called close() or > > shutdown() to shut down the sending direction of the socket, and the > > local TCP stack has sent a FIN, and is waiting to receive a FIN and an > > ACK

Re: Crash when receiving FIN-ACK in TCP_FIN_WAIT1 state

2019-10-20 Thread Neal Cardwell
On Sun, Oct 20, 2019 at 7:15 PM Subash Abhinov Kasiviswanathan wrote: > > > Hmm. Random related thought while searching for a possible cause: I > > wonder if tcp_write_queue_purge() should clear tp->highest_sack (and > > possibly tp->sacked_out)? The tcp_write_queue_purge() code is careful > > to

Re: Crash when receiving FIN-ACK in TCP_FIN_WAIT1 state

2019-10-20 Thread Neal Cardwell
On Sun, Oct 20, 2019 at 4:25 PM Subash Abhinov Kasiviswanathan wrote: > > We are seeing a crash in the TCP ACK codepath often in our regression > racks with an ARM64 device with 4.19 based kernel. > > It appears that the tp->highest_ack is invalid when being accessed when > a

Re: [net-next] tcp: add TCP_INFO status for failed client TFO

2019-10-18 Thread Neal Cardwell
other failures, such as SYN/ACK + data being dropped, will result in the > connection not becoming established. And a connection blackhole after > session establishment shows up as a stalled connection. > > Signed-off-by: Jason Baron > Cc: Eric Dumazet > Cc: Neal Cardwell > Cc:

Re: [PATCH v2] tcp: Add TCP_INFO counter for packets received out-of-order

2019-09-17 Thread Neal Cardwell
On Tue, Sep 17, 2019 at 1:22 PM Eric Dumazet wrote: > > Tue, Sep 17, 2019 at 10:13 AM Jason Baron wrote: > > > > > > Hi, > > > > I was interested in adding a field to tcp_info around the TFO state of a > > socket. So for the server side it would indicate if TFO was used to > > create the socket

Re: [PATCH v5 2/2] tcp: Add snd_wnd to TCP_INFO

2019-09-14 Thread Neal Cardwell
On Fri, Sep 13, 2019 at 7:23 PM Thomas Higdon wrote: > > Neal Cardwell mentioned that snd_wnd would be useful for diagnosing TCP > performance problems -- > > (1) Usually when we're diagnosing TCP performance problems, we do so > > from the sender, since th

Re: [PATCH v5 1/2] tcp: Add TCP_INFO counter for packets received out-of-order

2019-09-14 Thread Neal Cardwell
_INFO, and > has the same name. > > Also note that we avoid increasing the size of the tcp_sock struct by > taking advantage of a hole. > > Signed-off-by: Thomas Higdon > --- > changes since v4: > - optimize placement of rcv_ooopack to avoid increasing tcp_sock struct >

Re: [PATCH v4 2/2] tcp: Add snd_wnd to TCP_INFO

2019-09-13 Thread Neal Cardwell
On Fri, Sep 13, 2019 at 5:29 PM Yuchung Cheng wrote: > > What if the comment is shortened up to fit in 80 columns and the units > > (bytes) are added, something like: > > > > __u32 tcpi_snd_wnd;/* peer's advertised recv window > > (bytes) */ > just a thought: will tcpi_peer_rcv_

Re: [PATCH v4 2/2] tcp: Add snd_wnd to TCP_INFO

2019-09-13 Thread Neal Cardwell
On Fri, Sep 13, 2019 at 3:36 PM Thomas Higdon wrote: > > Neal Cardwell mentioned that snd_wnd would be useful for diagnosing TCP > performance problems -- > > (1) Usually when we're diagnosing TCP performance problems, we do so > > from the sender, since th

Re: [PATCH v4 1/2] tcp: Add TCP_INFO counter for packets received out-of-order

2019-09-13 Thread Neal Cardwell
On Fri, Sep 13, 2019 at 3:37 PM Thomas Higdon wrote: > > For receive-heavy cases on the server-side, we want to track the > connection quality for individual client IPs. This counter, similar to > the existing system-wide TCPOFOQueue counter in /proc/net/netstat, > tracks out-of-order packet recep

Re: [PATCH v3 2/2] tcp: Add rcv_wnd to TCP_INFO

2019-09-13 Thread Neal Cardwell
On Fri, Sep 13, 2019 at 10:29 AM Thomas Higdon wrote: > > On Thu, Sep 12, 2019 at 10:14:33AM +0100, Dave Taht wrote: > > On Thu, Sep 12, 2019 at 1:59 AM Neal Cardwell wrote: > > > > > > On Wed, Sep 11, 2019 at 6:32 PM Thomas Higdon wrote: > > > > >

Re: [PATCH v3 2/2] tcp: Add rcv_wnd to TCP_INFO

2019-09-11 Thread Neal Cardwell
On Wed, Sep 11, 2019 at 6:32 PM Thomas Higdon wrote: > > Neal Cardwell mentioned that rcv_wnd would be useful for helping > diagnose whether a flow is receive-window-limited at a given instant. > > This serves the purpose of adding an additional __u32 to avoid the > would-be

Re: [PATCH net-next] tcp: force a PSH flag on TSO packets

2019-09-10 Thread Neal Cardwell
ivers. > > It has been used at Google for about four years, > and has been discussed at various networking conferences. > > [1] segments smaller than MSS already have PSH flag set > by tcp_sendmsg() / tcp_mark_push(), unless MSG_MORE > has been requested by the user. >

Re: [PATCH v2] tcp: Add TCP_INFO counter for packets received out-of-order

2019-09-10 Thread Neal Cardwell
On Tue, Sep 10, 2019 at 4:39 PM Eric Dumazet wrote: > > On Tue, Sep 10, 2019 at 10:11 PM Thomas Higdon wrote: > > > > > ... > > Because an additional 32-bit member in struct tcp_info would cause > > a hole on 64-bit systems, we reserve a struct member '_reserved'. > ... > > diff --git a/include/u

[PATCH net] tcp: fix tcp_ecn_withdraw_cwr() to clear TCP_ECN_QUEUE_CWR

2019-09-09 Thread Neal Cardwell
& remove it") Signed-off-by: Neal Cardwell Acked-by: Yuchung Cheng Acked-by: Soheil Hassas Yeganeh Cc: Eric Dumazet --- net/ipv4/tcp_input.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c index c21e8a22fb3b..8a1cd93

Re: [PATCH net] tcp: remove empty skb from write queue in error cases

2019-08-26 Thread Neal Cardwell
and > call sk->sk_write_space(sk) accordingly. > > Fixes: ce5ec440994b ("tcp: ensure epoll edge trigger wakeup when write queue > is empty") > Signed-off-by: Eric Dumazet > Cc: Jason Baron > Reported-by: Vladimir Rutsky > Cc: Soheil Hassas Yeganeh >

Re: [PATCH net] tcp: make sure EPOLLOUT wont be missed

2019-08-17 Thread Neal Cardwell
o renames the do_nonblock label since we might reach this > code path even if we were in blocking mode. > > Fixes: 790ba4566c1a ("tcp: set SOCK_NOSPACE under memory pressure") > Signed-off-by: Eric Dumazet > Cc: Jason Baron > Reported-by: Vladimir Rutsky > ---

Re: [PATCH v2 1/2] tcp: add new tcp_mtu_probe_floor sysctl

2019-08-08 Thread Neal Cardwell
ntrol the floor of MSS probing. > > Signed-off-by: Josh Hunt > --- Acked-by: Neal Cardwell Thanks, Josh. I agree with Eric that it would be great if you are able to share the value that you have found to work well. neal

Re: [PATCH v2 2/2] tcp: Update TCP_BASE_MSS comment

2019-08-08 Thread Neal Cardwell
On Thu, Aug 8, 2019 at 2:13 AM Eric Dumazet wrote: > On 8/8/19 1:52 AM, Josh Hunt wrote: > > TCP_BASE_MSS is used as the default initial MSS value when MTU probing is > > enabled. Update the comment to reflect this. > > > > Suggested-by: Neal Cardwell

Re: [PATCH net 2/4] tcp: tcp_fragment() should apply sane memory limits

2019-08-02 Thread Neal Cardwell
On Fri, Aug 2, 2019 at 3:03 PM Bernd wrote: > > Hello, > > While analyzing a aborted upload packet capture I came across a odd > trace where a sender was not responding to a duplicate SACK but > sending further segments until it stalled. > > Took me some time until I remembered this fix, and actua

Re: [PATCH] tcp: add new tcp_mtu_probe_floor sysctl

2019-07-29 Thread Neal Cardwell
On Sun, Jul 28, 2019 at 5:14 PM Josh Hunt wrote: > > On 7/28/19 6:54 AM, Eric Dumazet wrote: > > On Sun, Jul 28, 2019 at 1:21 AM Josh Hunt wrote: > >> > >> On 7/27/19 12:05 AM, Eric Dumazet wrote: > >>> On Sat, Jul 27, 2019 at 4:23 AM Josh Hunt wrote: > > The current implementation of

Re: [PATCH net] tcp: fix tcp_set_congestion_control() use from bpf hook

2019-07-18 Thread Neal Cardwell
b21c7c16 ("bpf: Add support for changing congestion control") > Signed-off-by: Eric Dumazet > Cc: Lawrence Brakmo > Reported-by: Neal Cardwell > --- Acked-by: Neal Cardwell Thanks, Eric! neal

Re: Kernel BUG: epoll_wait() (and epoll_pwait) stall for 206 ms per call on sockets with a small-ish snd/rcv buffer.

2019-07-08 Thread Neal Cardwell
On Sat, Jul 6, 2019 at 2:19 PM Carlo Wood wrote: > > While investigating this further, I read on > http://www.masterraghu.com/subjects/np/introduction/unix_network_programming_v1.3/ch07lev1sec5.html > under "SO_RCVBUF and SO_SNDBUF Socket Options": > > When setting the size of the TCP socket r

Re: tp->copied_seq used before assignment in tcp_check_urg

2019-06-11 Thread Neal Cardwell
On Tue, Jun 11, 2019 at 2:46 AM Zhongjie Wang wrote: > > Hi Neal, > > Thanks for your valuable feedback! Yes, I think you are right. > It seems not a problem if tp->urg_data and tp->urg_seq are used together. > From our test results, we can only see there are some paths requiring > specific initia

Re: tp->copied_seq used before assignment in tcp_check_urg

2019-06-10 Thread Neal Cardwell
On Mon, Jun 10, 2019 at 7:48 PM Zhongjie Wang wrote: > > Hi Neal, > > Thanks for your reply. Sorry, I made a mistake in my previous email. > After I double checked the source code, I think it should be tp->urg_seq, > which is used before assignment, instead of tp->copied_seq. > Still in the same i

Re: tp->copied_seq used before assignment in tcp_check_urg

2019-06-10 Thread Neal Cardwell
On Sun, Jun 9, 2019 at 11:12 PM Zhongjie Wang wrote: ... > It compares tp->copied_seq with tcp->rcv_nxt. > However, tp->copied_seq is only assigned to an appropriate sequence number > when > it copies data to user space. So here tp->copied_seq could be equal to 0, > which is its initial value, if

Re: [PATCH net] dctcp: more accurate tracking of packets delivery

2019-04-11 Thread Neal Cardwell
ntly, calling it only once per RTT. > > Signed-off-by: Eric Dumazet > Cc: Yuchung Cheng > Cc: Neal Cardwell > Cc: Soheil Hassas Yeganeh > Cc: Florian Westphal > Cc: Daniel Borkmann > Cc: Lawrence Brakmo > Cc: Abdul Kabbani > --- Thanks, Eric! There is a slight

Re: [PATCH v2 bpf-next 5/7] bpf: sysctl for probe_on_drop

2019-04-08 Thread Neal Cardwell
On Wed, Apr 3, 2019 at 8:13 PM brakmo wrote: > > When a packet is dropped when calling queue_xmit in __tcp_transmit_skb > and packets_out is 0, it is beneficial to set a small probe timer. > Otherwise, the throughput for the flow can suffer because it may need to > depend on the probe timer to st

Re: [PATCH net] tcp: repaired skbs must init their tso_segs

2019-02-23 Thread Neal Cardwell
On Sat, Feb 23, 2019 at 6:51 PM Eric Dumazet wrote: > > syzbot reported a WARN_ON(!tcp_skb_pcount(skb)) > in tcp_send_loss_probe() [1] > > This was caused by TCP_REPAIR sent skbs that inadvertently > were missing a call to tcp_init_tso_segs() > Acked-by: Neal Cardwell Thanks, Eric! neal

Re: [PATCH net 2/2] tcp: tcp_v4_err() should be more careful

2019-02-15 Thread Neal Cardwell
Signed-off-by: Eric Dumazet > Reported-by: soukjin bae > --- > net/ipv4/tcp_ipv4.c | 5 - > 1 file changed, 4 insertions(+), 1 deletion(-) Acked-by: Neal Cardwell Thanks! neal

Re: [PATCH net 1/2] tcp: clear icsk_backoff in tcp_write_queue_purge()

2019-02-15 Thread Neal Cardwell
2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) Acked-by: Neal Cardwell Thanks! neal

Re: [PATCH net] tcp: clear tp->retrans_stamp in tcp_finish_connect()

2018-12-19 Thread Neal Cardwell
commit b701a99e431d ("tcp: Add > tcp_clamp_rto_to_user_timeout() helper to improve accuracy"), but > predates git history. > > Signed-off-by: Eric Dumazet > Acked-by: Soheil Hassas Yeganeh > --- Acked-by: Neal Cardwell Thanks! neal

Re: [PATCH v3 net-next 4/4] tcp: implement coalescing on backlog queue

2018-11-28 Thread Neal Cardwell
se on a receiver > without GRO, but the spectacular gain is really on > 1000x release_sock() latency reduction I have measured. > > Signed-off-by: Eric Dumazet > Cc: Neal Cardwell > Cc: Yuchung Cheng > --- Acked-by: Neal Cardwell Thanks! neal

Re: [PATCH v3 net-next 2/4] tcp: take care of compressed acks in tcp_add_reno_sack()

2018-11-28 Thread Neal Cardwell
> account how many ACK were coalesced, this information > will be available in skb_shinfo(skb)->gso_segs > > Signed-off-by: Eric Dumazet > --- Acked-by: Neal Cardwell Thanks! neal

Re: [PATCH v2 net-next 4/4] tcp: implement coalescing on backlog queue

2018-11-27 Thread Neal Cardwell
se on a receiver > without GRO, but the spectacular gain is really on > 1000x release_sock() latency reduction I have measured. > > Signed-off-by: Eric Dumazet > Cc: Neal Cardwell > Cc: Yuchung Cheng > --- ... > + if (TCP_SKB_CB(tail)->end_seq != TCP_SKB_CB(skb)->

Re: [PATCH v2 net-next 3/4] tcp: make tcp_space() aware of socket backlog

2018-11-27 Thread Neal Cardwell
situation. > > Reported-by: Jean-Louis Dupond > Signed-off-by: Eric Dumazet > --- Acked-by: Neal Cardwell Nice. Thanks! neal

Re: [PATCH v2 net-next 2/4] tcp: take care of compressed acks in tcp_add_reno_sack()

2018-11-27 Thread Neal Cardwell
On Tue, Nov 27, 2018 at 10:57 AM Eric Dumazet wrote: > > Neal pointed out that non sack flows might suffer from ACK compression > added in the following patch ("tcp: implement coalescing on backlog queue") > > Instead of tweaking tcp_add_backlog() we can take into > account how many ACK were coale

Re: [PATCH v2 net-next 1/4] tcp: hint compiler about sack flows

2018-11-27 Thread Neal Cardwell
On Tue, Nov 27, 2018 at 10:57 AM Eric Dumazet wrote: > > Tell the compiler that most TCP flows are using SACK these days. > > There is no need to add the unlikely() clause in tcp_is_reno(), > the compiler is able to infer it. > > Signed-off-by: Eric Dumazet > --- Acked-
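The idea quoted in this snippet can be sketched roughly as follows, assuming a GCC- or Clang-compatible compiler. The `struct flow` type and the `flow_is_sack`/`flow_is_reno` helpers are hypothetical reductions for illustration, not the kernel's definitions. Because the Reno check is just the negation of the hinted SACK check, the compiler can derive that the Reno path is the unlikely one without a separate annotation.

```c
#include <stdbool.h>

/* Branch-prediction hint built on the GCC/Clang builtin. */
#define likely(x) __builtin_expect(!!(x), 1)

/* Hypothetical, heavily reduced per-flow state. */
struct flow {
    bool sack_ok;  /* true if the peer negotiated SACK */
};

/* Hint that most flows use SACK. */
static bool flow_is_sack(const struct flow *f)
{
    return likely(f->sack_ok);
}

/* No explicit unlikely() here: it follows from the hint above. */
static bool flow_is_reno(const struct flow *f)
{
    return !flow_is_sack(f);
}
```

The hint changes only code layout and branch prediction, never the result, so both helpers return the same values they would without it.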

Re: [PATCH net-next 2/3] tcp: implement coalescing on backlog queue

2018-11-22 Thread Neal Cardwell
On Wed, Nov 21, 2018 at 12:52 PM Eric Dumazet wrote: > > In case GRO is not as efficient as it should be or disabled, > we might have a user thread trapped in __release_sock() while > softirq handler flood packets up to the point we have to drop. > > This patch balances work done from user thread

Re: [PATCH net-next 3/3] tcp: get rid of tcp_tso_should_defer() dependency on HZ/jiffies

2018-11-11 Thread Neal Cardwell
reduces bursts for HZ=100 or HZ=250 kernels, making TCP > behavior more uniform. > > Signed-off-by: Eric Dumazet > Acked-by: Soheil Hassas Yeganeh > --- Nice. Thanks! Acked-by: Neal Cardwell neal

Re: [PATCH net-next 2/3] tcp: refine tcp_tso_should_defer() after EDT adoption

2018-11-11 Thread Neal Cardwell
cs to avoid overflows. > > Signed-off-by: Eric Dumazet > Acked-by: Soheil Hassas Yeganeh > --- > net/ipv4/tcp_output.c | 7 --- > 1 file changed, 4 insertions(+), 3 deletions(-) Thanks! Acked-by: Neal Cardwell neal

Re: [PATCH net-next 1/3] tcp: do not try to defer skbs with eor mark (MSG_EOR)

2018-11-11 Thread Neal Cardwell
> > Signed-off-by: Eric Dumazet > Acked-by: Soheil Hassas Yeganeh > --- > net/ipv4/tcp_output.c | 4 > 1 file changed, 4 insertions(+) Thanks! Acked-by: Neal Cardwell neal

Re: [PATCH net-next] net_sched: sch_fq: add dctcp-like marking

2018-11-11 Thread Neal Cardwell
t fq ce_threshold 2.5ms > > Signed-off-by: Eric Dumazet > --- Very nice! Thanks, Eric. :-) Acked-by: Neal Cardwell neal

[PATCH net-next] tcp_bbr: update comments to reflect pacing_margin_percent

2018-11-08 Thread Neal Cardwell
te an old comment to reflect the new approach. Signed-off-by: Neal Cardwell Signed-off-by: Yuchung Cheng Signed-off-by: Soheil Hassas Yeganeh Signed-off-by: Eric Dumazet --- net/ipv4/tcp_bbr.c | 15 +++ 1 file changed, 7 insertions(+), 8 deletions(-) diff --git a/net/ipv4/

[PATCH net-next 2/2] tcp_bbr: centralize code to set gains

2018-10-16 Thread Neal Cardwell
Centralize the code that sets gains used for computing cwnd and pacing rate. This simplifies the code and makes it easier to change the state machine or (in the future) dynamically change the gain values and ensure that the correct gain values are always used. Signed-off-by: Neal Cardwell Signed

[PATCH net-next 0/2] tcp_bbr: TCP BBR changes for EDT pacing model

2018-10-16 Thread Neal Cardwell
The second patch adjusts the TCP BBR logic to centralize the setting of gain values, to simplify the code and prepare for future changes. Neal Cardwell (2): tcp_bbr: adjust TCP BBR for departure time pacing tcp_bbr: centralize code to set gains net/ipv4/tcp_bbr.c | 77

[PATCH net-next 1/2] tcp_bbr: adjust TCP BBR for departure time pacing

2018-10-16 Thread Neal Cardwell
nt o if pushing in_network down (pacing_gain < 1.0), then in_network goes below target upon an ACK event This commit changes the BBR state machine to use this estimated "packets in network" value to make its decisions. Signed-off-by: Neal Cardwell Signed-off-by: Yuchung Cheng Signed

Re: Why not use all the syn queues? in the function "tcp_conn_request", I have some questions.

2018-09-08 Thread Neal Cardwell
On Sat, Sep 8, 2018 at 11:23 AM Ttttabcd wrote: > > Thank you very much for your previous answer, sorry for the inconvenience. > > But now I want to ask you one more question. > > The question is why we need two variables to control the syn queue? > > The first is the "backlog" parameter of the "l

Re: Why not use all the syn queues? in the function "tcp_conn_request", I have some questions.

2018-09-04 Thread Neal Cardwell
On Tue, Sep 4, 2018 at 1:48 AM Ttttabcd wrote: > > Hello everyone,recently I am looking at the source code for handling TCP > three-way handshake(Linux Kernel version 4.18.5). > > I found some strange places in the source code for handling syn messages. > > in the function "tcp_conn_request" > >

[PATCH net] tcp_bbr: fix bw probing to raise in-flight data for very small BDPs

2018-07-27 Thread Neal Cardwell
ngestion control") Signed-off-by: Neal Cardwell Acked-by: Yuchung Cheng Acked-by: Soheil Hassas Yeganeh Acked-by: Priyaranjan Jha Reviewed-by: Eric Dumazet --- net/ipv4/tcp_bbr.c | 4 1 file changed, 4 insertions(+) diff --git a/net/ipv4/tcp_bbr.c b/net/ipv4/tcp_bbr.c index

Re: [PATCH net-next] tcp: ack immediately when a cwr packet arrives

2018-07-24 Thread Neal Cardwell
On Tue, Jul 24, 2018 at 1:42 PM Lawrence Brakmo wrote: > > Note that without this fix the 99% latencies when doing 10KB RPCs > in a congested network using DCTCP are 40ms vs. 190us with the patch. > Also note that these 40ms high tail latencies started after commit > 3759824da87b30ce7a35b4873b62b0

Re: [PATCH net-next] tcp: ack immediately when a cwr packet arrives

2018-07-24 Thread Neal Cardwell
On Tue, Jul 24, 2018 at 1:07 PM Yuchung Cheng wrote: > > On Mon, Jul 23, 2018 at 7:23 PM, Daniel Borkmann wrote: > > Should this go to net tree instead where all the other fixes went? > I am neutral but this feels more like a feature improvement I agree this feels like a feature improvement rath

Re: [PATCH net-next] tcp: ack immediately when a cwr packet arrives

2018-07-23 Thread Neal Cardwell
. > Modified based on comments by Neal Cardwell > > Signed-off-by: Lawrence Brakmo > --- > net/ipv4/tcp_input.c | 9 - > 1 file changed, 8 insertions(+), 1 deletion(-) Seems like a nice mechanism to have, IMHO. Acked-by: Neal Cardwell Thanks! neal

Re: [PATCH net-next] tcp: expose both send and receive intervals for rate sample

2018-07-09 Thread Neal Cardwell
nding. It does seem to be showing up in patchwork now: https://patchwork.ozlabs.org/patch/941532/ And I can confirm I'm able to apply it to net-next. Acked-by: Neal Cardwell thanks, neal

Re: [PATCH net-next v3 0/2] tcp: fix high tail latencies in DCTCP

2018-07-07 Thread Neal Cardwell
p 99.9% > > 1MB RPCs2.6ms 5.5ms 43ms 208ms > > 10KB RPCs1.1ms 1.3ms 53ms 212ms > ... > > v2: Removed call to tcp_ca_event from tcp_send_ack since I added one in > > tcp_event_ack_sent. Based on Neal Cardwell > >

Re: [PATCH net-next v2 0/2] tcp: fix high tail latencies in DCTCP

2018-07-03 Thread Neal Cardwell
On Tue, Jul 3, 2018 at 11:10 AM Lawrence Brakmo wrote: > > On 7/2/18, 5:52 PM, "netdev-ow...@vger.kernel.org on behalf of Neal Cardwell" > wrote: > > On Mon, Jul 2, 2018 at 5:39 PM Lawrence Brakmo wrote: > > > > When have observed high tail l

Re: [PATCH net-next v2 1/2] tcp: notify when a delayed ack is sent

2018-07-03 Thread Neal Cardwell
On Mon, Jul 2, 2018 at 7:49 PM Yuchung Cheng wrote: > > On Mon, Jul 2, 2018 at 2:39 PM, Lawrence Brakmo wrote: > > > > DCTCP depends on the CA_EVENT_NON_DELAYED_ACK and CA_EVENT_DELAYED_ACK > > notifications to keep track if it needs to send an ACK for packets that > > were received with a partic

Re: [PATCH net-next v2 0/2] tcp: fix high tail latencies in DCTCP

2018-07-02 Thread Neal Cardwell
e current packet should be enough. This should reduce the extra load noticed in DCTCP environments, after congestion events. This is part 2 of our effort to reduce pure ACK packets. Signed-off-by: Eric Dumazet Acked-by: Soheil Hassas Yeganeh Acked-by:

Re: [PATCH net-next 1/2] tcp: notify when a delayed ack is sent

2018-07-02 Thread Neal Cardwell
On Fri, Jun 29, 2018 at 9:48 PM Lawrence Brakmo wrote: > > DCTCP depends on the CA_EVENT_NON_DELAYED_ACK and CA_EVENT_DELAYED_ACK > notifications to keep track if it needs to send an ACK for packets that > were received with a particular ECN state but whose ACK was delayed. > > Under some circumst

Re: [PATCH net-next 2/2] tcp: ack immediately when a cwr packet arrives

2018-07-02 Thread Neal Cardwell
On Sat, Jun 30, 2018 at 9:47 PM Lawrence Brakmo wrote: > I see two issues, one is that entering quickack mode as you > mentioned does not insure that it will still be on when the CWR > arrives. The second issue is that the problem occurs right after the > receiver sends a small reply which results

Re: [PATCH net] tcp: prevent bogus FRTO undos with non-SACK flows

2018-06-30 Thread Neal Cardwell
one ACK later (when we get an ACK that doesn't cover a retransmit). But that seems fine to me. I also cooked the new packetdrill test below to explicitly cover this case you are addressing (please let me know if you have an alternate suggestion). Tested-by: Neal Cardwell Acked-by: Nea

Re: [PATCH net-next 0/2] tcp: fix high tail latencies in DCTCP

2018-06-30 Thread Neal Cardwell
On Fri, Jun 29, 2018 at 9:48 PM Lawrence Brakmo wrote: > > When have observed high tail latencies when using DCTCP for RPCs as > compared to using Cubic. For example, in one setup there are 2 hosts > sending to a 3rd one, with each sender having 3 flows (1 stream, > 1 1MB back-to-back RPCs and 1 1

Re: [PATCH net-next 2/2] tcp: ack immediately when a cwr packet arrives

2018-06-30 Thread Neal Cardwell
On Fri, Jun 29, 2018 at 9:48 PM Lawrence Brakmo wrote: > > We observed high 99 and 99.9% latencies when doing RPCs with DCTCP. The > problem is triggered when the last packet of a request arrives CE > marked. The reply will carry the ECE mark causing TCP to shrink its cwnd > to 1 (because there ar

Re: [PATCH net] tcp: prevent bogus FRTO undos with non-SACK flows

2018-06-29 Thread Neal Cardwell
On Fri, Jun 29, 2018 at 6:07 AM Ilpo Järvinen wrote: > > If SACK is not enabled and the first cumulative ACK after the RTO > retransmission covers more than the retransmitted skb, a spurious > FRTO undo will trigger (assuming FRTO is enabled for that RTO). > The reason is that any non-retransmitte

Re: [PATCH net-next v2] tcp: force cwnd at least 2 in tcp_cwnd_reduction

2018-06-28 Thread Neal Cardwell
On Thu, Jun 28, 2018 at 4:20 PM Lawrence Brakmo wrote: > > I just looked at 4.18 traces and the behavior is as follows: > >Host A sends the last packets of the request > >Host B receives them, and the last packet is marked with congestion (CE) > >Host B sends ACKs for packets not marke

Re: [PATCH net] tcp: add one more quick ack after after ECN events

2018-06-27 Thread Neal Cardwell
d-off-by: Eric Dumazet > Reported-by: Neal Cardwell > Cc: Lawrence Brakmo > --- > net/ipv4/tcp_input.c | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) Acked-by: Neal Cardwell Thanks, Eric! neal

Re: [PATCH net-next v2] tcp: force cwnd at least 2 in tcp_cwnd_reduction

2018-06-27 Thread Neal Cardwell
On Tue, Jun 26, 2018 at 10:34 PM Lawrence Brakmo wrote: > The only issue is if it is safe to always use 2 or if it is better to > use min(2, snd_ssthresh) (which could still trigger the problem). Always using 2 SGTM. I don't think we need min(2, snd_ssthresh), as that should be the same as just 2

Re: [PATCH net-next] tcp: remove one indentation level in tcp_create_openreq_child

2018-06-26 Thread Neal Cardwell
On Tue, Jun 26, 2018 at 11:46 AM Eric Dumazet wrote: > > Signed-off-by: Eric Dumazet > --- > net/ipv4/tcp_minisocks.c | 223 --- > 1 file changed, 113 insertions(+), 110 deletions(-) Yes, very nice clean-up! Thanks for doing this. Acked-by

Re: [PATCH net] sctp: not allow to set rto_min with a value below 200 msecs

2018-05-29 Thread Neal Cardwell
On Tue, May 29, 2018 at 11:45 AM Marcelo Ricardo Leitner < marcelo.leit...@gmail.com> wrote: > - patch2 - fix rtx attack vector >- Add the floor value to rto_min to HZ/20 (which fits the values > that Michael shared on the other email) I would encourage allowing minimum RTO values down to

Re: [PATCH net-next 1/2] tcp: add max_quickacks param to tcp_incr_quickack and tcp_enter_quickack_mode

2018-05-22 Thread Neal Cardwell
On Tue, May 22, 2018 at 8:31 PM kbuild test robot wrote: > Hi Eric, > Thank you for the patch! Yet something to improve: > [auto build test ERROR on net/master] > [also build test ERROR on v4.17-rc6 next-20180517] > [cannot apply to net-next/master] > [if your patch is applied to the wrong git

Re: [PATCH net-next 2/2] tcp: do not aggressively quick ack after ECN events

2018-05-22 Thread Neal Cardwell
ugh. > This should reduce the extra load noticed in DCTCP environments, > after congestion events. > This is part 2 of our effort to reduce pure ACK packets. > Signed-off-by: Eric Dumazet > --- Acked-by: Neal Cardwell Thanks! neal

Re: [PATCH net-next 1/2] tcp: add max_quickacks param to tcp_incr_quickack and tcp_enter_quickack_mode

2018-05-22 Thread Neal Cardwell
On Mon, May 21, 2018 at 6:09 PM Eric Dumazet wrote: > We want to add finer control of the number of ACK packets sent after > ECN events. > This patch is not changing current behavior, it only enables following > change. > Signed-off-by: Eric Dumazet > --- Acked-by: Neal

Re: [PATCH v3 net-next 6/6] tcp: add tcp_comp_sack_nr sysctl

2018-05-17 Thread Neal Cardwell
On Thu, May 17, 2018 at 5:47 PM Eric Dumazet wrote: > This per netns sysctl allows for TCP SACK compression fine-tuning. > This limits number of SACK that can be compressed. > Using 0 disables SACK compression. > Signed-off-by: Eric Dumazet > --- Acked-by: Neal Cardwell Thanks! neal

Re: [PATCH v3 net-next 5/6] tcp: add tcp_comp_sack_delay_ns sysctl

2018-05-17 Thread Neal Cardwell
On Thu, May 17, 2018 at 5:47 PM Eric Dumazet wrote: > This per netns sysctl allows for TCP SACK compression fine-tuning. > Its default value is 1,000,000, or 1 ms to meet TSO autosizing period. > Signed-off-by: Eric Dumazet > --- Acked-by: Neal Cardwell Thanks! neal

Re: [PATCH v3 net-next 3/6] tcp: add SACK compression

2018-05-17 Thread Neal Cardwell
counter is added in the following patch. > Two other patches add sysctls to allow changing the 1,000,000 and 44 > values that this commit hard-coded. > Signed-off-by: Eric Dumazet > --- Very nice. I like the constants and the min(rcv_rtt, srtt). Acked-by: Neal Cardwell Thanks! neal

Re: [PATCH net-next 3/4] tcp: add SACK compression

2018-05-17 Thread Neal Cardwell
On Thu, May 17, 2018 at 11:40 AM Eric Dumazet wrote: > On 05/17/2018 08:14 AM, Neal Cardwell wrote: > > Is there a particular motivation for the cap of 127? IMHO 127 ACKs is quite > > a few to compress. Experience seems to show that it works well to have one > > GRO A

Re: [PATCH net-next 3/4] tcp: add SACK compression

2018-05-17 Thread Neal Cardwell
On Thu, May 17, 2018 at 8:12 AM Eric Dumazet wrote: > When TCP receives an out-of-order packet, it immediately sends > a SACK packet, generating network load but also forcing the > receiver to send 1-MSS pathological packets, increasing its > RTX queue length/depth, and thus processing time. > W

Re: [PATCH net-next 2/4] tcp: do not force quickack when receiving out-of-order packets

2018-05-17 Thread Neal Cardwell
compression or losses. > We plan to add SACK compression in the following patch, we > must therefore not call tcp_enter_quickack_mode() > Signed-off-by: Eric Dumazet > --- Acked-by: Neal Cardwell Thanks! neal

Re: [PATCH net-next 4/4] tcp: add TCPAckCompressed SNMP counter

2018-05-17 Thread Neal Cardwell
0.0 > Signed-off-by: Eric Dumazet > --- Acked-by: Neal Cardwell Thanks! neal

Re: [PATCH net-next 1/4] tcp: use __sock_put() instead of sock_put() in tcp_clear_xmit_timers()

2018-05-17 Thread Neal Cardwell
On Thu, May 17, 2018 at 8:12 AM Eric Dumazet wrote: > Socket can not disappear under us. > Signed-off-by: Eric Dumazet > --- Acked-by: Neal Cardwell Thanks! neal

Re: [PATCH net] tcp: purge write queue in tcp_connect_init()

2018-05-15 Thread Neal Cardwell
seq. > This patch also replaces the BUG() by a less intrusive WARN_ON_ONCE() > kernel BUG at net/ipv4/tcp_output.c:2837! ... > Fixes: cf60af03ca4e ("net-tcp: Fast Open client - sendmsg(MSG_FASTOPEN)") > Signed-off-by: Eric Dumazet > Cc: Yuchung Cheng > Cc: Neal Cardwell

Re: [PATCH net] tcp: restore autocorking

2018-05-03 Thread Neal Cardwell
retransmit queue") > Signed-off-by: Eric Dumazet > Reported-by: Michael Wenig > Tested-by: Michael Wenig > --- Acked-by: Neal Cardwell Nice. Thanks, Eric! neal

[PATCH net] tcp_bbr: fix to zero idle_restart only upon S/ACKed data

2018-05-01 Thread Neal Cardwell
arting). This commit is a stable candidate for kernels back as far as 4.9. Fixes: 0f8782ea1497 ("tcp_bbr: add BBR congestion control") Signed-off-by: Neal Cardwell Signed-off-by: Yuchung Cheng Signed-off-by: Soheil Hassas Yeganeh Signed-off-by: Priyaranjan Jha Signed-off-by: Yousuk
