Funny, this was the exact topic of my e-mail just 2 days ago (tcp_input's header prediction and a collapsing send window). But you may have done a better job of explaining.
My symptom was slightly different. When the send window drops below a segment size, tcp_output() stops sending. It only sends if it can send a whole packet, or if only a fraction of a segment is left to send. Transmission was completely halted.

I recommend a slightly different change. The location you've chosen to update wl1 and wl2 is not sufficient to guarantee that fast path processing will occur. I recommend placing the wl1 and wl2 updates in the code blocks that actually avoid the window update code.

    if ( fast path ) {
        not so good here...
        if (tlen == 0) {
            if (SEQ_GT(th->th_ack, tp->snd_una) &&
                SEQ_LEQ(th->th_ack, tp->snd_max) &&
                tp->snd_cwnd >= tp->snd_wnd &&
                tp->t_dupacks < tp->t_rexmtacks) {
                better in here...
                fast path ack processing
                return;
            }
        } else if (th->th_ack == tp->snd_una &&
            LIST_EMPTY(&tp->t_segq) &&
            tlen <= sbspace(&so->so_rcv)) {
            and in here...
            fast path data processing
            return;
        }
    }
    ...
    if ( wl1 ... wl2 ... ) {
        do window update
    }

The inner conditions can also prevent fast path processing. Placing the wl1 and wl2 updates right at the top of the outer block has the potential of causing window updates to be missed: if neither inner branch returns, the segment falls through to the window update check with wl1 and wl2 already advanced.

There are two other variables besides wl1 and wl2 that are problematic: rcv_up and snd_recover are also in need of updating. rcv_up can easily be updated along with wl1. However, snd_recover is not as obvious to me. We could inadvertently avoid fast recovery in the condition described below. Should we update snd_recover alongside wl1 as well?

Regards,
Bill Baumann

On Wed, 17 Apr 2002, Damon Permezel wrote:

> Not sure about the initial delays, but I found a bug which does cause
> throughput to drop dramatically once it is hit.
>
> Consider the sender of a bulk data transfer (1/2 duplex).
> When header prediction is successful, the ACK stream coming back
> is handled by the fast path code. For this to be true, the window info
> in that ACK stream cannot change.
>
>     tiwin && tiwin == tp->snd_wnd
>
>
> When the window finally does change, we go to the slow path.
> There is a check against WL1 and WL2 to ensure that the window update is
> not from an earlier packet. The fast-path code does not drag WL1/WL2
> along. In this example, only WL2 (the ACK) changes, as the peer is sending
> no data. Because WL2 has not been dragged along, the check to see if the
> current ACK is "fresh" can fail. I was seeing this all the time with
> a gigabit-rate xfer I was using for a test.
>
> Because the slow-path ACKs do not update the window, the snd_wnd closes,
> and becomes 0.
>
> Once snd_wnd becomes zero, we end up performing zero-window probes on the
> sender. We send one byte. The receiver, meanwhile, has opened his
> window up again, and ACKs the byte, with an open window.
>
> The slow-path processing of this ACK will still ignore the window update,
> and the following code will set snd_wnd to -1.
>
>     if (acked > so->so_snd.sb_cc) {
>         tp->snd_wnd -= so->so_snd.sb_cc;
>         sbdrop(&so->so_snd, (int)so->so_snd.sb_cc);
>         ourfinisacked = 1;
>     } else {
>         sbdrop(&so->so_snd, acked);
>         tp->snd_wnd -= acked;
>         ourfinisacked = 0;
>     }
>
>
> acked was 1, tp->snd_wnd was 0. snd_wnd is unsigned.
> We suddenly have a huge send window. Blast away! ... except we end up being
> driven by the rexmit timer and slow start, again and again.
> I am not really sure what happens in this mode. The activity light on the
> link goes out except for brief, intermittent flashes, so I suppose it
> rexmits, slow starts, gets going, overruns the window, .....
>
> The snd_wnd keeps being decremented and will, given sufficient time, wrap
> down to reasonable values, and this might recover. I have not had the
> patience.
>
> There is a simple fix:
>
>     if (tp->t_state == TCPS_ESTABLISHED &&
>         (thflags & (TH_SYN|TH_FIN|TH_RST|TH_URG|TH_ACK)) == TH_ACK &&
>         ((tp->t_flags & (TF_NEEDSYN|TF_NEEDFIN)) == 0) &&
>         ((to.to_flags & TOF_TS) == 0 ||
>          TSTMP_GEQ(to.to_tsval, tp->ts_recent)) &&
>         /*
>          * Using the CC option is compulsory if once started:
>          *   the segment is OK if no T/TCP was negotiated or
>          *   if the segment has a CC option equal to CCrecv
>          */
>         ((tp->t_flags & (TF_REQ_CC|TF_RCVD_CC)) != (TF_REQ_CC|TF_RCVD_CC) ||
>          ((to.to_flags & TOF_CC) != 0 && to.to_cc == tp->cc_recv)) &&
>         th->th_seq == tp->rcv_nxt &&
>         tiwin && tiwin == tp->snd_wnd &&
>         tp->snd_nxt == tp->snd_max) {
>
>         /*
>          * drag along the snd_wl1 and snd_wl2 as we are implicitly
>          * updating the window with the new (same) value.
>          */
>
>         tp->snd_wl1 = th->th_seq;
>         tp->snd_wl2 = th->th_ack;
>
> BTW: I have not made this change to FreeBSD, but to a FreeBSD-derived
> embedded network stack, so I can't even assure you that the above
> two lines compile (as I had to edit them slightly to recast into the F-BSD
> idiom).
>
>
> Cheers,
> Damon.
>
>
> On Wed, Apr 17, 2002 at 01:13:47PM +0400, Alexander Isaev wrote:
> >
> > I have installed FreeBSD 4.5. Everything worked OK from the console.
> > But when I tried to connect to it remotely (using SSH) I had some network
> > troubles.
> > From time to time the connection hangs for a short time.
> > First of all I've tried to install another network card (I've replaced
> > D-Link 550 with D-Link 538TX). But the problem still exists. Later
> > I've noticed that network timeouts happen also when sending or
> > receiving large files over SMTP/POP3.
> >
> > Can someone help me to solve this problem?
> >
> > Best regards,
> >  Alexander Isaev        mailto:A.Isaev@;astelit.ru
> >
> >
> > To Unsubscribe: send mail to [EMAIL PROTECTED]
> > with "unsubscribe freebsd-net" in the body of the message
>
> --
> Damon Permezel
> [EMAIL PROTECTED]
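
For anyone following the thread, the two failure steps Damon describes (a window update rejected by the WL1/WL2 freshness check, then the unsigned snd_wnd wrapping on a one-byte ACK) can be replayed in a small standalone C sketch. This is not the kernel code. struct snd_state, window_update(), fast_path_drag() and ack_bytes() are hypothetical names, snd_wnd is assumed to be 32 bits wide, and the stale case assumes that more than 2^31 bytes were ACKed on the fast path since snd_wl2 was last set, which is only one plausible way for the check to fail on a gigabit transfer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Sequence-space "less than", in the style of the BSD SEQ_LT macro. */
#define SEQ_LT(a, b) ((int32_t)((uint32_t)(a) - (uint32_t)(b)) < 0)

/* Hypothetical, trimmed-down stand-in for the send-side tcpcb fields. */
struct snd_state {
    uint32_t snd_wnd;   /* send window (unsigned, assumed 32 bits) */
    uint32_t snd_wl1;   /* segment sequence number of last window update */
    uint32_t snd_wl2;   /* segment ack number of last window update */
};

/*
 * Slow-path window update, modeled on the freshness check described in
 * the thread. Returns 1 if the advertised window was accepted, else 0.
 */
static int
window_update(struct snd_state *tp, uint32_t seq, uint32_t ack, uint32_t tiwin)
{
    assert(tp != NULL);
    if (SEQ_LT(tp->snd_wl1, seq) ||
        (tp->snd_wl1 == seq &&
         (SEQ_LT(tp->snd_wl2, ack) ||
          (tp->snd_wl2 == ack && tiwin > tp->snd_wnd)))) {
        tp->snd_wnd = tiwin;
        tp->snd_wl1 = seq;
        tp->snd_wl2 = ack;
        return 1;
    }
    return 0;   /* treated as a stale update and ignored */
}

/* The proposed fix: drag wl1/wl2 along on every fast-path ACK. */
static void
fast_path_drag(struct snd_state *tp, uint32_t seq, uint32_t ack)
{
    assert(tp != NULL);
    tp->snd_wl1 = seq;
    tp->snd_wl2 = ack;
}

/* Slow-path ACK accounting: the window shrinks by the bytes acked. */
static void
ack_bytes(struct snd_state *tp, uint32_t acked)
{
    assert(tp != NULL);
    tp->snd_wnd -= acked;   /* wraps when acked > snd_wnd */
}
```

With snd_wl1 = 5000, snd_wl2 = 1000 and an incoming ACK number of 1000 + 0x80000001, window_update() returns 0 and snd_wnd is left alone, because the old snd_wl2 now compares as "newer" than the ACK in sequence space. Setting snd_wnd to 0 and calling ack_bytes(&tp, 1) then leaves 0xFFFFFFFF, the "huge send window" from the thread. If fast_path_drag() has kept snd_wl2 close behind the ACK stream, the same update is accepted.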