Re: The tale of a TCP bug
Hi again,

On Fri, Mar 25, 2011 at 16:40 -0400, John Baldwin wrote:
> Reading some more. I'm trying to understand the breakage in your case.
>
> You are saying that FreeBSD is the sender, who has data to send, yet is not
> sending any window probes because it never starts the persist timer when the
> initial window is zero? Is that correct?

Yes. The receiver never sends a window update on its own, but when
probed will "admit" to a bigger window.

> And the problem is that the code that uses 'adv' to determine if it
> should send a window update to the remote end is falsely succeeding due
> to the overflow causing tcp_output() to 'goto send' but that it then
> fails to send any data because it thinks the remote window is full?

Yes, as far as I remember (I did that part of the debugging 2 months ago,
when I submitted the PR %-) that's what happens.

> So one thing I don't quite follow is how you are having rcv_nxt >
> rcv_adv. I saw this when the other side would send a window probe,
> and then the receiving side would take the -1 remaining window and
> explode it into the maximum window size when it ACKd.

No, it's not rcv_nxt > rcv_adv. It's

  (rcv_adv - rcv_nxt) > min(recwin, (long)TCP_MAXWIN << tp->rcv_scale)

My sample case has (rcv_adv - rcv_nxt) = 65536, but
(TCP_MAXWIN << tp->rcv_scale) = 65535 (as there is no window scaling
in effect).

> Are you seeing the other end of the connection send a window probe, but
> FreeBSD is not setting the persist timer so that it will send its own window
> probes?

No, the dump looks like this:

| 10.42.0.25.44852 > 10.42.0.2.1516: Flags [S],
|seq 3339144437, win 65535, options [...], length 0

FreeBSD sending the first SYN.
[rcv_adv=0, rcv_nxt=0]

| 10.42.0.2.1516 > 10.42.0.25.44852: Flags [S.],
|seq 42, ack 3339144438, win 0, length 0

The other end SYN|ACKing with a window size of 0.

| 10.42.0.25.44852 > 10.42.0.2.1516: Flags [.],
|seq 1, ack 1, win 65535, length 0

FreeBSD ACKing, and (correctly) sending no data.
[rcv_adv=67779, rcv_nxt=43], thus resulting in adv=-1/0x

At this point amd64 hangs 'forever', as the opposite side doesn't send
any packets on its own.

On i386 the persist timer is started, and we get:

| 10.42.0.25.44852 > 10.42.0.2.1516: Flags [.],
|seq 1:2, ack 1, win 65535, length 1

A window probe [a few seconds later].

| 10.42.0.2.1516 > 10.42.0.25.44852: Flags [.],
|seq 1, ack 2, win 70, length 0

At which point the remote side admits to having the window open, which
results in the connection working fine after that.

CU,
    Sec
-- 
I know that you believe that you understand what you think I said.
But I am not sure you realize that what you heard is not what I meant.
___
freebsd-net@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to "freebsd-net-unsubscr...@freebsd.org"
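
The i386/amd64 difference described in the message above can be modelled in userspace. This is only a sketch under assumptions: it supposes that the kernel's min() returns an unsigned 32-bit value and that rcv_adv/rcv_nxt are unsigned 32-bit sequence numbers, so the subtraction wraps before it is stored into a long. The helper names, recwin = 65536 and t_maxseg = 1460 are made up for illustration, and rcv_adv is chosen so that (rcv_adv - rcv_nxt) equals the 65536 stated in the message.

```c
#include <assert.h>
#include <stdint.h>

#define TCP_MAXWIN 65535u	/* largest unscaled TCP window */

/* Unsigned 32-bit minimum; assumed stand-in for the kernel's min(). */
static uint32_t
umin32(uint32_t a, uint32_t b)
{
	return (a < b ? a : b);
}

/* Raw 32-bit unsigned result of the 'adv' subtraction (hypothetical helper). */
static uint32_t
adv_raw(uint32_t recwin, unsigned rcv_scale, uint32_t rcv_adv, uint32_t rcv_nxt)
{
	assert(rcv_scale <= 14);	/* TCP window scale never exceeds 14 */
	return (umin32(recwin, TCP_MAXWIN << rcv_scale) - (rcv_adv - rcv_nxt));
}

/* Value seen when 'long' is 32 bits wide (i386): reinterpreted as signed.
 * The cast is implementation-defined in C but wraps on common compilers. */
static int64_t
adv_long32(uint32_t raw)
{
	return ((int64_t)(int32_t)raw);
}

/* Value seen when 'long' is 64 bits wide (amd64): zero-extended. */
static int64_t
adv_long64(uint32_t raw)
{
	return ((int64_t)raw);
}

/* Model of the first 'goto send' test: adv >= 2 * t_maxseg. */
static int
would_goto_send(int64_t adv, uint32_t t_maxseg)
{
	return (adv >= (int64_t)(2 * t_maxseg));
}
```

With recwin = 65536, rcv_scale = 0, rcv_nxt = 43 and rcv_adv = 43 + 65536, adv_raw() wraps to 0xffffffff. Read through a 32-bit long that is -1, so the test fails and the persist timer path can be reached; read through a 64-bit long it is 4294967295, so the test passes and the code jumps to send.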
Re: The tale of a TCP bug
Hi,

> On Fri, Mar 25, 2011 at 16:40 -0400, John Baldwin wrote:
> > And the problem is that the code that uses 'adv' to determine if it
> > should send a window update to the remote end is falsely succeeding due
> > to the overflow causing tcp_output() to 'goto send' but that it then
> > fails to send any data because it thinks the remote window is full?

On a whim I wanted to find out how often that overflow is triggered in
normal operation, and whipped up a quick counter sysctl.

--- sys/netinet/tcp_output.c.org	2011-01-04 19:27:00.0 +0100
+++ sys/netinet/tcp_output.c	2011-03-26 18:49:30.0 +0100
@@ -87,6 +87,11 @@
 extern struct mbuf *m_copypack();
 #endif
 
+VNET_DEFINE(int, adv_neg) = 0;
+SYSCTL_VNET_INT(_net_inet_tcp, OID_AUTO, adv_neg, CTLFLAG_RD,
+	&VNET_NAME(adv_neg), 1,
+	"How many times adv got negative");
+
 VNET_DEFINE(int, path_mtu_discovery) = 1;
 SYSCTL_VNET_INT(_net_inet_tcp, OID_AUTO, path_mtu_discovery, CTLFLAG_RW,
 	&VNET_NAME(path_mtu_discovery), 1,
@@ -573,6 +578,10 @@
 			long adv = min(recwin, (long)TCP_MAXWIN << tp->rcv_scale) -
 				(tp->rcv_adv - tp->rcv_nxt);
 
+			if(min(recwin, (long)TCP_MAXWIN << tp->rcv_scale) <
+			    (tp->rcv_adv - tp->rcv_nxt))
+				adv_neg++;
+
 			if (adv >= (long) (2 * tp->t_maxseg))
 				goto send;
 			if (2 * adv >= (long) so->so_rcv.sb_hiwat)

I booted my main (web/shell) box with (only) this patch:

11:36PM up 3:50, 1 user, load averages: 2.29, 1.51, 0.73
net.inet.tcp.adv_neg: 2466

That's approximately once every 5 seconds, which is way more often than
I suspected.

CU,
    Sec
-- 
I wish there was a knob on the TV to turn up the intelligence.
There's a knob called "brightness", but it doesn't seem to work.
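
The condition the counter patch tests can be tried outside the kernel. The following is a rough userspace sketch, not the kernel code: struct rcv_state and both function names are hypothetical, and all fields are assumed to be unsigned 32-bit values, as in the model of min() and the sequence numbers used in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TCP_MAXWIN 65535u	/* largest unscaled TCP window */

/* Hypothetical snapshot of the fields the patched check reads. */
struct rcv_state {
	uint32_t recwin;	/* space available in the receive buffer */
	unsigned rcv_scale;	/* window scale shift, 0 if not negotiated */
	uint32_t rcv_adv;	/* highest sequence number advertised */
	uint32_t rcv_nxt;	/* next sequence number expected */
};

/* Same comparison as the patch: is the usable window smaller than what
 * has already been advertised, i.e. would 'adv' go negative? */
static int
adv_is_negative(const struct rcv_state *s)
{
	uint32_t maxwin, win;

	assert(s->rcv_scale <= 14);	/* TCP window scale never exceeds 14 */
	maxwin = TCP_MAXWIN << s->rcv_scale;
	win = s->recwin < maxwin ? s->recwin : maxwin;
	return (win < s->rcv_adv - s->rcv_nxt);
}

/* Count the states that would have bumped the counter. */
static unsigned
count_negative(const struct rcv_state *v, size_t n)
{
	unsigned hits = 0;
	size_t i;

	for (i = 0; i < n; i++)
		hits += adv_is_negative(&v[i]);
	return (hits);
}
```

Because the subtraction is unsigned, a state with rcv_nxt ahead of rcv_adv also wraps to a large value and is counted by this sketch, so under the same assumptions the figure reported by the sysctl may include both situations discussed in this thread.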
Questions on LRO and Delayed ACK
Hi All,

1. If there is hardware support for LRO (where the hardware coalesces a
   bunch of consecutive TCP segments into one large TCP segment), is it
   enough for the driver to simply post the segment to the host stack via
   ifp->if_input()? In other words, is there any need to run it through
   tcp_lro_rx() followed by tcp_lro_flush()?

2. What kind of performance improvement does one get using soft LRO via
   tcp_lro_init(); tcp_lro_rx(); tcp_lro_flush();?

3. In the absence of LRO, is there any way that one can increase the
   number of inbound frames for which an ACK is transmitted to a value
   greater than 2?

Thanks
david S.

This message and any attached documents contain information from QLogic
Corporation or its wholly-owned subsidiaries that may be confidential. If
you are not the intended recipient, you may not read, copy, distribute, or
use this information. If you have received this transmission in error,
please notify the sender immediately by reply e-mail and then delete this
message.