I think I've located a problem in TCP input processing, and it has been there for quite a while. It has broken half-open connection discovery in many cases since version 1.15 of netinet/tcp_input.c (committed by Garrett Wollman, which is why this is Cc'd to him), although that isn't where the (presumably) incorrect behavior was introduced.
The half-open connection discovery problem can be reproduced easily. The conditions required are:

- Machine A thinks it has an established connection with machine B
- Machine B disagrees (it has crashed, the network has been down, or it has recently been assigned the IP of another machine that disconnected uncleanly; there are a lot of conditions that can cause this)
- Machine B tries to connect to machine A using the same source port number as the half-open connection
- Machine B selects a sequence number below the current window expected by machine A

Machine B sends a SYN, but gets nothing as a reply (it should be getting an ACK), no matter how many times it tries. Machine A will keep the connection in the established state until it tries to send data (depending on the application, this may never happen) or until keepalives time the connection out.

This is particularly nasty if the boot procedure of machine B establishes a TCP connection to machine A: after a crash, it'll always try to use the same port number and never succeed.

Basically, in the tcp_input function, just before ACK processing, the code does 'goto drop' if the segment doesn't have ACK set. At that point TF_ACKNOW might already be set in tp->t_flags, but the ACK is never sent because tcp_output is never called.

This can be fixed by checking for TF_ACKNOW at the drop: label and calling tcp_output if it is set. However, such a modification can change the behavior in a considerable number of cases, so I think it needs careful verification.

Anyone who knows the TCP code, please comment!
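
To make the proposed change concrete, here is a minimal sketch in C. It is a toy model under stated assumptions, not the kernel code: struct toy_tcpcb, toy_tcp_output, toy_drop and toy_input_stale_syn are hypothetical stand-ins for struct tcpcb, tcp_output, the drop: label and the relevant path through tcp_input, and the numeric value of TF_ACKNOW is only illustrative. It assumes, as described above, that earlier processing has already set TF_ACKNOW by the time the segment without ACK reaches the drop path.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative value only; the real definition lives in the kernel headers. */
#define TF_ACKNOW 0x0001

/* Hypothetical stand-in for struct tcpcb: just the flags and an ACK counter. */
struct toy_tcpcb {
    int t_flags;    /* models tp->t_flags */
    int acks_sent;  /* counts ACKs the model would have put on the wire */
};

/* Stand-in for tcp_output(): sends a pending ACK and clears TF_ACKNOW. */
void toy_tcp_output(struct toy_tcpcb *tp)
{
    assert(tp != NULL);
    if (tp->t_flags & TF_ACKNOW) {
        tp->acks_sent++;
        tp->t_flags &= ~TF_ACKNOW;
    }
}

/*
 * Stand-in for the drop: label. Without the fix the segment is simply
 * discarded; with the fix a pending TF_ACKNOW triggers output first.
 */
void toy_drop(struct toy_tcpcb *tp, bool apply_fix)
{
    assert(tp != NULL);
    if (apply_fix && (tp->t_flags & TF_ACKNOW))
        toy_tcp_output(tp);
    /* The real code would free the segment here and return. */
}

/*
 * Models a SYN below the expected window arriving on an established
 * connection: TF_ACKNOW gets set (assumption, per the description above),
 * then the missing ACK bit sends us to the drop path.
 * Returns the number of ACKs sent so far.
 */
int toy_input_stale_syn(struct toy_tcpcb *tp, bool apply_fix)
{
    assert(tp != NULL);
    tp->t_flags |= TF_ACKNOW;
    toy_drop(tp, apply_fix);
    return tp->acks_sent;
}
```

Calling toy_input_stale_syn(&tp, false) on a zeroed toy_tcpcb leaves TF_ACKNOW set and sends nothing, which is the behavior machine B observes. Calling it with true sends one ACK and clears the flag. The sketch says nothing about the other paths that reach drop:, which is exactly the part that needs verification.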