On Wed, 25 Jun 2008, Ali Niknam wrote:
Recently I've been upgrading some of my machines from FreeBSD 6.x amd64 to
FreeBSD 7.0 amd64.
After upgrading I noticed a weird error/bug. It seems that after several
thousand TCP connections some seem to hang in 'CLOSED' state.
Sounds like there's a bug somewhere. Before we start trying to track it down,
I'll tell you a little more about how this works so that we can interpret the
output you're seeing.
In FreeBSD, as with all UNIX/Berkeley sockets systems, each socket is actually
represented by a set of data structures representing different layers of
abstraction. At the top level is struct file, representing a file descriptor.
Next down is struct socket, representing a socket. Then the protocol code has
struct inpcb, representing a generic IP connection, and struct tcpcb (or
struct tcptw once we enter TIMEWAIT), representing a TCP connection.
Confusingly, these data structures don't always exist all at once. For
example, if you close the file descriptor, freeing struct file, the socket and
protocol state may persist for some time until the TCP connection closes (all
data has been sent, or various other close modes).
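To make that layering concrete, here is a very rough sketch of the pointer
chain; the field names follow the real ones (f_data, so_pcb, inp_ppcb), but
the structures are drastically simplified for illustration and are not the
actual kernel definitions:

/*
 * Drastically simplified sketch of the layering -- not the real FreeBSD
 * definitions, just the shape of the pointer chain described above.
 */
struct tcpcb;                   /* TCP connection (struct tcptw in TIMEWAIT) */

struct inpcb {
        void *inp_ppcb;         /* -> struct tcpcb / struct tcptw */
};

struct socket {
        void *so_pcb;           /* -> struct inpcb */
};

struct file {
        void *f_data;           /* -> struct socket, for socket descriptors */
};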
One important difference between FreeBSD 6.x and FreeBSD 7.x is that, in
FreeBSD 7.x, we've reduced the degree to which these data structures exist in
isolation. If you look at the mailing list threads discussing the change,
you'll see it described as "strengthening invariants". The most important
part of the change was making it an invariant that so->so_pcb, the pointer
from the socket to the protocol layer state, always remains stable and valid.
This had a number of benefits: because the pointer is always stable, it no
longer requires locks to follow, lowering overhead and improving parallelism.
It also simplifies the code by removing lots of error handling, and improves
code stability by avoiding the inevitable bugs associated with that complex
error handling. If you look at bug reports over the years, we've had quite a
few panics reported (and fixed) caused by the disappearance of protocol-layer
state, such as when a connection is reset while still in use by a process;
these are now all believed to be eliminated.
So the code is faster, cleaner, and more stable. But there are a few
interesting side effects. One is that we retain state at the TCP layer for
longer than we used to. Specifically, if a TCP connection closes, the inpcb
remains allocated until the file descriptor is closed (i.e., the application
notices the connection has closed and invokes close() on the file descriptor).
This has a few impacts: one is that TCP connections now appear in netstat in
the CLOSED state for longer than before, and another is that open sockets that
are associated with CLOSED TCP connections now count against the global
resource limit on the number of simultaneous TCP connections.
I say "longer than before", but I should be clear that, in practice, assuming
all is working properly, there's no measurable behavioral change *except* for
improved performance, cleanliness, and stability. This is because
applications generally open a socket, run a protocol, and when the protocol
wraps up, they then close() the file descriptor in order to close the
connection.
So, with that introduction, we're interested in resolving:
(1) Is this an application bug (leaking file descriptors) that only manifests
in 7.x due to changes in kernel state management, leading to the sockets
being visible in netstat and counting against the resource limit?
(2) Is this a *new* bug in TCP in 7.x, perhaps a result of the state-related
changes I've described?
(3) Is this an *old* bug in TCP that is only now manifesting because of the
changes in kernel state management?
The first is the easiest to resolve, as all we need to do is see whether the
number of file descriptors for the application goes upwards in an improbable
manner. You can use fstat, procstat, sockstat, or various other tools (such
as lsof) to see whether the process is leaking file descriptors. You can also
instrument your application to keep track of the file descriptor numbers being
returned to see whether, perhaps, that number only goes up over time, and gets
really big.
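One cheap way to do that instrumentation, sketched here with a hypothetical
wrapper (the name and logging are illustrative, not part of any real API), is
to log every descriptor that accept() hands back; if the numbers only ever
climb, something is not being closed:

#include <stdio.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Hypothetical wrapper: log each file descriptor returned by accept(). */
static int
accept_logged(int s, struct sockaddr *addr, socklen_t *addrlen)
{
        int fd = accept(s, addr, addrlen);

        if (fd >= 0)
                fprintf(stderr, "accept() returned fd %d\n", fd);
        return (fd);
}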
If it turns out that your application *is* properly closing sockets, then we
need to decide if perhaps we're looking at a race in close and state
management. In particular, I'll need the output of "netstat -na", "vmstat
-z", and "vmstat -m" from the machine once it's in its rather wedged-up state.
It would be most helpful if you could actually shut down to single-user mode,
killing all user processes, then wait ten minutes and capture the output of
the above commands to files that you can then e-mail to me.
Without accusing you of having buggy code, I should say that I think there's a
reasonable chance that what you're seeing is an interaction between an
existing leak of resources in the application and the way the kernel state
management has changed. The output from netstat pretty precisely matches what
you'd expect: lots of TCP connections in the CLOSED state, reflecting a series
of connections built by the application but then not properly discarded.
Likewise, when the application is killed, all of the connections go away --
most likely because the file descriptors are all closed, allowing them to be
garbage collected and connection state freed. If it is this sort of bug, then
most likely you're missing a call to close() in a work loop somewhere, and in
some exceptional case, you fall out of the loop without calling close().
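The classic shape of that bug looks something like this (purely hypothetical
code to illustrate the pattern, not your application):

#include <unistd.h>

/*
 * Hypothetical per-connection handler: the early return on the error
 * path skips close(), so the descriptor -- and the CLOSED connection
 * behind it -- lingers until the process exits.
 */
static void
handle_connection(int fd)
{
        char len[2];

        if (read(fd, len, sizeof(len)) <= 0)
                return;         /* BUG: should close(fd) before returning */

        /* ... read the request, write the reply ... */

        close(fd);
}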
If it turns out that you can get to single-user, wait ten minutes to make sure
all the connections wind down, and there are still connections visible in
netstat, then we may indeed be looking at a kernel bug, and the debugging
information using netstat and vmstat will allow us to start to investigate.
Robert N M Watson
Computer Laboratory
University of Cambridge
netstat -n gives:
...
tcp4 0 0 1.2.3.4.* 4.5.6.7.42149 CLOSED
tcp4 39 0 1.2.3.4.* 4.5.6.7.54103 CLOSED
tcp4 35 0 1.2.3.4.* 4.5.6.7.41718 CLOSED
tcp4 38 0 1.2.3.4.* 4.5.6.7.55618 CLOSED
tcp4 41 0 1.2.3.4.* 4.5.6.7.44230 CLOSED
tcp4 39 0 1.2.3.4.* 4.5.6.7.49439 CLOSED
...
These never go away; they gradually increase and increase until the
application starts giving errors (probably because some socket or
file descriptor limit is reached). When the application is killed these
entries disappear.
The application in question is a self-written DNS server, multithreaded, and
it has run fine for years without any trouble on both BSD 5.x and 6.x, and on
both 32-bit and 64-bit 6.x.
Of course that doesn't mean the application is error-free; however, after
doing extensive testing I really cannot find anything wrong with the
application itself, so I'm thinking maybe there's a change somewhere that
causes this? I know that the TCP/network code has been completely redone...
What basically happens in the application is this (sketched in code below):
- one main TCP thread runs an infinite while loop waiting for new
  connections to arrive
- as soon as one arrives, a new thread is spawned that handles the newly
  created stream
- it reads some bytes, writes some bytes, then closes it
- the thread exits
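In rough, simplified form (an illustrative sketch, not the actual source), it
looks like this:

#include <pthread.h>
#include <stdint.h>
#include <unistd.h>
#include <sys/socket.h>

/* Simplified sketch of the accept loop and per-connection thread. */
static void *
tcp_worker(void *arg)
{
        int fd = (int)(intptr_t)arg;
        unsigned char lenbuf[2];

        /* Read the 2-byte DNS/TCP length prefix; 0 bytes means EOF. */
        if (read(fd, lenbuf, sizeof(lenbuf)) == sizeof(lenbuf)) {
                /* ... read the query, write the response ... */
        }
        close(fd);
        return (NULL);
}

static void
tcp_main_loop(int listen_fd)
{
        pthread_t t;

        for (;;) {
                int fd = accept(listen_fd, NULL, NULL);

                if (fd < 0)
                        continue;
                pthread_create(&t, NULL, tcp_worker, (void *)(intptr_t)fd);
                pthread_detach(t);
        }
}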
What appears to happen is this: after the new thread is spawned it tries to
read 2 bytes (the DNS TCP length prefix). It gets back 0 bytes (EOF) and
therefore closes the socket and calls pthread_exit(). However, in netstat that
same stream often appears to have bytes 'stuck' in the receive queue...
I really can't see how this can cause sockets hanging in the 'CLOSED' state.
Even if the incoming queue isn't read entirely, a call to close() should close
it. Also, I really can't find any documentation, in netstat or elsewhere, about
the 'CLOSED' state...
Any help would be greatly appreciated!
Kind Regards,
Ali Niknam
_______________________________________________
freebsd-net@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to "[EMAIL PROTECTED]"