Hi John, On 25/09/15 18:42, John Baldwin wrote: > On Thursday, September 24, 2015 02:13:24 PM Julien Charbon wrote: >> So the issue is: >> >> - tcp_close() is called for some reasons with an inp in INP_TIMEWAIT >> state and sets the INP_DROPPED flag, >> - tcp_detach() is called when the last reference on socket is dropped >> >> then now in_pcbfree() can be called twice instead of once: >> >> 1. First in tcp_detach(): >> >> static void >> tcp_detach(struct socket *so, struct inpcb *inp) >> { >> struct tcpcb *tp; >> tp = intotcpcb(inp); >> >> if (inp->inp_flags & INP_TIMEWAIT) { >> if (inp->inp_flags & INP_DROPPED) { >> in_pcbdetach(inp); >> in_pcbfree(inp); <-- >> } >> >> 2. Second when tcptw expires here: >> >> void >> tcp_twclose(struct tcptw *tw, int reuse) >> { >> struct socket *so; >> struct inpcb *inp; >> >> inp = tw->tw_inpcb; >> >> tcp_tw_2msl_stop(tw, reuse); >> inp->inp_ppcb = NULL; >> in_pcbdrop(inp); >> >> so = inp->inp_socket; >> if (so != NULL) { >> ... >> } else { >> in_pcbfree(inp); <-- >> } >> >> This behavior is backed by Palle kernel panic backstraces and coredumps. >> >> o Solutions: >> >> Long: Forbid to call tcp_close() when inp is in INP_TIMEWAIT state, >> the TCP stack rule being: >> >> - if !INP_TIMEWAIT: Call tcp_close() >> - if INP_TIMEWAIT: Call tcp_twclose() (or call nothing, the tcptw will >> expire/be recycled anyway) >> >> Short: >> if INP_TIMEWAIT & INP_DROPPED: >> Do not call in_pcbfree(inp) in tcp_detach() unless tcptw is already >> discarded. >> >> The long solution seems cleaner, backed by tcp_detach() old comments >> and most of current tcp_close() calls but I won't take that longer path >> without -net approval first. > > I prefer the longer solution if it keeps tcp_detach() simpler by avoiding > an extra condition. Please just document it via assertions in tcp_close() > (or is this the assertion that fired and triggered the reported panic? If so, > then you obviously don't need to add it. :-P)
Thanks for your answer. And indeed tcp_detach() will be kept simpler with the longer solution, I will introduce the new assertion in tcp_close(), something like : struct tcpcb * tcp_close(struct tcpcb *tp) { ... /* * tcp_close() should not called on TIME_WAIT connections. * These connections should be either teardown using * tcp_twclose(), or left alone waiting for TIME_WAIT timeout * expiration. */ KASSERT((inpb->inp_flags & INP_TIMEWAIT) == 0, ("tcp_close cannot be called on TIME_WAIT connections")); And I will fix all paths that could lead to calling tcp_close() on TCP TIME_WAIT connections (that is why this solution will be longer). I found three potential paths so far. -- Julien
signature.asc
Description: OpenPGP digital signature