2011/10/4 Mikolaj Golub : > > On Sat, 1 Oct 2011 14:15:45 +0800 dave jones wrote: > > dj> On Fri, Sep 30, 2011 at 9:41 PM, Robert Watson wrote: > >> > >> On Wed, 28 Sep 2011, Mikolaj Golub wrote: > >> > >>> On Mon, 26 Sep 2011 16:12:55 +0200 K. Macy wrote: > >>> > >>> KM> Sorry, didn't look at the images (limited bw), I've seen something > KM> > >>> like this before in timewait. This "can't happen" with UDP so will be KM> > >>> interested in learning more about the bug. > >>> > >>> The panic can be easily triggered by this: > >> > >> Hi: > >> > >> Just catching up on this thread. I think the analysis here is generally > >> right: in 9.0, you're much more likely to see an inpcb with its in_socket > >> pointer cleared in the hash list than in prior releases, and > >> in_pcbbind_setup() trips over this. > >> > >> However, at least on first glance (and from the perspective of invariants > >> here), I think the bug is actualy that in_pcbbind_setup() is asking > >> in_pcblookup_local() for an inpcb and then access the returned inpcb's > >> in_socket pointer without acquiring a lock on the inpcb. Structurally, it > >> can't acquire this lock for lock order reasons -- it already holds the > lock > >> on its own inpcb. Therefore, we should only access fields that are safe > to > >> follow in an inpcb when you hold a reference via the hash lock and not a > >> lock on the inpcb itself, which appears generally OK (+/-) for all the > >> fields in that clause but the t->inp_socket->so_options dereference. > >> > >> A preferred fix would cache the SO_REUSEPORT flag in an inpcb-layer field, > >> such as inp_flags2, giving us access to its value without having to walk > >> into the attached (or not) socket. > >> > >> This raises another structural question, which is whether we need a new > >> inp_foo flags field that is protected explicitly by the hash lock, and not > >> by the inpcb lock, which could hold fields relevant to address binding. I > >> don't think we need to solve that problem in this context, as a slightly > >> race on SO_REUSEPORT is likely acceptable. > >> > >> The suggested fix does perform the desired function of explicitly > detaching > >> the inpcb from the hash list before the socket is disconnected from the > >> inpcb. However, it's incomplete in that the invariant that's being broken > is > >> also relied on for other protocols (such as raw sockets). The correct > >> invariant is that inp_socket is safe to follow unconditionally if an inpcb > >> is locked and INP_DROPPED isn't set -- the bug is in "locked" not in > >> "INP_DROPPED", which is why I think this is the wrong fix, even though it > >> prevents a panic :-). > > dj> Hello Robert, > > dj> Thank you for taking your valuable time to find out the problem. > dj> Since I don't have idea about network internals, would you have a patch > dj> about this? I'd be glad to test it, thanks again. > > Here is the patch that implements what Robert suggests. > > Dave, could you test it?
Sure. Thanks for cooking the patch. Machines have been running two days now without panic. > -- > Mikolaj Golub Best regards, Dave. _______________________________________________ freebsd-net@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-net To unsubscribe, send any mail to "freebsd-net-unsubscr...@freebsd.org"