From: Jesse Brandeburg <[EMAIL PROTECTED]>
Date: Fri, 14 Apr 2006 15:55:10 -0700 (Pacific Daylight Time)
> I'm trying to isolate more of a reproduction case, I'll be sure to
> post if I can find anything with more detail.

I think I see the bug.

If tbench with large numbers of clients is part of what helps
reproduce it, the key might be hitting the memory limits in tcp_mem[]
and friends, or something to do with concurrent access to
sk->sk_forward_alloc.  I bet there is some race in there.  A lot of
the action is in net/core/stream.c

We modify sk->sk_forward_alloc non-atomically, but that should be OK
since we ought to be holding all of the correct locks when we hit
these accesses.  Still, it is the first thing to audit.

Let's look at sk_stream_rfree(), as it is invoked from SKB freeing
callbacks and is the most likely suspect for these kinds of problems.
It is hooked up to skb->destructor by sk_stream_set_owner_r() and then
invoked via __kfree_skb().

Nothing here takes any locks, and as stated above we modify
sk->sk_forward_alloc non-atomically, so this is the bug.  Shit.

I'll think about how to fix this in the least invasive manner.  I also
want to search the changelog history to see whether this race was
always present or whether it was "introduced" at some point.

Making sk->sk_forward_alloc an atomic_t would be incredibly expensive,
so I'll try to find a way to avoid that.  We may be able to just do a
bh_lock_sock()/bh_unlock_sock() around the body of sk_stream_rfree()
to fix this.
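For reference, the destructor path discussed above looked roughly like
this in 2.6-era kernels.  This is a sketch reconstructed from memory of
include/net/sock.h / net/core/stream.c, not copied verbatim from any
particular tree, and details may differ between versions:

	/* Sketch of the path under discussion.  sk_stream_set_owner_r()
	 * charges the skb against the socket's receive memory and installs
	 * sk_stream_rfree() as the destructor; __kfree_skb() later invokes
	 * that destructor to return the memory. */
	static inline void sk_stream_rfree(struct sk_buff *skb)
	{
		struct sock *sk = skb->sk;

		atomic_sub(skb->truesize, &sk->sk_rmem_alloc);
		/* Non-atomic read-modify-write: this path takes no lock,
		 * so a concurrent writer can race with the update. */
		sk->sk_forward_alloc += skb->truesize;
	}

	static inline void sk_stream_set_owner_r(struct sk_buff *skb,
						 struct sock *sk)
	{
		skb->sk = sk;
		skb->destructor = sk_stream_rfree;
		atomic_add(skb->truesize, &sk->sk_rmem_alloc);
		sk->sk_forward_alloc -= skb->truesize;
	}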
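And a sketch of the least-invasive fix floated at the end of the mail,
i.e. serializing the destructor's sk_forward_alloc update with the
socket's BH spinlock.  This is only an illustration of the idea, not a
submitted patch; whether it is safe in every context __kfree_skb() can
run from would still need to be audited:

	/* Illustration of the bh_lock_sock()/bh_unlock_sock() idea: take
	 * the socket's backlog spinlock around the non-atomic
	 * sk_forward_alloc update so concurrent writers cannot race with
	 * the destructor. */
	static inline void sk_stream_rfree(struct sk_buff *skb)
	{
		struct sock *sk = skb->sk;

		atomic_sub(skb->truesize, &sk->sk_rmem_alloc);

		bh_lock_sock(sk);
		sk->sk_forward_alloc += skb->truesize;
		bh_unlock_sock(sk);
	}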