Ben Greear wrote:
If my debugging code is correct, I've tracked down the
leaked neighbour structure as being referenced here:

    if (!(neigh->nud_state & (NUD_STALE | NUD_INCOMPLETE))) {
        if (neigh->parms->mcast_probes + neigh->parms->app_probes) {
            atomic_set(&neigh->probes, neigh->parms->ucast_probes);
            neigh->nud_state     = NUD_INCOMPLETE;
***            neigh_hold(neigh, NDRK_NEIGH_TIMER);
            neigh->timer.expires = now + 1;
            add_timer(&neigh->timer);


From looking at this code:

static int neigh_del_timer(struct neighbour *n)
{
    if ((n->nud_state & NUD_IN_TIMER) &&
        del_timer(&n->timer)) {
        neigh_release(n, NDRK_NEIGH_TIMER);
        return 1;
    }
    return 0;
}

Shouldn't we always set neigh->nud_state |= NUD_IN_TIMER (or something
similar) before calling add_timer()?

I added the neigh->nud_state |= NUD_IN_TIMER;
logic to neighbour.c, and since then I have not been able to reproduce
the problem when removing network interfaces.  So, I think that
was the problem!
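
To make the suspected leak concrete, here is a minimal user-space sketch
of the reference counting involved.  It is not kernel code: the toy_*
names, the struct layout and the bit values are all invented for
illustration, and only the shape of toy_del_timer() follows the
neigh_del_timer() quoted above.

```c
#include <assert.h>

/* Toy user-space model.  Names and bit values are hypothetical and
 * are NOT taken from the kernel headers. */
enum {
	TOY_NUD_INCOMPLETE = 0x01,
	TOY_NUD_IN_TIMER   = 0x80
};

struct toy_neigh {
	int refcnt;
	unsigned int nud_state;
	int timer_pending;	/* stands in for a queued kernel timer */
};

static void toy_hold(struct toy_neigh *n)
{
	n->refcnt++;
}

static void toy_release(struct toy_neigh *n)
{
	assert(n->refcnt > 0);
	n->refcnt--;
}

/* Same shape as neigh_del_timer(): the timer's reference is dropped
 * only when the state says a timer is armed. */
static int toy_del_timer(struct toy_neigh *n)
{
	if ((n->nud_state & TOY_NUD_IN_TIMER) && n->timer_pending) {
		n->timer_pending = 0;
		toy_release(n);
		return 1;
	}
	return 0;
}

/* Arm the timer, optionally mark the state, then tear everything
 * down.  Returns the number of references still held afterwards. */
static int toy_refs_after_teardown(int set_in_timer)
{
	struct toy_neigh n = { 1, 0, 0 };	/* owner holds one reference */

	n.nud_state = TOY_NUD_INCOMPLETE;
	toy_hold(&n);				/* reference owned by the timer */
	if (set_in_timer)
		n.nud_state |= TOY_NUD_IN_TIMER;
	n.timer_pending = 1;			/* stands in for add_timer() */

	toy_del_timer(&n);
	toy_release(&n);			/* owner drops its reference */
	return n.refcnt;
}
```

In this model, toy_refs_after_teardown(0) leaves one reference behind
because toy_del_timer() never releases the timer's hold, while
toy_refs_after_teardown(1) ends with zero references.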

The full patch, including neighbour ref counting debugging
is available for review here:

http://www.candelatech.com/oss/neigh_ref.patch

This patch builds on top of my earlier netdev ref counting debugging
patch, and still needs some additional cleanup.  It also uses the same
malloc-using logic that the netdev code uses.

I'm also attaching a patch that just fixes the problem,
with no debugging info.  (Compiled but not tested by
itself.)

Signed-off-by: Ben Greear <[EMAIL PROTECTED]>


--
Ben Greear <[EMAIL PROTECTED]>
Candela Technologies Inc  http://www.candelatech.com

diff --git a/net/core/neighbour.c b/net/core/neighbour.c
--- a/net/core/neighbour.c
+++ b/net/core/neighbour.c
@@ -1,4 +1,4 @@
-/*
+/* -*- linux-c -*-
  *	Generic address resolution entity
  *
  *	Authors:
@@ -853,6 +853,7 @@ int __neigh_event_send(struct neighbour 
 			neigh->nud_state     = NUD_INCOMPLETE;
 			neigh_hold(neigh);
 			neigh->timer.expires = now + 1;
+			neigh->nud_state |= NUD_IN_TIMER;
 			add_timer(&neigh->timer);
 		} else {
 			neigh->nud_state = NUD_FAILED;
@@ -867,6 +868,7 @@ int __neigh_event_send(struct neighbour 
 		neigh_hold(neigh);
 		neigh->nud_state = NUD_DELAY;
 		neigh->timer.expires = jiffies + neigh->parms->delay_probe_time;
+		neigh->nud_state |= NUD_IN_TIMER;
 		add_timer(&neigh->timer);
 	}
 
@@ -1016,6 +1018,7 @@ int neigh_update(struct neighbour *neigh
 			neigh->timer.expires = jiffies + 
 						((new & NUD_REACHABLE) ? 
 						 neigh->parms->reachable_time : 0);
+			neigh->nud_state |= NUD_IN_TIMER;
 			add_timer(&neigh->timer);
 		}
 		neigh->nud_state = new;
