> > The second one seems to be trickier. It looks like a race wrt. PADT
> > message reception. Reproducing the bug will probably require to
> > generate some PADT flooding to a host that creates and releases PPPoE
> > connections.

OK, I think I can see the potential race here: specifically, the PADT
frame is received while the pppoe interface is being deleted. (I will
have a go at inducing this with msleep() in the code tomorrow.)

1. pppoe_flush_dev() - sk->sk_state = PPPOX_DEAD,   po->pppoe_dev = NULL
2. pppoe_connect()   - sk->sk_state = PPPOX_NONE,   po->pppoe_dev = NULL
3. pppoe_disc_rcv()  - sk->sk_state = PPPOX_ZOMBIE, po->pppoe_dev = NULL
4. pppoe_release()   - dev_put(po->pppoe_dev) ----> Oops

Either in pppoe_disc_rcv() we add the condition:

@@ -496,7 +499,8 @@ static int pppoe_disc_rcv(struct sk_buff *skb, struct net_device *dev,
 		/* We're no longer connect at the PPPOE layer,
 		 * and must wait for ppp channel to disconnect us.
 		 */
-		sk->sk_state = PPPOX_ZOMBIE;
+		if (sk->sk_state & PPPOX_CONNECTED)
+			sk->sk_state = PPPOX_ZOMBIE;
 	}

Or perhaps we remove the assumption that a socket in state PPPOX_ZOMBIE
has a non-null pppoe_dev. I don't know why the code isn't like the
following anyway:

-if (sk->sk_state & (PPPOX_CONNECTED | PPPOX_BOUND | PPPOX_ZOMBIE)) {
+if (po->pppoe_dev) {
 	dev_put(po->pppoe_dev);
 	po->pppoe_dev = NULL;
 }
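
To sanity-check the four-step sequence above without touching the
kernel, here is a rough userspace toy model. It is only a sketch: the
model_* names and structs are made up for illustration, the state values
are what I believe if_pppox.h defines (treat them as an assumption), and
there is no locking or real concurrency, just the steps replayed in the
order listed above. It reports whether the release step would end up
calling dev_put() on a NULL pointer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed to mirror the PPPOX_* state bits in if_pppox.h. */
enum {
	PPPOX_NONE      = 0,
	PPPOX_CONNECTED = 1,
	PPPOX_BOUND     = 2,
	PPPOX_RELAY     = 4,
	PPPOX_ZOMBIE    = 8,
	PPPOX_DEAD      = 16,
};

/* Hypothetical stand-ins for struct net_device and struct pppox_sock. */
struct model_dev  { int refcnt; };
struct model_sock { int sk_state; struct model_dev *pppoe_dev; };

/* Step 1: the device goes away; the socket drops its reference. */
static void model_flush_dev(struct model_sock *po)
{
	po->sk_state = PPPOX_DEAD;
	if (po->pppoe_dev) {
		po->pppoe_dev->refcnt--;
		po->pppoe_dev = NULL;
	}
}

/* Step 2: connect() resets the state but leaves pppoe_dev NULL. */
static void model_connect_reset(struct model_sock *po)
{
	po->sk_state = PPPOX_NONE;
}

/* Step 3: PADT received; 'guarded' applies the first proposed fix. */
static void model_disc_rcv(struct model_sock *po, bool guarded)
{
	if (!guarded || (po->sk_state & PPPOX_CONNECTED))
		po->sk_state = PPPOX_ZOMBIE;
}

/* Step 4: true if release would call dev_put() on a NULL pointer.
 * 'check_ptr' applies the second proposed fix (test the pointer). */
static bool model_release_would_oops(const struct model_sock *po,
				     bool check_ptr)
{
	bool wants_put = check_ptr
		? po->pppoe_dev != NULL
		: (po->sk_state &
		   (PPPOX_CONNECTED | PPPOX_BOUND | PPPOX_ZOMBIE)) != 0;

	return wants_put && po->pppoe_dev == NULL;
}

/* Replay steps 1-4 in order on a freshly connected socket. */
static bool model_run(bool guard_disc_rcv, bool release_checks_ptr)
{
	struct model_dev dev = { .refcnt = 1 };
	struct model_sock po = { .sk_state = PPPOX_CONNECTED,
				 .pppoe_dev = &dev };

	model_flush_dev(&po);
	assert(dev.refcnt == 0);	/* reference dropped exactly once */
	model_connect_reset(&po);
	model_disc_rcv(&po, guard_disc_rcv);
	return model_release_would_oops(&po, release_checks_ptr);
}
```

In this model, model_run(false, false) reports the bad dev_put(), while
model_run(true, false) and model_run(false, true) do not, so either
change on its own closes this particular ordering. It says nothing about
other interleavings, which is why I still want to reproduce it for real.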