On Fri, 2007-10-19 at 13:36 +0800, Herbert Xu wrote: > On Fri, Oct 19, 2007 at 12:20:25PM +0800, Herbert Xu wrote: > > > > In fact this bug exists elsewhere too. For example, the network > > stack does this in net/sched/sch_generic.c: > > > > /* Wait for outstanding qdisc_run calls. */ > > while (test_bit(__LINK_STATE_QDISC_RUNNING, &dev->state)) > > yield(); > > > > This has the same problem as the current synchronize_irq code. >
> diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c > index e01d576..b3b7420 100644 > --- a/net/sched/sch_generic.c > +++ b/net/sched/sch_generic.c > @@ -556,6 +556,7 @@ void dev_deactivate(struct net_device *dev) > { > struct Qdisc *qdisc; > struct sk_buff *skb; > + int running; > > spin_lock_bh(&dev->queue_lock); > qdisc = dev->qdisc; > @@ -571,12 +572,31 @@ void dev_deactivate(struct net_device *dev) > > dev_watchdog_down(dev); > > - /* Wait for outstanding dev_queue_xmit calls. */ > + /* Wait for outstanding qdisc-less dev_queue_xmit calls. */ > synchronize_rcu(); > > /* Wait for outstanding qdisc_run calls. */ > - while (test_bit(__LINK_STATE_QDISC_RUNNING, &dev->state)) > - yield(); > + do { > + while (test_bit(__LINK_STATE_QDISC_RUNNING, &dev->state)) > + yield(); > + Ouch!, is there really no sane locking alternative? Hashed waitqueues like for the page lock come to mind. > + /* > + * Double-check inside queue lock to ensure that all effects > + * of the queue run are visible when we return. > + */ > + spin_lock_bh(&dev->queue_lock); > + running = test_bit(__LINK_STATE_QDISC_RUNNING, &dev->state); > + spin_unlock_bh(&dev->queue_lock); > + > + /* > + * The running flag should never be set at this point because > + * we've already set dev->qdisc to noop_qdisc *inside* the > same > + * pair of spin locks. That is, if any qdisc_run starts after > + * our initial test it should see the noop_qdisc and then > + * clear the RUNNING bit before dropping the queue lock. So > + * if it is set here then we've found a bug. > + */ > + } while (WARN_ON_ONCE(running)); > } > > void dev_init_scheduler(struct net_device *dev) - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html