On Wed, Dec 20, 2017 at 3:05 PM, John Fastabend <john.fastab...@gmail.com> wrote: > On 12/20/2017 02:41 PM, Cong Wang wrote: >> On Wed, Dec 20, 2017 at 12:09 PM, John Fastabend >> <john.fastab...@gmail.com> wrote: >>> RCU grace period is needed for lockless qdiscs added in the commit >>> c5ad119fb6c09 ("net: sched: pfifo_fast use skb_array"). >>> >>> It is needed now that qdiscs may be lockless otherwise we risk >>> free'ing a qdisc that is still in use from datapath. Additionally, >>> push list cleanup into RCU callback. Otherwise we risk the datapath >>> adding skbs during removal. >> >> What about qdisc_graft() -> dev_deactivate() -> synchronize_net() ? >> It doesn't work with your "lockless" patches? >> > > Well this is only in the 'parent == NULL' case otherwise we call > cops->graft(). Most sch_* seem to use qdisc_replace and this uses > sch_tree_lock(). > > The only converted qdisc mq and mqprio at this point don't care > though and do their own dev_deactivate/activate. So its not fixing > anything in the above mentioned commit.
Sure, removing a class does not impact the whole device, but removing the root qdisc does. After your "lockless", skb_array_consume_bh() is called in pfifo_fast_reset() and ptr_ring_cleanup() is called in pfifo_fast_destroy(), assuming skb_array is not buggy, what race do we have here with datapath? > > I still think it will need to be done eventually. If it resolves > the miniq case it seems like a good idea. Although per Jakub's comment > perhaps I pulled too much into the RCU handler. The case Jakub reported is a RCU callback missing a rcu barrier. I don't understand why you keep believing it is RCU readers on datapath. Not even to mention ingress is not affected by your "lockless" thing.