On Thu, Oct 26, 2017 at 09:39:26PM -0700, Eric Dumazet wrote: > On Thu, 2017-10-26 at 21:28 -0700, Cong Wang wrote: > > On Thu, Oct 26, 2017 at 9:05 PM, Eric Dumazet <eric.duma...@gmail.com> > > wrote: > > > On Thu, 2017-10-26 at 18:24 -0700, Cong Wang wrote: > > >> ... > > > > > >> On the other hand, this makes tcf_block_put() ugly and > > >> harder to understand. Since David and Eric strongly dislike > > >> adding synchronize_rcu(), this is probably the only > > >> solution that could make everyone happy. > > > > > > > > > ... > > > > > >> +static void tcf_block_put_deferred(struct work_struct *work) > > >> +{ > > >> + struct tcf_block *block = container_of(work, struct tcf_block, > > >> work); > > >> + struct tcf_chain *chain; > > >> > > >> + rtnl_lock(); > > >> /* Hold a refcnt for all chains, except 0, in case they are gone. > > >> */ > > >> list_for_each_entry(chain, &block->chain_list, list) > > >> if (chain->index) > > >> @@ -292,13 +308,27 @@ void tcf_block_put(struct tcf_block *block) > > >> list_for_each_entry(chain, &block->chain_list, list) > > >> tcf_chain_flush(chain); > > >> > > >> - /* Wait for RCU callbacks to release the reference count. */ > > >> + INIT_WORK(&block->work, tcf_block_put_final); > > >> + /* Wait for RCU callbacks to release the reference count and make > > >> + * sure their works have been queued before this. > > >> + */ > > >> rcu_barrier(); > > >> + tcf_queue_work(&block->work); > > >> + rtnl_unlock(); > > >> +} > > > > > > > > > On a loaded server, rcu_barrier() typically takes 4 ms. > > > > > > Way better than synchronize_rcu() (about 90 ms) but still an issue when > > > holding RTNL. > > > > > > We have thousands of filters, and management daemon restarts and rebuild > > > TC hierarchy from scratch. > > > > > > Simply getting rid of 1000 old filters might block RTNL for a while, or > > > maybe I misunderstood your patches. > > > > > > > Paul pointed out the same. > > > > As I replied, this rcu_barrier() is NOT added by this patchset, it is > > already > > there in current master branch. > > You added the rtnl_lock() rtnl_unlock()... > > I really do not care if hundreds of tasks (not owning rtnl) call > rcu_barrier()... > > Also we are still using a 4.3 based kernel, and no rcu_barrier() is used > in filters dismantle ( unregister_tcf_proto_ops() is not used in our > workloads ) > > Somehow something went very wrong in net/sched in recent kernels.
Would this be a good time for me to repeat my suggestion that timers be used to aggregate the work done in the workqueue handlers, thus decreasing the number of rcu_barrier() calls done under RTNL? Thanx, Paul