On Thu, 2019-03-07 at 14:51 +0000, Vlad Buslov wrote:
[...]
> On Thu 07 Mar 2019 at 15:56, Davide Caratti <dcara...@redhat.com> wrote:
> > so, I think that the answer to your question:
> >
> > On Wed, 2019-02-27 at 17:50 -0800, Cong Wang wrote:
> > > > > > +	if (oldchain)
> > > > > > +		tcf_chain_put_by_act(oldchain);
> > > > >
> > > > > Do we need to respect RCU grace period here?
> >
> > is a "yes, we do".
> > Now I'm trying something similar to what's done in tcf_bpf_init(), to
> > release the bpf program on 'replace' operations:
> >
> >         if (res == ACT_P_CREATED) {
> >                 tcf_idr_insert(tn, *act);
> >         } else {
> >                 /* make sure the program being replaced is no longer executing */
> >                 synchronize_rcu();
> >                 tcf_bpf_cfg_cleanup(&old);
> >         }
> >
> > do you think it's worth going in this direction?
> > thank you in advance!
>
> Hi Davide,
>
> Using synchronize_rcu() will impact rule update rate performance and I
> don't think we really need it.

Ok; consider that, on the current kernel, chains are not being
freed/de-refcounted at all when TC actions are updated. So, the update
rate performance is going to drop anyway, because of the weight of the
tcf_chain_put_by_act() call we are currently forgetting to make. It makes
sense to RCU-ify a->tcf_goto_chain only if synchronize_rcu() takes a
number of cycles comparable to (or much greater than) that of
tcf_chain_put_by_act().

> I don't see any reason why we can't just
> update chain to be rcu-friendly. Data path is already rcu_read
> protected, in fact it only needs chain to read rcu-pointer to tp list
> when jumping to chain. So it should be enough to do the following:
>
> 1) Update tcf_chain_destroy() to free chain after rcu grace period.
>
> 2) Convert tc_action->goto_chain to be a proper rcu pointer. (mark it
> with "__rcu", assign with rcu_assign_pointer(), read it with
> rcu_dereference{_bh}(), etc.)
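
To make the two steps above concrete, here is a rough userspace sketch
of the pointer handling. It is not kernel code: C11 atomics stand in for
rcu_assign_pointer() / rcu_dereference(), there is no grace-period
machinery at all, and every struct and function name below is
hypothetical, chosen only for illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel objects (not the real layouts). */
struct filter {
	int id;
};

struct chain {
	_Atomic(struct filter *) filter_chain;	/* analogue of the __rcu tp list head */
	int index;
};

struct action {
	_Atomic(struct chain *) goto_chain;	/* analogue of an __rcu goto_chain */
};

/*
 * Update side: publish newchain and hand back the previous chain.
 * In the kernel this would run with the action lock held, and the
 * caller would have to defer the put/free of the returned chain
 * until readers can no longer see it (the part this sketch omits).
 */
static struct chain *action_swap_chain(struct action *a, struct chain *newchain)
{
	assert(a != NULL);
	return atomic_exchange_explicit(&a->goto_chain, newchain,
					memory_order_acq_rel);
}

/*
 * Read side: two dependent loads, action -> chain, then chain -> filter.
 * Acquire loads are used here as a conservative stand-in for
 * rcu_dereference().
 */
static struct filter *action_goto_chain_exec(struct action *a)
{
	struct chain *c;

	assert(a != NULL);
	c = atomic_load_explicit(&a->goto_chain, memory_order_acquire);
	if (!c)
		return NULL;
	return atomic_load_explicit(&c->filter_chain, memory_order_acquire);
}
```

The swap returns the old chain precisely so that its release can be
deferred; freeing it immediately would reintroduce the use-after-free
that the grace-period question is about.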

it seems feasible, with some attention points:

1) replacing the 'goto chain' in the init() function will then become

        rcu_swap_protected(p->tcf_goto_chain, newchain,
                           lockdep_is_held(&p->tcf_lock));

   with p->tcf_lock held, and we will have to do this unconditionally,
   also on non-update paths (it should have the same cost in CPU cycles
   as the rcu init / assign code). Unlike synchronize_rcu(), which would
   happen only in the update path of goto_chain actions, this is a fee
   that we pay in every path.

2) in tcf_action_goto_chain_exec(), we would have two "cascaded"
   rcu_dereference() calls: action->chain and chain->filter.

Is this design acceptable?

thanks,
--
davide