On Wed, Sep 5, 2018 at 12:05 AM Vlad Buslov <vla...@mellanox.com> wrote: > > > On Tue 04 Sep 2018 at 22:41, Cong Wang <xiyou.wangc...@gmail.com> wrote: > > On Mon, Sep 3, 2018 at 1:33 PM Vlad Buslov <vla...@mellanox.com> wrote: > >> > >> > >> On Mon 03 Sep 2018 at 18:50, Cong Wang <xiyou.wangc...@gmail.com> wrote: > >> > On Mon, Sep 3, 2018 at 12:06 AM Vlad Buslov <vla...@mellanox.com> wrote: > >> >> > >> >> Action API was changed to work with actions and action_idr in > >> >> concurrency > >> >> safe manner, however tcf_del_walker() still uses actions without taking > >> >> reference to them first and deletes them directly, disregarding possible > >> >> concurrent delete. > >> >> > >> >> Change tcf_del_walker() to use tcf_idr_delete_index() that doesn't > >> >> require > >> >> caller to hold reference to action and accepts action id as argument, > >> >> instead of direct action pointer. > >> > > >> > Hmm, why doesn't tcf_del_walker() just take idrinfo->lock? At least > >> > tcf_dump_walker() already does. > >> > >> Because tcf_del_walker() calls __tcf_idr_release(), which take > >> idrinfo->lock itself (deadlock). It also calls sleeping functions like > > > > Deadlock can be easily resolved by moving the lock out. > > > > > >> tcf_action_goto_chain_fini(), so just implementing function that > >> releases action without taking idrinfo->lock is not enough. > > > > Sleeping can be resolved either by making it atomic or > > deferring it to a work queue. > > > > None of your arguments here is a blocker to locking > > idrinfo->lock. You really should focus on if it is really > > necessary to lock idrinfo->lock in tcf_del_walker(), rather > > than these details. > > > > For me, if you need idrinfo->lock for dump walker, you must > > need it for delete walker too, because deletion is a writer > > which should require stronger protection than the dumper, > > which merely a reader. > > I don't get how it is necessary. Dump walker uses pointers to actions > directly, and in order to be concurrency-safe it must either hold the
It uses the pointer in a read-only way, what you said doesn't change the fact that it is a reader. And, like other readers, it may not need to lock at all, which is a different topic. > lock or obtain reference to action. Note that del walker doesn't use the > action pointer, it only passed action id to tcf_idr_delete_index() > function, which does all the necessary locking and can deal with > potential concurrency issues (concurrent delete, etc.). This approach > also benefits from code reuse from other code paths that delete actions, > instead of implementing its own. Look at the difference below. With your change: idr_for_each_entry_ul{ spin_lock(&idrinfo->lock); idr_remove(); spin_unlock(&idrinfo->lock); } With what I suggest: spin_lock(&idrinfo->lock); idr_for_each_entry_ul{ idr_remove(); } spin_unlock(&idrinfo->lock); Isn't a concurrent tcf_idr_check_alloc() able to livelock here with your change? idr_for_each_entry_ul{ spin_lock(&idrinfo->lock); idr_remove(); spin_unlock(&idrinfo->lock); // tcf_idr_check_alloc() jumps in, // allocates next ID which can be found // by idr_get_next_ul() } // the whole loop goes _literately_ infinite... Also, idr_for_each_entry_ul() is supposed to be protected either by RCU or idrinfo->lock, no? With your change or without any change, it doesn't even have any lock after removing RTNL?