On Tue, Oct 17, 2017 at 5:58 PM, Chris Mi <chr...@mellanox.com> wrote:
>
>
>> -----Original Message-----
>> From: Cong Wang [mailto:xiyou.wangc...@gmail.com]
>> Sent: Wednesday, October 18, 2017 12:56 AM
>> To: Chris Mi <chr...@mellanox.com>
>> Cc: Linux Kernel Network Developers <netdev@vger.kernel.org>; Jamal Hadi
>> Salim <j...@mojatatu.com>; Lucas Bates <luc...@mojatatu.com>; Jiri Pirko
>> <j...@resnulli.us>; David Miller <da...@davemloft.net>
>> Subject: Re: [patch net v3 2/4] net/sched: Use action array instead of action
>> list as parameter
>>
>> On Mon, Oct 16, 2017 at 6:20 PM, Chris Mi <chr...@mellanox.com> wrote:
>> > When destroying filters, actions should be destroyed first.
>> > The pointers of each action are saved in an array. TC doesn't use the
>> > array directly, but put all actions in a doubly linked list and use
>> > that list as parameter.
>> >
>> > There is no problem if each filter has its own actions. But if some
>> > filters share the same action, when these filters are destroyed, RCU
>> > callback fl_destroy_filter() may be called at the same time. That
>> > means the same action's 'struct list_head list'
>> > could be manipulated at the same time. It may point to an invalid
>> > address so that system will panic.
>>
>> So if we remove these RCU callbacks (by adding a synchronize_rcu()) this is
>> not a problem, right?
> Maybe you are right. But do you think it will cause performance issue, I mean
> it takes longer time to destroy filters if using synchronize_rcu()?

Yeah, this is why I said it is arguable. This is the slow path anyway,
and the RTNL lock is already a bottleneck. ;)

> Or is there any other races than RCU callbacks?
> We haven't found them. This is the only one we found.

I wouldn't complain if this were the only case, however we have already
fixed at least 2 race-condition bugs caused by these RCU callbacks...

Take a look at this commit, all of its complexity is because of the RCU
callback:

commit 1697c4bb5245649a23f06a144cc38c06715e1b65
Author: Cong Wang <xiyou.wangc...@gmail.com>
Date:   Mon Sep 11 16:33:32 2017 -0700

    net_sched: carefully handle tcf_block_put()

Also this one:

commit c78e1746d3ad7d548bdf3fe491898cc453911a49
Author: Daniel Borkmann <dan...@iogearbox.net>
Date:   Wed May 20 17:13:33 2015 +0200

    net: sched: fix call_rcu() race on classifier module unloads
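
For readers following the thread, here is a rough userspace sketch of why a
shared action's embedded 'struct list_head list' is fragile. It is not kernel
code: the list helpers are minimal reimplementations modelled on the kernel's,
and struct action, list_is_closed() and first_list_intact_after_sharing() are
hypothetical names made up for this illustration. It runs single-threaded and
only shows the end state that two overlapping destroy paths could leave behind
when each links the same node into its own temporary list.

```c
#include <assert.h>
#include <stdbool.h>

/* Minimal doubly linked list, modelled on the kernel's struct list_head. */
struct list_head {
    struct list_head *next, *prev;
};

static void init_list_head(struct list_head *head)
{
    head->next = head;
    head->prev = head;
}

static void list_add_tail(struct list_head *node, struct list_head *head)
{
    struct list_head *prev = head->prev;

    node->next = head;
    node->prev = prev;
    prev->next = node;
    head->prev = node;
}

/* Hypothetical stand-in for an action shared by two filters. */
struct action {
    int id;
    struct list_head list; /* single embedded node: one list at a time */
};

/* True if walking forward from head returns to head within limit steps. */
static bool list_is_closed(const struct list_head *head, int limit)
{
    const struct list_head *pos = head->next;

    for (int i = 0; i < limit; i++) {
        if (pos == head)
            return true;
        pos = pos->next;
    }
    return false;
}

/*
 * Two "filters" each build a private list of actions to destroy, and both
 * link in the same shared action. Returns whether the first list is still
 * a well-formed ring afterwards.
 */
static bool first_list_intact_after_sharing(void)
{
    struct list_head filter_a_actions, filter_b_actions;
    struct action shared = { .id = 1 };

    init_list_head(&filter_a_actions);
    init_list_head(&filter_b_actions);

    list_add_tail(&shared.list, &filter_a_actions);
    assert(list_is_closed(&filter_a_actions, 8)); /* fine so far */

    /* Second destroy path reuses the same node without unlinking it. */
    list_add_tail(&shared.list, &filter_b_actions);

    /* The node now points into list B, so list A never loops back. */
    return list_is_closed(&filter_a_actions, 8);
}
```

Calling first_list_intact_after_sharing() reports that the first list is no
longer closed: its head still points at the shared node, but the node's
pointers now belong to the second list. Passing the action array directly, as
the patch proposes, avoids touching the shared node at all; serializing with
synchronize_rcu() avoids the overlap instead.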