On Mon 25 Feb 2019 at 22:39, Cong Wang <xiyou.wangc...@gmail.com> wrote: > On Mon, Feb 25, 2019 at 8:11 AM Vlad Buslov <vla...@mellanox.com> wrote: >> >> >> On Fri 22 Feb 2019 at 19:32, Cong Wang <xiyou.wangc...@gmail.com> wrote: >> > >> > So if it is no longer RCU any more, why do you still use >> > rcu_dereference_protected()? That is, why not just deref it as a raw >> > pointer? > > > Any answer for this question?
I decided that since there is neither possibility of concurrent pointer assignment nor deallocation of object that it points to, most performant solution would be using rcu_dereference_protected() which is the only RCU dereference helper that doesn't use READ_ONCE. I now understand that this is confusing (and most likely doesn't provide any noticeable performance improvement anyway!) and will change this patch to use rcu_dereference_raw() as you suggest. > > >> > >> > And, I don't think I can buy your argument here. The RCU infrastructure >> > should not be changed even after your patches, the fast path is still >> > protocted by RCU read lock, while the slow path now is protected by >> > some smaller-scope locks. What makes cls_flower so unique that >> > it doesn't even need RCU here? tp->root is not reassigned but it is still >> > freed via RCU infra, that is in fl_destroy_sleepable(). >> > >> > Thanks. >> >> My cls API patch set introduced reference counting for tcf_proto >> structure. With that change tp->ops->destroy() (which calls fl_destroy() >> and fl_destroy_sleepable(), in case of flower classifier) is only called >> after last reference to tp is released. All slow path users of tp->ops >> must obtain reference to tp, so concurrent call to fl_destroy() is not >> possible. Before this change tcf_proto structure didn't have reference >> counting support and required users to obtain rtnl mutex before calling >> its ops callbacks. This was verified in flower by using rtnl_dereference >> to obtain tp->root. > > Yes, but fast path doesn't hold a refnct of tp, does it? If not, you still > rely on RCU for sync with readers. If yes, then probably RCU can be > gone. > > Now you are in a middle of the two, that is taking RCU read lock on > fast path without a refcnt, meanwhile still uses rcu_dereference on > slow paths without any lock. > > For me, you at least don't use the RCU API correctly here. > > Thanks. Yes, fast path still relies on RCU. What I meant is that slow path (cls API) now only calls tp ops after obtaining reference to tp, so there is no need to protect it from concurrent tp->ops->destroy() by means of rtnl or any other lock. I understand that using rcu_dereference_protected() is confusing in this case and will refactor this patch appropriately.