On Thu, Feb 28, 2019 at 6:53 AM Vlad Buslov <vla...@mellanox.com> wrote:
>
>
> On Wed 27 Feb 2019 at 23:03, Cong Wang <xiyou.wangc...@gmail.com> wrote:
> > On Tue, Feb 26, 2019 at 8:10 AM Vlad Buslov <vla...@mellanox.com> wrote:
> >>
> >>
> >> On Tue 26 Feb 2019 at 00:15, Cong Wang <xiyou.wangc...@gmail.com> wrote:
> >> > On Mon, Feb 25, 2019 at 7:45 AM Vlad Buslov <vla...@mellanox.com> wrote:
> >> >>
> >> >> Function tc_dump_chain() obtains and releases block->lock on each
> >> >> iteration of its inner loop that dumps all chains on the block.
> >> >> Outputting chain template info is a fast operation, so locking and
> >> >> unlocking the mutex multiple times is overhead when the lock is
> >> >> highly contended. Modify tc_dump_chain() to only obtain block->lock
> >> >> once and dump all chains without releasing it.
> >> >>
> >> >> Signed-off-by: Vlad Buslov <vla...@mellanox.com>
> >> >> Suggested-by: Cong Wang <xiyou.wangc...@gmail.com>
> >> >
> >> > Thanks for the followup!
> >> >
> >> > Isn't it similar for __tcf_get_next_proto() in tcf_chain_dump()?
> >> > And for tc_dump_tfilter()?
> >>
> >> Not really. These two dump all tp filters and not just a template, which
> >> is O(n) on the number of filters and can be slow because it calls the hw
> >> offload API for each of them. Our typical use case involves a periodic
> >> filter dump (to update stats) while multiple concurrent user-space
> >> threads are updating filters, so it is important for them to be able to
> >> execute in parallel.
> >
> > Hmm, but if these are read-only, you probably don't even need a
> > mutex, you can just use the RCU read lock to protect list iteration
> > and you can still grab the refcnt in the same way.
>
> That is how it worked in my initial implementation. However, it doesn't
> work with hw offloads because driver callbacks can sleep.
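
[For reference, a simplified sketch of the locking change described in the
quoted commit message. It is not the verbatim patch: fill_chain_template()
is a hypothetical stand-in for the netlink fill of a chain's template,
which is assumed to be fast and non-sleeping.]

#include <linux/list.h>
#include <linux/mutex.h>

/* Sketch only, not the actual net/sched/cls_api.c code. The old loop
 * took and dropped block->lock on every iteration (via the get-next
 * helper); since filling a chain template is cheap and cannot sleep,
 * the whole walk can instead run under a single lock/unlock pair.
 */
static void dump_all_chain_templates(struct tcf_block *block,
				     struct sk_buff *skb)
{
	struct tcf_chain *chain;

	mutex_lock(&block->lock);
	list_for_each_entry(chain, &block->chain_list, list)
		fill_chain_template(chain, skb);	/* hypothetical helper */
	mutex_unlock(&block->lock);
}
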
Hmm? You drop the RCU read lock after grabbing the refcnt, right? If so,
what's the problem with sleeping?
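
Roughly this pattern (a sketch, not the in-tree code: chain_get_next_rcu()
is a made-up helper, and it assumes chain->refcnt is an atomic refcount_t
that can be taken without block->lock, and that a held reference keeps the
chain linked in block->chain_list so the walk can resume from it):

#include <linux/rculist.h>
#include <linux/refcount.h>

/* Sketch of the RCU + refcnt pattern, not the in-tree code. The RCU
 * read lock only covers the list walk; once a reference is taken the
 * lock is dropped, so the caller is free to sleep (e.g. in driver
 * offload callbacks) and releases the chain when done.
 */
static struct tcf_chain *chain_get_next_rcu(struct tcf_block *block,
					    struct tcf_chain *prev)
{
	struct tcf_chain *chain;

	rcu_read_lock();
	if (prev)
		chain = list_next_or_null_rcu(&block->chain_list, &prev->list,
					      struct tcf_chain, list);
	else
		chain = list_first_or_null_rcu(&block->chain_list,
					       struct tcf_chain, list);

	/* Skip chains that are already on their way to being destroyed. */
	while (chain && !refcount_inc_not_zero(&chain->refcnt))
		chain = list_next_or_null_rcu(&block->chain_list, &chain->list,
					      struct tcf_chain, list);
	rcu_read_unlock();

	return chain;	/* caller may sleep, then drops the reference */
}

The caller's loop would put the previous chain only after fetching the next
one, the same way the existing get-next helpers do.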