On Sat 02 Mar 2019 at 01:17, Cong Wang <xiyou.wangc...@gmail.com> wrote:
> On Thu, Feb 28, 2019 at 7:49 AM Vlad Buslov <vla...@mellanox.com> wrote:
>>
>> However, if we revert NULL fixes dump will print general tp information
>> for same tp twice (once correctly before dumping all filters on the tp, and
>> second time when called for bogus NULL filter), like this:
>>
>> filter protocol ip pref 1 flower chain 0
>> filter protocol ip pref 1 flower chain 0
>
> What I don't understand is how could we dump twice here,
> particularly the second dump. You insert a bogus NULL filter
> in case for retry? I thought you do the same like for tc actions,
> where you insert an error pointer which can't be dumped.
This is not related in any way to retry. Before my changes, creating a new tp and inserting the first filter on it was atomic: filter insertion either succeeded, or the tp was deleted before releasing the rtnl lock in case of failure. Now, in the case of unlocked classifiers, dump code can see a tp before it has its first filter inserted.

Dump prints all filters on a tp with ops->walk(), which calls arg->fn() (assigned to tcf_node_dump() in this case) on every filter. Some buggy classifiers call arg->fn() with a NULL filter pointer if their ops->walk() is called before any filters are inserted. This behavior caused problems in my tcf_proto_is_empty() function, but all ops->walk() users can be similarly affected when not protected by the rtnl lock.

This is not really an issue at the moment because the affected classifiers (matchall and cgroup) are not marked as "unlocked", so the cls API takes the rtnl lock before accessing them, but I would prefer to keep ops->walk() behavior unified to prevent any further "surprises".
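To illustrate the failure mode, here is a minimal userspace sketch. It is not the kernel code: struct walker, struct proto, walk_buggy(), walk_fixed() and proto_is_empty() are hypothetical, simplified stand-ins for struct tcf_walker, struct tcf_proto, a classifier's ops->walk() and tcf_proto_is_empty(). It assumes a matchall-like classifier that holds at most one filter.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct tcf_walker: a callback plus counters. */
struct walker {
	int (*fn)(void *filter, struct walker *arg);
	int count;       /* how many times fn() was invoked */
	int null_calls;  /* how many of those calls got a NULL filter */
};

/* Hypothetical stand-in for struct tcf_proto with a single filter slot
 * that stays NULL until the first filter is inserted. */
struct proto {
	void *head;
};

/* Callback that only counts invocations, as an emptiness check would. */
static int count_fn(void *filter, struct walker *arg)
{
	arg->count++;
	if (filter == NULL)
		arg->null_calls++;
	return 0;
}

/* Buggy walk: invokes the callback even when no filter exists yet. */
static void walk_buggy(struct proto *tp, struct walker *arg)
{
	arg->fn(tp->head, arg);
}

/* Fixed walk: skips the callback when there is nothing to visit. */
static void walk_fixed(struct proto *tp, struct walker *arg)
{
	if (tp->head == NULL)
		return;
	arg->fn(tp->head, arg);
}

/* Walker-based emptiness check: empty means the callback never ran. */
static int proto_is_empty(struct proto *tp,
			  void (*walk)(struct proto *, struct walker *))
{
	struct walker w = { .fn = count_fn, .count = 0, .null_calls = 0 };

	walk(tp, &w);
	return w.count == 0;
}
```

With an empty struct proto, proto_is_empty() returns 1 when given walk_fixed() and 0 when given walk_buggy(): the bogus NULL call makes an empty tp look populated. A dump callback would be invoked once with NULL in the same way.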