On Tue 05 Mar 2019 at 01:57, Stefano Brivio <sbri...@redhat.com> wrote: > On Wed, 27 Feb 2019 12:12:14 +0200 > Vlad Buslov <vla...@mellanox.com> wrote: > >> Currently, all netlink protocol handlers for updating rules, actions and >> qdiscs are protected with single global rtnl lock which removes any >> possibility for parallelism. This patch set is a third step to remove >> rtnl lock dependency from TC rules update path. >> >> Recently, new rtnl registration flag RTNL_FLAG_DOIT_UNLOCKED was added. >> TC rule update handlers (RTM_NEWTFILTER, RTM_DELTFILTER, etc.) are >> already registered with this flag and only take rtnl lock when qdisc or >> classifier requires it. Classifiers can indicate that their ops >> callbacks don't require caller to hold rtnl lock by setting the >> TCF_PROTO_OPS_DOIT_UNLOCKED flag. The goal of this change is to refactor >> flower classifier to support unlocked execution and register it with >> unlocked flag. >> >> This patch set implements following changes to make flower classifier >> concurrency-safe: >> >> - Implement reference counting for individual filters. Change fl_get to >> take reference to filter. Implement tp->ops->put callback that was >> introduced in cls API patch set to release reference to flower filter. >> >> - Use tp->lock spinlock to protect internal classifier data structures >> from concurrent modification. >> >> - Handle concurrent tcf proto deletion by returning EAGAIN, which will >> cause cls API to retry and create new proto instance or return error >> to the user (depending on message type). >> >> - Handle concurrent insertion of filter with same priority and handle by >> returning EAGAIN, which will cause cls API to lookup filter again and >> process it accordingly to netlink message flags. >> >> - Extend flower mask with reference counting and protect masks list with >> masks_lock spinlock. >> >> - Prevent concurrent mask insertion by inserting temporary value to >> masks hash table. This is necessary because mask initialization is a >> sleeping operation and cannot be done while holding tp->lock. >> >> Both chain level and classifier level conflicts are resolved by >> returning -EAGAIN to cls API that results restart of whole operation. >> This retry mechanism is a result of fine-grained locking approach used >> in this and previous changes in series and is necessary to allow >> concurrent updates on same chain instance. Alternative approach would be >> to lock the whole chain while updating filters on any of child tp's, >> adding and removing classifier instances from the chain. However, since >> most CPU-intensive parts of filter update code are specifically in >> classifier code and its dependencies (extensions and hw offloads), such >> approach would negate most of the gains introduced by this change and >> previous changes in the series when updating same chain instance. >> >> Tcf hw offloads API is not changed by this patch set and still requires >> caller to hold rtnl lock. Refactored flower classifier tracks rtnl lock >> state by means of 'rtnl_held' flag provided by cls API and obtains the >> lock before calling hw offloads. Following patch set will lift this >> restriction and refactor cls hw offloads API to support unlocked >> execution. >> >> With these changes flower classifier is safely registered with >> TCF_PROTO_OPS_DOIT_UNLOCKED flag in last patch. >> >> Changes from V1 to V2: >> - Extend cover letter with explanation about retry mechanism. >> - Rebase on current net-next. >> - Patch 1: >> - Use rcu_dereference_raw() for tp->root dereference. >> - Update comment in fl_head_dereference(). >> - Patch 2: >> - Remove redundant check in fl_change error handling code. >> - Add empty line between error check and new handle assignment. >> - Patch 3: >> - Refactor loop in fl_get_next_filter() to improve readability. >> - Patch 4: >> - Refactor __fl_delete() to improve readability. >> - Patch 6: >> - Fix comment in fl_check_assign_mask(). >> - Patch 9: >> - Extend commit message. >> - Fix error code in comment. >> - Patch 11: >> - Fix fl_hw_replace_filter() to always release rtnl lock in error >> handlers. >> - Patch 12: >> - Don't take rtnl lock before calling __fl_destroy_filter() in >> workqueue context. >> - Extend commit message with explanation why flower still takes rtnl >> lock before calling hardware offloads API. > > FWIW, > > Reviewed-by: Stefano Brivio <sbri...@redhat.com>
Thanks for reviewing this! Regards, Vlad