On Thu, Jun 4, 2015 at 1:20 AM, Simon Horman <simon.hor...@netronome.com> wrote: > Hi Dave, > > On Wed, Jun 03, 2015 at 11:38:29PM -0700, David Miller wrote: >> From: sfel...@gmail.com >> Date: Tue, 2 Jun 2015 20:43:28 -0700 >> >> > From: Scott Feldman <sfel...@gmail.com> >> > >> > v2: >> > >> > Changes based on review: >> > >> > - David Miller <da...@davemloft.net> raise problem with system_wq not >> > preserving queue order to execution order. To fix, use driver-private >> > ordered workqueue to preserve ordering of queued work. >> > >> > - Jiri Pirko <j...@resnulli.us> small change on kfree of work queue item. >> > >> > v1: >> > >> > In review of Simon's patchset "rocker: transaction fixes". it was noted >> > that rocker->neigh_tbl_next_index was unprotected in the call path below >> > and could race with other contexts calling rocker_port_ipv4_neigh(): >> >> How it rocker->neigh_tbl_next_index not protected? >> >> rocker->neigh_tbl_lock is _always_ held when it is accessed. >> >> This patch, therefore, looks like completely unnecessary complexity >> to me. Furthermore, I would completely prefer if the operation stays >> completely synchronous to the call path where the neigh operation >> occurs rather than throwing it out to a workqueue. > > What I was seeing is as follows: > > 1. rocker_port_ipv4_nh() is called via switchdev_port_obj_add() > with trans = SWITCHDEV_TRANS_PREPARE > > 2. rocker_port_ipv4_neigh() is called by rocker_neigh_update() > with trans = SWITCHDEV_TRANS_NONE. > > The call chain goes up to arp_process() via neigh_update(). > > 3. rocker_port_ipv4_nh() is called via switchdev_port_obj_add() > with trans = SWITCHDEV_TRANS_COMMIT > > #1 and #2 are guarded by rtl across those calls but > #2 is not guarded by rtnl. > > Inside both rocker_port_ipv4_nh() and rocker_port_ipv4_neigh() > neigh_tbl_lock lock is taken but it is not held across the > two calls to rocker_port_ipv4_nh within a single prepare->commit transaction. > > I can double check that the above still occurs, but I'm not aware of any > recent changes that would cause it not to occur any more.
Simon, just focusing on this rocker->neigh_tbl_next_index++ issue at the moment, I think the change below would make _rocker_neigh_add() safe to call without needing to put neigh_update into process context. It works because entry is stored between PREPARE and COMMIT, so once entry->index is assigned in PREPARE (under neigh_tbl_lock), it'll be re-used in COMMIT, even if the NONE thread also claims it's entry->index (also under neigh_tbl_lock). I know there are other issues you and Toshiaki Makita have pointed out, but let's see if we can peel the onion one layer at a time. --- a/drivers/net/ethernet/rocker/rocker.c +++ b/drivers/net/ethernet/rocker/rocker.c @@ -2904,10 +2904,10 @@ static void _rocker_neigh_add(struct rocker *rocker, enum switchdev_trans trans, struct rocker_neigh_tbl_entry *entry) { - entry->index = rocker->neigh_tbl_next_index; + if (trans != SWITCHDEV_TRANS_COMMIT) + entry->index = rocker->neigh_tbl_next_index++; if (trans == SWITCHDEV_TRANS_PREPARE) return; - rocker->neigh_tbl_next_index++; entry->ref_count++; hash_add(rocker->neigh_tbl, &entry->entry, be32_to_cpu(entry->ip_addr)); -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html