On Thu, Sep 01, 2016 at 04:30:28PM -0700, Tom Herbert wrote:
[...]
> > Yep, but this is an unlikely condition and the critical code here is
> > much smaller and it is more clear that the rcu_read_lock here meant to
> > protect the ring->xdp_prog under this small xdp critical section in
> > comparison to your patch where it is held across the whole RX
> > function.
>
> Note that there is already an rcu_read_lock potentially per packet
> buried in the function, if the whole function is under rcu_read_lock
> then that can be removed.

Yes I was aware of that, I had left it as-is since:
1. it seemed to be in an exception path and less performance sensitive
   to nested calls, and
2. in case some future developer removed the top-level rcu_read_lock,
   the finer-grained one would have been unprotected if not code
   reviewed carefully.
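
For illustration only, the nesting described in the two points above can
be sketched in userspace roughly as follows. This is not the driver code:
rcu_read_lock()/rcu_read_unlock() here are hypothetical stand-ins that
only count nesting depth, and struct ring, rx_exception_path() and
rx_poll() are made-up names for the sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel primitives: they only track
 * read-side nesting depth so the nesting can be observed. */
static int rcu_depth;      /* current nesting depth */
static int rcu_max_depth;  /* deepest nesting seen so far */

static void rcu_read_lock(void)
{
	if (++rcu_depth > rcu_max_depth)
		rcu_max_depth = rcu_depth;
}

static void rcu_read_unlock(void)
{
	assert(rcu_depth > 0);
	rcu_depth--;
}

/* Hypothetical ring; xdp_prog stands in for the RCU-protected pointer. */
struct ring {
	const int *xdp_prog;
};

/* Exception path that keeps its own finer-grained lock, so it stays
 * protected even if a caller's top-level lock is removed later. */
static int rx_exception_path(const struct ring *ring)
{
	int ret;

	rcu_read_lock();
	ret = ring->xdp_prog ? *ring->xdp_prog : -1;
	rcu_read_unlock();
	return ret;
}

/* RX loop holding the lock across the whole function; the call into
 * the exception path nests a second read-side section inside it. */
static int rx_poll(const struct ring *ring, int budget)
{
	int done;

	rcu_read_lock();
	for (done = 0; done < budget; done++) {
		/* pretend only the first packet hits the exception path */
		if (done == 0)
			rx_exception_path(ring);
	}
	rcu_read_unlock();
	return done;
}
```

Calling rx_exception_path() on its own reaches depth 1, while calling it
via rx_poll() reaches depth 2. The nested pair is redundant under the
top-level lock but harmless, which is the trade-off in point 2.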

I'll instead add a note at the top pointing out the dual need for the
lock, to address both your and Saeed's comments.

As a side note, when considering the idea of moving the rcu_read_lock to
a more generic location (napi), I had toyed with the idea of benchmarking
to see if removing the actually-fast-path use of rcu_read_lock in
netif_receive_skb_internal could have any performance benefit for the
universal use case (non-xdp). However, that seems completely out of scope
at the moment, and only beneficial for non-standard (IMO) .configs,
besides being much harder to review. It was showing up in perf at about
1-2% overhead in preempt=y kernels.

>
> Tom