On Mon, Aug 29, 2016 at 8:55 AM, Brenden Blanco <bbla...@plumgrid.com> wrote:
> On Mon, Aug 29, 2016 at 05:59:26PM +0300, Tariq Toukan wrote:
>> Hi Brenden,
>>
>> The solution direction should be XDP specific that does not hurt the
>> regular flow.
> An rcu_read_lock is _already_ taken for _every_ packet. This is 1/64th of
> that.
>>
>> On 26/08/2016 11:38 PM, Brenden Blanco wrote:
>> >Depending on the preempt mode, the bpf_prog stored in xdp_prog may be
>> >freed despite the use of call_rcu inside bpf_prog_put. The situation is
>> >possible when running in PREEMPT_RCU=y mode, for instance, since the rcu
>> >callback for destroying the bpf prog can run even during the bh handling
>> >in the mlx4 rx path.
>> >
>> >Several options were considered before this patch was settled on:
>> >
>> >Add a napi_synchronize loop in mlx4_xdp_set, which would occur after all
>> >of the rings are updated with the new program.
>> >This approach has the disadvantage that as the number of rings
>> >increases, the speed of update will slow down significantly due to
>> >napi_synchronize's msleep(1).
>> I prefer this option as it doesn't hurt the data path. A delay in a
>> control command can be tolerated.
>> >Add a new rcu_head in bpf_prog_aux, to be used by a new bpf_prog_put_bh.
>> >The action of the bpf_prog_put_bh would be to then call bpf_prog_put
>> >later. Those drivers that consume a bpf prog in a bh context (like mlx4)
>> >would then use the bpf_prog_put_bh instead when the ring is up. This has
>> >the problem of complexity, in maintaining proper refcnts and rcu lists,
>> >and would likely be harder to review. In addition, this approach to
>> >freeing must be exclusive with other frees of the bpf prog, for instance
>> >a _bh prog must not be referenced from a prog array that is consumed by
>> >a non-_bh prog.
>> >
>> >The placement of rcu_read_lock in this patch is functionally the same as
>> >putting an rcu_read_lock in napi_poll. Actually doing so could be a
>> >potentially controversial change, but would bring the implementation in
>> >line with sk_busy_loop (though of course the nature of those two paths
>> >is substantially different), and would also avoid future copy/paste
>> >problems with future supporters of XDP. Still, this patch does not take
>> >that opinionated option.
>> So you decided to add a lock for all non-XDP flows, which are 99% of
>> the cases.
>> We should avoid this.
> The whole point of rcu_read_lock architecture is to be taken in the fast
> path. There won't be a performance impact from this patch.
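For concreteness, the reader-side placement being debated looks roughly
like the sketch below. All names here (my_ring, my_rx_poll,
my_fill_xdp_buff) are illustrative, not the actual mlx4 code, and a real
driver does far more per packet; the only point is where the
rcu_read_lock sits relative to the program pointer.

#include <linux/filter.h>
#include <linux/rcupdate.h>

struct my_ring {
        struct bpf_prog __rcu *xdp_prog; /* attached XDP program, may be NULL */
        /* ... a real rx ring carries much more state ... */
};

/* Hypothetical stub standing in for the driver's descriptor handling:
 * maps the next received frame into an xdp_buff, returns false when the
 * ring is empty.
 */
static bool my_fill_xdp_buff(struct my_ring *ring, struct xdp_buff *xdp)
{
        return false; /* the real driver fills xdp from an rx descriptor */
}

static int my_rx_poll(struct my_ring *ring, int budget)
{
        struct bpf_prog *prog;
        int done = 0;

        /* One read-side critical section per napi budget, not per packet.
         * bpf_prog_put() frees the program via call_rcu, so it cannot be
         * freed while we are inside this section, even with PREEMPT_RCU=y.
         */
        rcu_read_lock();
        prog = rcu_dereference(ring->xdp_prog);

        while (done < budget) {
                struct xdp_buff xdp;

                if (!my_fill_xdp_buff(ring, &xdp))
                        break;

                if (prog && bpf_prog_run_xdp(prog, &xdp) != XDP_PASS) {
                        /* XDP_DROP / XDP_TX handling goes here */
                }
                done++;
        }

        rcu_read_unlock();
        return done;
}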
+1, this is nothing at all like a spinlock; it really should be just
like any other RCU-protected access.

Brenden, tracking down how the structure is freed took a few steps, so
please make sure the RCU requirements are well documented. Also, I'm
still not a fan of using xchg to set the program; it seems a lock could
be used in that path.

Thanks,
Tom
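To make the control-path side concrete, here is a minimal sketch of the
two options being contrasted. my_priv, state_lock and the function
names are illustrative assumptions, not the actual mlx4 structures; the
real driver swaps a per-ring pointer for every rx ring under its own
locking.

#include <linux/atomic.h>
#include <linux/filter.h>
#include <linux/mutex.h>

struct my_priv {
        struct mutex state_lock;   /* existing control-path lock (illustrative) */
        struct bpf_prog *xdp_prog; /* read under rcu_read_lock() in the rx path;
                                    * __rcu sparse annotation omitted for brevity */
};

/* What the driver does today: swap the pointer atomically with xchg(). */
static void my_xdp_set_xchg(struct my_priv *priv, struct bpf_prog *new_prog)
{
        struct bpf_prog *old;

        old = xchg(&priv->xdp_prog, new_prog);
        if (old)
                bpf_prog_put(old); /* frees via call_rcu once the refcount drops */
}

/* The alternative raised above: publish the new program under the
 * existing control-path lock instead of relying on xchg.
 */
static void my_xdp_set_locked(struct my_priv *priv, struct bpf_prog *new_prog)
{
        struct bpf_prog *old;

        mutex_lock(&priv->state_lock);
        old = priv->xdp_prog;
        WRITE_ONCE(priv->xdp_prog, new_prog); /* rcu_assign_pointer() if __rcu */
        mutex_unlock(&priv->state_lock);

        if (old)
                bpf_prog_put(old);
}

Either way, the old program must only be released through bpf_prog_put(),
whose call_rcu-based free is what makes the rcu_read_lock in the rx path
sufficient to protect readers.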