On Fri, May 31, 2019 at 10:11 AM Paul E. McKenney <paul...@linux.ibm.com> wrote:
>
> On Fri, May 31, 2019 at 08:45:47AM -0700, Eric Dumazet wrote:
> >
> >
> > On 5/31/19 7:45 AM, Herbert Xu wrote:
> > > On Fri, May 31, 2019 at 10:24:08AM +0200, Dmitry Vyukov wrote:
> > >>
> > >> OK, let's call it barrier. But we need more than a barrier here then.
> > >
> > > READ_ONCE/WRITE_ONCE is not some magical dust that you sprinkle
> > > around in your code to make it work without locks. You need to
> > > understand exactly why you need them and why the code would be
> > > buggy if you don't use them.
> > >
> > > In this case the code doesn't need them because an implicit
> > > barrier() (which is *stronger* than READ_ONCE/WRITE_ONCE) already
> > > exists in both places.
> > >
> >
> > More over, adding READ_ONCE() while not really needed prevents some
> > compiler optimizations.
> >
> > ( Not in this particular case, since fqdir->dead is read exactly once,
> > but we could have had a loop )
> >
> > I have already explained that the READ_ONCE() was a leftover of the
> > first version of the patch, that I refined later, adding correct (and
> > slightly more complex) RCU barriers and rules.
> >
> > Dmitry, the self-documentation argument is perfectly good, but Herbert
> > put much nicer ad hoc comments.
>
> I don't see all the code, but let me see if I understand based on the
> pieces that I do see...
>
> o       fqdir_exit() does a store-release to ->dead, then arranges
>         for fqdir_rwork_fn() to be called from workqueue context
>         after a grace period has elapsed.
>
> o       If inet_frag_kill() is invoked only from fqdir_rwork_fn(),
>         and if they are using the same fqdir, then inet_frag_kill()
>         would always see fqdir->dead==true.
>
>         But then it would not be necessary to check it, so this seems
>         unlikely
>

Nope, inet_frag_kill() can also be called from a timer handler, and
there is already an existing barrier (the spinlock) before we call it
(the call is also made under rcu_read_lock()):

    ip_expire(struct timer_list *t)

        rcu_read_lock();
        spin_lock(&qp->q.lock);
        ...
        ipq_kill(qp);   -> inet_frag_kill()

> o       If fqdir_exit() does store-releases to a number of ->dead
>         fields under rcu_read_lock(), and if the next fqdir_exit()
>         won't happen until after all the callbacks complete
>         (combination of flushing workqueues and rcu_barrier(), for
>         example), then ->dead would be stable when inet_frag_kill()
>         is invoked, and might be true or not. (This again requires
>         inet_frag_kill() be only invoked from fqdir_rwork_fn().)
>
> So I can imagine cases where this would in fact work. But did I get
> it right or is something else happening?
>
>                                                 Thanx, Paul
>
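
For anyone following the ->dead discussion without the patch at hand, here
is a rough userspace analogy of the store-release that Paul describes,
written with C11 atomics instead of the kernel primitives. The names
fqdir_sketch, fqdir_sketch_init(), fqdir_sketch_exit() and
fqdir_sketch_is_dead() are hypothetical and are not in the kernel tree. The
acquire load on the reader side is an assumption made only for this sketch:
in the timer path above, the ordering comes from the spinlock and the RCU
rules, not from an acquire load.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct fqdir; only the flag and one
 * ordinary field are modeled. */
struct fqdir_sketch {
	int high_thresh;   /* plain data written before the flag */
	atomic_bool dead;  /* set once, at teardown */
};

static void fqdir_sketch_init(struct fqdir_sketch *f, int thresh)
{
	f->high_thresh = thresh;
	atomic_init(&f->dead, false);
}

/* Teardown side: the release store orders the plain write to
 * high_thresh before the flag becomes visible. */
static void fqdir_sketch_exit(struct fqdir_sketch *f)
{
	f->high_thresh = 0;
	atomic_store_explicit(&f->dead, true, memory_order_release);
}

/* Reader side (assumed for the sketch): an acquire load that pairs
 * with the release store above. */
static bool fqdir_sketch_is_dead(struct fqdir_sketch *f)
{
	return atomic_load_explicit(&f->dead, memory_order_acquire);
}
```

A reader that sees fqdir_sketch_is_dead() return true is then also
guaranteed to see high_thresh == 0. That pairing is all the sketch shows;
it says nothing about grace periods or workqueue ordering.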