On Sat, Oct 07, 2017 at 11:28:57AM -0700, Paul E. McKenney wrote:
> But if you are saying that it would be good to have wait_for_completion()
> and complete() directly modeled at some point, no argument. In addition,
> I hope that the memory model is applied to other tools that analyze kernel
> code.
> > I'm not sure I got the point across; so I'll try once more. Without
> > providing this ordering the completion would be fundamentally broken. It
> > _must_ provide this ordering.
>
> OK, I now understand what you are getting at, and I do very much like
> that guarantee.

Right, so maybe we should update the completion comments a bit to call
out this property, because I'm not sure it's there.

Also, with this, I think the smp_store_release() / smp_load_acquire() is
a perfectly fine abstraction of it; I don't think the model needs to be
taught about the completion interface.

> > Why not? In what sort of cases does it go wobbly?
>
> For one, when it conflicts with maintainability. For example, it would
> probably be OK for some of RCU's rcu_node ->lock acquisitions to skip the
> smp_mb__after_unlock_lock() invocations. But those are slowpaths, and the
> small speedup on only one architecture is just not worth the added pain.
> Especially given the nice wrapper functions that you provided.
>
> But of course if this were instead (say) rcu_read_lock() or common-case
> rcu_read_unlock(), I would be willing to undergo much more pain. On the
> other hand, for that exact reason, that common-case code path doesn't
> acquire locks in the first place. ;-)

Ah, so for models I would go absolutely minimal; it helps to understand
what the strict requirements are and where we over-provide, etc.

For actual code you're entirely right, there's no point in trying to be
cute with the rcu-node locks. Simplicity rules, etc.
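
To make the smp_store_release() / smp_load_acquire() abstraction from
above concrete, here is a rough userspace sketch. It is an analogue
only, not kernel code: C11 release/acquire atomics and pthreads stand in
for the kernel primitives, and model_complete(),
model_wait_for_completion(), producer() and run_demo() are hypothetical
names made up for the illustration. The real wait_for_completion()
sleeps rather than spins.

```c
/* Userspace analogue only; none of these helpers exist in the kernel. */
#include <pthread.h>
#include <stdatomic.h>

static int payload;      /* plain data written before the completion fires */
static atomic_int done;  /* stands in for the completion's "done" state */

/* Hypothetical stand-in for complete(): a release store. */
static void model_complete(void)
{
	atomic_store_explicit(&done, 1, memory_order_release);
}

/* Hypothetical stand-in for wait_for_completion(): spin on an acquire load. */
static void model_wait_for_completion(void)
{
	while (!atomic_load_explicit(&done, memory_order_acquire))
		;
}

static void *producer(void *arg)
{
	(void)arg;
	payload = 42;      /* ordered before the release store below */
	model_complete();
	return NULL;
}

/* Returns the payload value seen by the waiter, or -1 on thread failure. */
int run_demo(void)
{
	pthread_t t;
	int seen;

	payload = 0;
	atomic_store_explicit(&done, 0, memory_order_relaxed);
	if (pthread_create(&t, NULL, producer, NULL))
		return -1;

	model_wait_for_completion();
	seen = payload;    /* ordered after the acquire load that saw done == 1 */

	pthread_join(t, NULL);
	return seen;
}
```

Once the waiter observes done == 1 through the acquire load, the write
to payload that preceded the release store must be visible to it, which
is the ordering the completion has to provide.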