On Tue, 29 Mar 2016, Peter Zijlstra wrote:
> In any case; the below (completely irrelevant patch for you) is
> something I would propose. It gives hb_waiter_dec() RELEASE-like
> semantics and ensures it cannot creep into the lock sections it's
> typically behind. Although strictly speaking I think it being inside
> that lock region is sufficient.
Indeed, it should be sufficient. Racing with waiter decrements (leaking into the hb critical region) is perfectly ok: a reader thread that does not see the 1->0 dec will simply go and take the lock, so no harm is done. That is something the 0->1 increment cannot afford to rely on, obviously. So I think we can save the extra barrier for the release semantics and keep the call relaxed, as the performance penalty of the barrier would be higher. Or are you referring to something else?
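To make the asymmetry concrete, here is a rough userspace sketch in C11 atomics. It is not the kernel code: struct hb_sketch and the hb_sketch_*() helpers are made-up names for illustration, and memory_order_seq_cst only approximates the full barrier the kernel pairs with the increment.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a futex hash bucket: only the waiter count. */
struct hb_sketch {
	atomic_int waiters;
};

/*
 * 0->1 side: a waker that misses this increment would skip the wakeup,
 * so the increment is given strong ordering (seq_cst here, assumed to
 * stand in for a full barrier after the atomic op).
 */
static inline void hb_sketch_waiters_inc(struct hb_sketch *hb)
{
	atomic_fetch_add_explicit(&hb->waiters, 1, memory_order_seq_cst);
}

/*
 * 1->0 side: a waker that still sees the old nonzero count merely takes
 * the bucket lock and finds nothing to wake, so relaxed ordering is
 * enough and no extra barrier is paid.
 */
static inline void hb_sketch_waiters_dec(struct hb_sketch *hb)
{
	atomic_fetch_sub_explicit(&hb->waiters, 1, memory_order_relaxed);
}

/* Waker-side check: is there possibly someone queued on this bucket? */
static inline bool hb_sketch_waiters_pending(struct hb_sketch *hb)
{
	return atomic_load_explicit(&hb->waiters, memory_order_relaxed) != 0;
}
```

A stale nonzero read in hb_sketch_waiters_pending() only costs a needless lock acquisition, which is why the decrement can stay relaxed in this sketch. The ordering the waker needs before its check is left out here.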
> It also re-orders the increment in requeue to happen before we add to
> the list (as is the proper order).
Ack to this part -- should be a separate patch.

Thanks,
Davidlohr