On 08/30, Peter Zijlstra wrote:
> On Tue, Aug 30, 2016 at 03:04:27PM +0200, Oleg Nesterov wrote:
> > On 08/30, Peter Zijlstra wrote:
> > >
> > >  /*
> > >   * Ensure we load p->on_rq _after_ p->state, otherwise it would
> > >   * be possible to, falsely, observe p->on_rq == 0 and get stuck
> > >   * in smp_cond_load_acquire() below.
> > >   *
> > >   * sched_ttwu_pending()                   try_to_wake_up()
> > >   *   [S] p->on_rq = 1;                      [L] P->state
> > >   *       UNLOCK rq->lock
> > >   *
> > >   * schedule()                               RMB
> > >   *       LOCK rq->lock
> > >   *       UNLOCK rq->lock
> > >   *
> > >   * [task p]
> > >   *   [S] p->state = UNINTERRUPTIBLE         [L] p->on_rq
> > >   *
> > >   * Pairs with the UNLOCK+LOCK on rq->lock from the
> > >   * last wakeup of our task and the schedule that got our task
> > >   * current.
> > >   */
> >
> > Confused... how this connects to UNLOCK+LOCK on rq->lock? A LOAD can
> > leak into the critical section.
>
> How so? That LOCK+UNLOCK which is leaky, UNLOCK+LOCK is a read/write
> barrier (just not an MB because it lacks full transitivity).

Ah, I have wrongly read the "Pairs with the UNLOCK+LOCK" as "Pairs with
the LOCK+UNLOCK", and didn't notice this even after I copy-and-pasted
this part.
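
To restate the diagram above in rough code form, as I now read it (just
the ordering skeleton, not the real scheduler code):

	/* CPU that did the last wakeup of p, then the schedule() that
	 * made p current, then runs p itself: */
	p->on_rq = 1;				/* sched_ttwu_pending(): [S] p->on_rq = 1 */
	raw_spin_unlock(&rq->lock);		/*   UNLOCK rq->lock */

	raw_spin_lock(&rq->lock);		/* schedule(): LOCK rq->lock */
	raw_spin_unlock(&rq->lock);		/*             UNLOCK rq->lock */
	p->state = TASK_UNINTERRUPTIBLE;	/* [task p]    [S] p->state */

	/* the waking CPU, in try_to_wake_up(): */
	state = READ_ONCE(p->state);		/* [L] p->state */
	smp_rmb();				/* RMB */
	on_rq = p->on_rq;			/* [L] p->on_rq */

IIUC, the UNLOCK+LOCK orders the two stores in the left column and the
rmb orders the two loads in the right column, so the waker can't see
the new ->state together with the stale ->on_rq == 0.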

> > But context switch should imply mb() we can rely on?
>
> Not sure it should, on x86 switch_mm does a CR3 write and that is
> serializing, but switch_to() doesn't need to do anything iirc.

Documentation/memory-barriers.txt says schedule() and similar imply
full memory barriers, and I (wrongly?) interpreted this as if this is
also true for two different threads. I mean, I thought that the
LOAD/STOREs done by some task can't be re-ordered with LOAD/STOREs
done by another task which was running on the same CPU. Wrong?
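
To make the question concrete (X and Y are made-up shared variables,
nothing to do with the scheduler, and I am assuming the reader orders
its own loads with an rmb):

	/* CPU 0: task A, right before it is scheduled out */
	X = 1;

	/* CPU 0: task B, scheduled in after A */
	Y = 1;

	/* CPU 1: some observer */
	if (READ_ONCE(Y) == 1) {
		smp_rmb();
		r = READ_ONCE(X);	/* I assumed r == 1 is guaranteed */
	}

IOW, I assumed the context switch from A to B on CPU 0 acts as a full
barrier, so CPU 1 can never observe Y == 1 but X == 0.

Oleg.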