On Mon, Mar 16, 2015 at 06:53:35PM +0000, Mathieu Desnoyers wrote:
> > I'm not entirely awake atm but I'm not seeing why it would need to be
> > that strict; I think the current single MB on task switch is sufficient
> > because if we're in the middle of schedule, userspace isn't actually
> > running.
> >
> > So from the point of userspace the task switch is atomic. Therefore even
> > if we do not get a barrier before setting ->curr, the expedited thing
> > missing us doesn't matter as userspace cannot observe the difference.
>
> AFAIU, atomicity is not what matters here. It's more about memory ordering.
> What is guaranteeing that upon entry in kernel-space, all prior memory
> accesses (loads and stores) are ordered prior to following loads/stores ?
>
> The same applies when returning to user-space: what is guaranteeing that all
> prior loads/stores are ordered before the user-space loads/stores performed
> after returning to user-space ?
You're still one step ahead of me; why does this matter? Or to put it
another way: what can go wrong?

By virtue of being in schedule(), both tasks (prev and next) get an
effective MB from the task switch. So even if we see the 'wrong'
rq->curr, that CPU will still observe the MB by the time it gets to
userspace.

All of this is really only about userspace load/store ordering, and the
context switch already very much needs to guarantee userspace program
order in the face of context switches.

> > > In order to be able to dereference rq->curr->mm without holding the
> > > rq->lock, do you envision we should protect task reclaim with RCU-sched ?
> >
> > A recent discussion had Linus suggest SLAB_DESTROY_BY_RCU, although I
> > think Oleg did mention it would still be 'interesting'. I've not yet had
> > time to really think about that.
>
> This might be an "interesting" modification. :) This could perhaps come
> as an optimization later on ?

Not really; again, take:

	for (;;)
		sys_membar(EXPEDITED);

That'll generate horrendous rq lock contention; with or without the
PRIVATE thing it'll pound a number of rq locks real bad.

Typical scheduler syscalls only affect a single rq lock at a time -- the
one the task is on. This one potentially pounds all of them.