----- Original Message -----
> From: "Peter Zijlstra" <pet...@infradead.org>
> To: "Mathieu Desnoyers" <mathieu.desnoy...@efficios.com>
> Cc: linux-kernel@vger.kernel.org, "KOSAKI Motohiro"
> <kosaki.motoh...@jp.fujitsu.com>, "Steven Rostedt"
> <rost...@goodmis.org>, "Paul E. McKenney" <paul...@linux.vnet.ibm.com>,
> "Nicholas Miell" <nmi...@comcast.net>,
> "Linus Torvalds" <torva...@linux-foundation.org>, "Ingo Molnar"
> <mi...@redhat.com>, "Alan Cox"
> <gno...@lxorguk.ukuu.org.uk>, "Lai Jiangshan" <la...@cn.fujitsu.com>,
> "Stephen Hemminger"
> <step...@networkplumber.org>, "Andrew Morton" <a...@linux-foundation.org>,
> "Josh Triplett" <j...@joshtriplett.org>,
> "Thomas Gleixner" <t...@linutronix.de>, "David Howells"
> <dhowe...@redhat.com>, "Nick Piggin" <npig...@kernel.dk>
> Sent: Monday, March 16, 2015 1:21:04 PM
> Subject: Re: [RFC PATCH] sys_membarrier(): system/process-wide memory barrier
> (x86) (v12)
>
> On Mon, Mar 16, 2015 at 03:43:56PM +0000, Mathieu Desnoyers wrote:
> > > On which; I absolutely hate that rq->lock thing in there. What is
> > > 'wrong' with doing a lockless compare there? Other than not actually
> > > being able to deref rq->curr of course, but we need to fix that anyhow.
> >
> > If we can make sure rq->curr deref could be done without holding the rq
> > lock, then I think all we would need is to ensure that updates to rq->curr
> > are surrounded by memory barriers. Therefore, we would have the following:
> >
> > * When a thread is scheduled out, a memory barrier would be issued before
> > rq->curr is updated to the next thread task_struct.
> >
> > * Before a thread is scheduled in, a memory barrier needs to be issued
> > after rq->curr is updated to the incoming thread.
>
> I'm not entirely awake atm but I'm not seeing why it would need to be
> that strict; I think the current single MB on task switch is sufficient
> because if we're in the middle of schedule, userspace isn't actually
> running.
>
> So from the point of userspace the task switch is atomic. Therefore even
> if we do not get a barrier before setting ->curr, the expedited thing
> missing us doesn't matter as userspace cannot observe the difference.
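
As a rough illustration of the two bullets quoted above (a barrier before and
a barrier after the rq->curr update), here is a small user-space model. It is
only a sketch under stated assumptions: C11 fences stand in for kernel memory
barriers, and model_task, model_rq, model_switch_curr() and model_curr() are
hypothetical names made up for this example, not kernel code.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-ins for task_struct and the per-CPU runqueue. */
struct model_task {
	int pid;
};

struct model_rq {
	_Atomic(struct model_task *) curr;
};

/* Model of a task switch with the rq->curr update surrounded by barriers. */
static void model_switch_curr(struct model_rq *rq, struct model_task *next)
{
	/* Orders the outgoing thread's accesses before the curr update. */
	atomic_thread_fence(memory_order_seq_cst);

	atomic_store_explicit(&rq->curr, next, memory_order_relaxed);

	/* Orders the curr update before the incoming thread's accesses. */
	atomic_thread_fence(memory_order_seq_cst);
}

/* Lockless read of curr, as a remote "compare" would do in this model. */
static struct model_task *model_curr(struct model_rq *rq)
{
	return atomic_load_explicit(&rq->curr, memory_order_relaxed);
}
```

With both fences in place, a remote reader that observes either the old or
the new curr value can rely on a full barrier having been issued on the
corresponding side of the switch, which is the property the bullets ask for.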

AFAIU, atomicity is not what matters here. It's more about memory ordering.
What guarantees that, upon entry into kernel space, all prior memory
accesses (loads and stores) are ordered before the loads/stores that follow?

The same applies when returning to user space: what guarantees that all
prior loads/stores are ordered before the user-space loads/stores performed
after the return?

>
> > In order to be able to dereference rq->curr->mm without holding the
> > rq->lock, do you envision we should protect task reclaim with RCU-sched ?
>
> A recent discussion had Linus suggest SLAB_DESTROY_BY_RCU, although I
> think Oleg did mention it would still be 'interesting'. I've not yet had
> time to really think about that.

This might be an "interesting" modification. :) This could perhaps come
as an optimization later on?

By the way, I now remember why we start from the mm_cpumask and then
double-check the mm: the mm_cpumask serves as an approximation of the
CPUs we need to check. Therefore, rather than grabbing the rq lock for
all CPUs, we only need to grab it for the CPUs that are in the mm_cpumask.

Thanks,

Mathieu

--
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com
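
The mm_cpumask filtering described above can be sketched as a small
user-space model. This is an illustration under assumptions, not the patch
itself: model_mm, model_rq, MODEL_NR_CPUS and model_cpus_to_ipi() are
hypothetical names, the mask is a plain 32-bit integer, and taking the rq
lock is modelled by a counter so the saving is visible.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_NR_CPUS 8

/* Hypothetical stand-in for mm_struct. */
struct model_mm {
	int id;
};

/* Hypothetical stand-in for a per-CPU runqueue. */
struct model_rq {
	const struct model_mm *curr_mm;	/* models rq->curr->mm */
	unsigned int lock_count;	/* times the modelled rq lock was taken */
};

/*
 * Return a bitmask of the CPUs currently running a thread of `mm`.
 * Only CPUs set in `mm_cpumask` are examined; the others are skipped
 * without touching their (modelled) rq lock.
 */
static uint32_t model_cpus_to_ipi(struct model_rq *rqs, uint32_t mm_cpumask,
				  const struct model_mm *mm)
{
	uint32_t targets = 0;

	for (int cpu = 0; cpu < MODEL_NR_CPUS; cpu++) {
		if (!(mm_cpumask & (UINT32_C(1) << cpu)))
			continue;	/* not in the approximation */

		rqs[cpu].lock_count++;	/* models taking rq->lock */
		if (rqs[cpu].curr_mm == mm)	/* the double-check */
			targets |= UINT32_C(1) << cpu;
		/* the modelled rq->lock would be released here */
	}
	return targets;
}
```

In this model the mask narrows the set of runqueues whose lock is taken,
and the mm comparison then filters out CPUs that are in the mask but are
currently running a thread of another process.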