----- Original Message -----
> From: "Peter Zijlstra" <[email protected]>
> To: "Mathieu Desnoyers" <[email protected]>
> Cc: [email protected], "KOSAKI Motohiro" <[email protected]>,
>     "Steven Rostedt" <[email protected]>, "Paul E. McKenney" <[email protected]>,
>     "Nicholas Miell" <[email protected]>, "Linus Torvalds" <[email protected]>,
>     "Ingo Molnar" <[email protected]>, "Alan Cox" <[email protected]>,
>     "Lai Jiangshan" <[email protected]>, "Stephen Hemminger" <[email protected]>,
>     "Andrew Morton" <[email protected]>, "Josh Triplett" <[email protected]>,
>     "Thomas Gleixner" <[email protected]>, "David Howells" <[email protected]>,
>     "Nick Piggin" <[email protected]>
> Sent: Monday, March 16, 2015 1:21:04 PM
> Subject: Re: [RFC PATCH] sys_membarrier(): system/process-wide memory barrier
>     (x86) (v12)
>
> On Mon, Mar 16, 2015 at 03:43:56PM +0000, Mathieu Desnoyers wrote:
> > > On which; I absolutely hate that rq->lock thing in there. What is
> > > 'wrong' with doing a lockless compare there? Other than not actually
> > > being able to deref rq->curr of course, but we need to fix that anyhow.
> >
> > If we can make sure rq->curr deref could be done without holding the rq
> > lock, then I think all we would need is to ensure that updates to rq->curr
> > are surrounded by memory barriers. Therefore, we would have the following:
> >
> > * When a thread is scheduled out, a memory barrier would be issued before
> >   rq->curr is updated to the next thread task_struct.
> >
> > * Before a thread is scheduled in, a memory barrier needs to be issued
> >   after rq->curr is updated to the incoming thread.
>
> I'm not entirely awake atm but I'm not seeing why it would need to be
> that strict; I think the current single MB on task switch is sufficient
> because if we're in the middle of schedule, userspace isn't actually
> running.
>
> So from the point of userspace the task switch is atomic. Therefore even
> if we do not get a barrier before setting ->curr, the expedited thing
> missing us doesn't matter as userspace cannot observe the difference.

AFAIU, atomicity is not what matters here; it's memory ordering. What
guarantees that, upon entry into kernel space, all prior memory accesses
(loads and stores) are ordered before the following loads and stores? The
same applies when returning to user space: what guarantees that all prior
loads and stores are ordered before the user-space loads and stores
performed after the return?
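To make that concrete, here is a rough sketch of where the two barriers
from my proposal quoted above would sit. This is simplified pseudocode,
not the actual scheduler code: __schedule_sketch() is a made-up name, and
the real __schedule()/context_switch() path is of course far more
involved.

static void __schedule_sketch(struct rq *rq, struct task_struct *prev,
			      struct task_struct *next)
{
	/*
	 * Order prev's user-space loads/stores before the rq->curr
	 * update, so a sys_membarrier() that observes rq->curr != prev
	 * and therefore skips this CPU cannot miss accesses that prev
	 * performed in user space.
	 */
	smp_mb();

	rq->curr = next;

	/*
	 * Order the rq->curr update before next's user-space
	 * loads/stores, pairing with the barrier issued (or implied by
	 * the IPI) on the sys_membarrier() side.
	 */
	smp_mb();

	context_switch(rq, prev, next);	/* eventually runs next */
}

With those two barriers in place, a sys_membarrier() that reads rq->curr
locklessly and skips CPUs whose current task does not belong to the
calling process would still provide the required ordering for threads
scheduled out or in around that read.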
> > In order to be able to dereference rq->curr->mm without holding the
> > rq->lock, do you envision we should protect task reclaim with RCU-sched ?
>
> A recent discussion had Linus suggest SLAB_DESTROY_BY_RCU, although I
> think Oleg did mention it would still be 'interesting'. I've not yet had
> time to really think about that.

This might indeed be an "interesting" modification. :) Perhaps it could
come as an optimization later on?

By the way, I now remember why we start from the mm_cpumask and then
double-check the mm: the mm_cpumask serves as an approximation of the set
of CPUs that may be running threads of the process. Rather than grabbing
the rq lock on every CPU, we therefore only need to grab it for the CPUs
present in the mm_cpumask, along the lines of the sketch below.
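Here is a rough sketch of that iteration. It is illustrative only:
membarrier_expedited() and membarrier_ipi() are made-up names, cpu_rq()
and rq->lock are scheduler-internal, and a real implementation would need
a fallback path when the cpumask allocation fails.

static void membarrier_ipi(void *unused)
{
	smp_mb();	/* full barrier on each targeted CPU */
}

static void membarrier_expedited(struct mm_struct *mm)
{
	cpumask_var_t tmpmask;
	unsigned long flags;
	int cpu;

	if (!zalloc_cpumask_var(&tmpmask, GFP_KERNEL))
		return;		/* sketch: a real version must not just bail */

	smp_mb();	/* order prior accesses before the IPIs */
	get_online_cpus();

	/*
	 * mm_cpumask(mm) approximates which CPUs may be running a
	 * thread of this process; confirm rq->curr->mm under the rq
	 * lock before targeting the IPI, instead of taking the rq lock
	 * on every online CPU.
	 */
	for_each_cpu(cpu, mm_cpumask(mm)) {
		struct rq *rq = cpu_rq(cpu);

		raw_spin_lock_irqsave(&rq->lock, flags);
		if (rq->curr->mm == mm)
			cpumask_set_cpu(cpu, tmpmask);
		raw_spin_unlock_irqrestore(&rq->lock, flags);
	}

	preempt_disable();
	smp_call_function_many(tmpmask, membarrier_ipi, NULL, 1);
	preempt_enable();

	put_online_cpus();
	free_cpumask_var(tmpmask);
	smp_mb();	/* order the IPIs before later accesses */
}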
Thanks,

Mathieu

--
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com
