Re: [PATCH] kthread: Prevent unpark race which puts threads on the wrong cpu

2013-04-15 Thread Dave Hansen
I just tested out Linus's current tree (bb33db7). It is quite happy on the large system which was exciting this issue. Thanks, Thomas!

Re: [PATCH] kthread: Prevent unpark race which puts threads on the wrong cpu

2013-04-12 Thread Srivatsa S. Bhat
Hi Thomas, On 04/12/2013 04:29 PM, Thomas Gleixner wrote: > Srivatsa, > > On Fri, 12 Apr 2013, Srivatsa S. Bhat wrote: >> On 04/12/2013 02:17 AM, Thomas Gleixner wrote: + + /* + * Wait for p->on_rq to be reset to 0, to ensure that the per-cpu + * migration thread (which b

Re: [PATCH] kthread: Prevent unpark race which puts threads on the wrong cpu

2013-04-12 Thread Thomas Gleixner
Srivatsa, On Fri, 12 Apr 2013, Srivatsa S. Bhat wrote: > On 04/12/2013 02:17 AM, Thomas Gleixner wrote: > >> + > >> + /* > >> + * Wait for p->on_rq to be reset to 0, to ensure that the per-cpu > >> + * migration thread (which belongs to the stop_task sched class) > >> + * doesn't run until

Re: [PATCH] kthread: Prevent unpark race which puts threads on the wrong cpu

2013-04-12 Thread Peter Zijlstra
On Tue, 2013-04-09 at 16:38 +0200, Thomas Gleixner wrote: > The smpboot threads rely on the park/unpark mechanism which binds per > cpu threads on a particular core. Though the functionality is racy: > > CPU0 CPU1 CPU2 > unpark(T)

Re: [PATCH] kthread: Prevent unpark race which puts threads on the wrong cpu

2013-04-11 Thread Srivatsa S. Bhat
On 04/12/2013 02:17 AM, Thomas Gleixner wrote: > Srivatsa, > > On Fri, 12 Apr 2013, Srivatsa S. Bhat wrote: >> On 04/09/2013 08:08 PM, Thomas Gleixner wrote: >>> Add a new task state (TASK_PARKED) which prevents other wakeups and >>> use this state explicitely for the unpark wakeup. >>> >> >> Agai

Re: [PATCH] kthread: Prevent unpark race which puts threads on the wrong cpu

2013-04-11 Thread Thomas Gleixner
Srivatsa, On Fri, 12 Apr 2013, Srivatsa S. Bhat wrote: > On 04/09/2013 08:08 PM, Thomas Gleixner wrote: > > Add a new task state (TASK_PARKED) which prevents other wakeups and > > use this state explicitely for the unpark wakeup. > > > > Again, I think this is unnecessary. We are good as long as

Re: [PATCH] kthread: Prevent unpark race which puts threads on the wrong cpu

2013-04-11 Thread Thomas Gleixner
On Thu, 11 Apr 2013, Dave Hansen wrote: > On 04/11/2013 03:19 AM, Thomas Gleixner wrote: > > --- linux-2.6.orig/kernel/smpboot.c > > +++ linux-2.6/kernel/smpboot.c > > @@ -185,8 +185,18 @@ __smpboot_create_thread(struct smp_hotpl > > } > > get_task_struct(tsk); > > *per_cpu_ptr(ht->stor

Re: [PATCH] kthread: Prevent unpark race which puts threads on the wrong cpu

2013-04-11 Thread Srivatsa S. Bhat
On 04/09/2013 08:08 PM, Thomas Gleixner wrote: > The smpboot threads rely on the park/unpark mechanism which binds per > cpu threads on a particular core. Though the functionality is racy: > > CPU0 CPU1 CPU2 > unpark(T) wake_up_proces

Re: [PATCH] kthread: Prevent unpark race which puts threads on the wrong cpu

2013-04-11 Thread Dave Hansen
On 04/11/2013 03:19 AM, Thomas Gleixner wrote: > --- linux-2.6.orig/kernel/smpboot.c > +++ linux-2.6/kernel/smpboot.c > @@ -185,8 +185,18 @@ __smpboot_create_thread(struct smp_hotpl > } > get_task_struct(tsk); > *per_cpu_ptr(ht->store, cpu) = tsk; > - if (ht->create) > -

Re: [PATCH] kthread: Prevent unpark race which puts threads on the wrong cpu

2013-04-11 Thread Thomas Gleixner
On Thu, 11 Apr 2013, Srivatsa S. Bhat wrote: > The reason why we can't get rid of the bind in the unpark code is because, > the threads are parked during CPU offline *after* calling CPU_DOWN_PREPARE. > And during CPU_DOWN_PREPARE, the scheduler removes the CPU from the > cpu_active_mask. > So on a

Re: [PATCH] kthread: Prevent unpark race which puts threads on the wrong cpu

2013-04-11 Thread Thomas Gleixner
On Thu, 11 Apr 2013, Srivatsa S. Bhat wrote: > On 04/11/2013 04:18 PM, Srivatsa S. Bhat wrote: > > On 04/11/2013 03:49 PM, Thomas Gleixner wrote: > >> Dave, > >> > >> On Wed, 10 Apr 2013, Dave Hansen wrote: > >> > >>> I think I got a full trace this time: > >>> > >>> http://sr71.net/~dave/linux/

Re: [PATCH] kthread: Prevent unpark race which puts threads on the wrong cpu

2013-04-11 Thread Thomas Gleixner
On Thu, 11 Apr 2013, Srivatsa S. Bhat wrote: > On 04/11/2013 04:18 PM, Srivatsa S. Bhat wrote: > > On 04/11/2013 03:49 PM, Thomas Gleixner wrote: > >> Dave, > >> > >> On Wed, 10 Apr 2013, Dave Hansen wrote: > >> > >>> I think I got a full trace this time: > >>> > >>> http://sr71.net/~dave/linux/b

Re: [PATCH] kthread: Prevent unpark race which puts threads on the wrong cpu

2013-04-11 Thread Srivatsa S. Bhat
On 04/11/2013 05:13 PM, Srivatsa S. Bhat wrote: [...] > So Dave, could you kindly test the below patch on mainline? > BTW, you don't need to try out any of the previous patches that I sent, just this one is good enough. Thanks! Regards, Srivatsa S. Bhat > > diff --git a/kernel/kthread.c b/ker

Re: [PATCH] kthread: Prevent unpark race which puts threads on the wrong cpu

2013-04-11 Thread Srivatsa S. Bhat
On 04/11/2013 04:18 PM, Srivatsa S. Bhat wrote: > On 04/11/2013 03:49 PM, Thomas Gleixner wrote: >> Dave, >> >> On Wed, 10 Apr 2013, Dave Hansen wrote: >> >>> I think I got a full trace this time: >>> >>> http://sr71.net/~dave/linux/bigbox-trace.1365621899.txt.gz >>> >>> The last timestamp is p

Re: [PATCH] kthread: Prevent unpark race which puts threads on the wrong cpu

2013-04-11 Thread Srivatsa S. Bhat
On 04/11/2013 03:49 PM, Thomas Gleixner wrote: > Dave, > > On Wed, 10 Apr 2013, Dave Hansen wrote: > >> I think I got a full trace this time: >> >> http://sr71.net/~dave/linux/bigbox-trace.1365621899.txt.gz >> >> The last timestamp is pretty close to the timestamp on the console: >> >> [ 207

Re: [PATCH] kthread: Prevent unpark race which puts threads on the wrong cpu

2013-04-11 Thread Thomas Gleixner
Dave, On Wed, 10 Apr 2013, Dave Hansen wrote: > I think I got a full trace this time: > > http://sr71.net/~dave/linux/bigbox-trace.1365621899.txt.gz > > The last timestamp is pretty close to the timestamp on the console: > > [ 2071.033434] smpboot_thread_fn(): > [ 2071.033455] smpboot_th

Re: [PATCH] kthread: Prevent unpark race which puts threads on the wrong cpu

2013-04-10 Thread Dave Hansen
I think I got a full trace this time: http://sr71.net/~dave/linux/bigbox-trace.1365621899.txt.gz The last timestamp is pretty close to the timestamp on the console: [ 2071.033434] smpboot_thread_fn(): [ 2071.033455] smpboot_thread_fn() cpu: 22 159 [ 2071.033470] td->cpu: 22 [ 2071.033475

Re: [PATCH] kthread: Prevent unpark race which puts threads on the wrong cpu

2013-04-10 Thread Thomas Gleixner
On Wed, 10 Apr 2013, Thomas Gleixner wrote: > On Tue, 9 Apr 2013, Dave Hansen wrote: > > > On 04/09/2013 12:30 PM, Thomas Gleixner wrote: > > > On Tue, 9 Apr 2013, Thomas Gleixner wrote: > > > Thought more about it and found, that the stupid binding only works > > > when the task is really desched

Re: [PATCH] kthread: Prevent unpark race which puts threads on the wrong cpu

2013-04-10 Thread Thomas Gleixner
On Tue, 9 Apr 2013, Dave Hansen wrote: > On 04/09/2013 12:30 PM, Thomas Gleixner wrote: > > On Tue, 9 Apr 2013, Thomas Gleixner wrote: > > Thought more about it and found, that the stupid binding only works > > when the task is really descheduled. So there is a small window left, > > which could l

Re: [PATCH] kthread: Prevent unpark race which puts threads on the wrong cpu

2013-04-09 Thread Dave Hansen
On 04/09/2013 01:38 PM, Dave Hansen wrote: > It oopsed in exit.c: > > https://picasaweb.google.com/lh/photo/7v_Xua9I29Rar3bBdNlLu9MTjNZETYmyPJy0liipFm0?feat=directlink This was just secondary fallout after the first BUG_ON(). This exit.c thing isn't a new issue.

Re: [PATCH] kthread: Prevent unpark race which puts threads on the wrong cpu

2013-04-09 Thread Dave Hansen
On 04/09/2013 12:30 PM, Thomas Gleixner wrote: > On Tue, 9 Apr 2013, Thomas Gleixner wrote: > Thought more about it and found, that the stupid binding only works > when the task is really descheduled. So there is a small window left, > which could lead to this. Revised patch below. > > Anyway a tr

Re: [PATCH] kthread: Prevent unpark race which puts threads on the wrong cpu

2013-04-09 Thread Thomas Gleixner
On Tue, 9 Apr 2013, Thomas Gleixner wrote: > Dave, > > On Tue, 9 Apr 2013, Dave Hansen wrote: > > > Hey Thomas, > > > > I don't think the patch helped my case. Looks like the same BUG_ON(). > > > > I accidentally booted with possible_cpus=10 instead of 160. I wasn't > > able to trigger this i

Re: [PATCH] kthread: Prevent unpark race which puts threads on the wrong cpu

2013-04-09 Thread Thomas Gleixner
Dave, On Tue, 9 Apr 2013, Dave Hansen wrote: > Hey Thomas, > > I don't think the patch helped my case. Looks like the same BUG_ON(). > > I accidentally booted with possible_cpus=10 instead of 160. I wasn't > able to trigger this in that case, even repeatedly on/offlining them. > But, once I b

Re: [PATCH] kthread: Prevent unpark race which puts threads on the wrong cpu

2013-04-09 Thread Dave Hansen
Hey Thomas, I don't think the patch helped my case. Looks like the same BUG_ON(). I accidentally booted with possible_cpus=10 instead of 160. I wasn't able to trigger this in that case, even repeatedly on/offlining them. But, once I booted with possible_cpus=160, it triggered in a jiffy. Two o