Re: [RFC][PATCH] sched: Start stopper early

2015-10-26 Thread Peter Zijlstra
On Mon, Oct 26, 2015 at 03:24:36PM +0100, Michael Holzheu wrote:
> On Fri, 16 Oct 2015 14:01:25 +0200
> Heiko Carstens wrote:
>
> > On Fri, Oct 16, 2015 at 11:57:06AM +0200, Peter Zijlstra wrote:
> > > On Fri, Oct 16, 2015 at 10:22:12AM +0200, Heiko Carstens wrote:
> > > > So, actually this doesn

Re: [RFC][PATCH] sched: Start stopper early

2015-10-26 Thread Michael Holzheu
On Fri, 16 Oct 2015 14:01:25 +0200 Heiko Carstens wrote:
> On Fri, Oct 16, 2015 at 11:57:06AM +0200, Peter Zijlstra wrote:
> > On Fri, Oct 16, 2015 at 10:22:12AM +0200, Heiko Carstens wrote:
> > > So, actually this doesn't fix the bug and it _seems_ to be reproducible.
> > >
> > > [ FWIW, I will

Re: [RFC][PATCH] sched: Start stopper early

2015-10-16 Thread Heiko Carstens
On Fri, Oct 16, 2015 at 11:57:06AM +0200, Peter Zijlstra wrote:
> On Fri, Oct 16, 2015 at 10:22:12AM +0200, Heiko Carstens wrote:
> > So, actually this doesn't fix the bug and it _seems_ to be reproducible.
> >
> > [ FWIW, I will be offline for the next two weeks ]
>
> So the series from Oleg wou

Re: [RFC][PATCH] sched: Start stopper early

2015-10-16 Thread Peter Zijlstra
On Fri, Oct 16, 2015 at 10:22:12AM +0200, Heiko Carstens wrote:
> So, actually this doesn't fix the bug and it _seems_ to be reproducible.
>
> [ FWIW, I will be offline for the next two weeks ]

So the series from Oleg would be good to try; I can make a git tree for you, or otherwise stuff the lot

Re: [RFC][PATCH] sched: Start stopper early

2015-10-16 Thread Heiko Carstens
On Wed, Oct 07, 2015 at 10:41:10AM +0200, Peter Zijlstra wrote:
> Hi,
>
> So Heiko reported some 'interesting' fail where stop_two_cpus() got
> stuck in multi_cpu_stop() with one cpu waiting for another that never
> happens.
>
> It _looks_ like the 'other' cpu isn't running and the current best
>

Re: [RFC][PATCH] sched: Start stopper early

2015-10-08 Thread Oleg Nesterov
On 10/08, Oleg Nesterov wrote:
>
> To avoid the confusion, let me repeat that I am not arguing with
> this change, perhaps it makes sense too.
>
> But unless I missed something it is not really correct and can't
> fix the problem. So I still think the series I sent should be
> applied first.

...

Re: [RFC][PATCH] sched: Start stopper early

2015-10-08 Thread Oleg Nesterov
To avoid the confusion, let me repeat that I am not arguing with
this change, perhaps it makes sense too.

But unless I missed something it is not really correct and can't
fix the problem. So I still think the series I sent should be
applied first.

On 10/07, Peter Zijlstra wrote:
> static int sch

[PATCH 0/3] (Was: [RFC][PATCH] sched: Start stopper early)

2015-10-08 Thread Oleg Nesterov
On 10/07, Peter Zijlstra wrote:
>
> So Heiko reported some 'interesting' fail where stop_two_cpus() got
> stuck in multi_cpu_stop() with one cpu waiting for another that never
> happens.
>
> It _looks_ like the 'other' cpu isn't running and the current best
> theory is that we race on cpu-up and ge

Re: Re: [RFC][PATCH] sched: Start stopper early

2015-10-07 Thread kbuild test robot
Hi Oleg,

[auto build test ERROR on v4.3-rc4 -- if it's inappropriate base, please ignore]

config: x86_64-randconfig-x019-201540 (attached as .config)
reproduce:
        # save the attached .config to linux build tree
        make ARCH=x86_64

All errors (new ones prefixed by >>):

   kernel/cpu.

Re: [RFC][PATCH] sched: Start stopper early

2015-10-07 Thread Oleg Nesterov
Damn sorry for noise ;)

On 10/07, Oleg Nesterov wrote:
>
> +void stop_machine_park(int cpu)
> +{
> +	struct cpu_stopper *stopper = &per_cpu(cpu_stopper, cpu);
> +
> +	spin_lock(&stopper->lock);
> +	stopper->enabled = false;
> +	spin_unlock(&stopper->lock);

Of course, it should als

Re: [RFC][PATCH] sched: Start stopper early

2015-10-07 Thread Oleg Nesterov
On 10/07, Peter Zijlstra wrote:
>
> On Wed, Oct 07, 2015 at 02:30:46PM +0200, Oleg Nesterov wrote:
> > On 10/07, Peter Zijlstra wrote:
> > >
> > > So Heiko reported some 'interesting' fail where stop_two_cpus() got
> > > stuck in multi_cpu_stop() with one cpu waiting for another that never
> > > ha

Re: [RFC][PATCH] sched: Start stopper early

2015-10-07 Thread Peter Zijlstra
On Wed, Oct 07, 2015 at 02:30:46PM +0200, Oleg Nesterov wrote:
> On 10/07, Peter Zijlstra wrote:
> >
> > So Heiko reported some 'interesting' fail where stop_two_cpus() got
> > stuck in multi_cpu_stop() with one cpu waiting for another that never
> > happens.
> >
> > It _looks_ like the 'other' cpu

Re: [RFC][PATCH] sched: Start stopper early

2015-10-07 Thread Oleg Nesterov
On 10/07, Peter Zijlstra wrote:
>
> So Heiko reported some 'interesting' fail where stop_two_cpus() got
> stuck in multi_cpu_stop() with one cpu waiting for another that never
> happens.
>
> It _looks_ like the 'other' cpu isn't running and the current best
> theory is that we race on cpu-up and ge

[RFC][PATCH] sched: Start stopper early

2015-10-07 Thread Peter Zijlstra
Hi,

So Heiko reported some 'interesting' fail where stop_two_cpus() got
stuck in multi_cpu_stop() with one cpu waiting for another that never
happens.

It _looks_ like the 'other' cpu isn't running and the current best
theory is that we race on cpu-up and get the stop_two_cpus() call in before the