On Mon, Nov 11, 2013 at 05:20:22PM +0100, Peter Zijlstra wrote:
> On Mon, Nov 11, 2013 at 03:47:11PM +0800, Michael wang wrote:
> > Hi, Fengguang
> > 
> > On 11/10/2013 06:16 PM, Fengguang Wu wrote:
> > > Greetings,
> > > 
> > > I got the below dmesg and the first bad commit is
> > 
> > I guess this will disappear when '!CONFIG_RCU_BOOST'...
> > 
> > > AFAIK, if the rsp is in boost mode, we count on the smpboot thread
> > > 'rcu_cpu_thread_spec' to finish the callbacks, but that thread is
> > > parked before the synchronize_rcu() inside _cpu_down(). If that is
> > > true, the sync will never finish...
> > 
> > > Maybe some brainless fix like this?
> > 
> > 
> > 
> > diff --git a/kernel/cpu.c b/kernel/cpu.c
> > index 63aa50d..aa24338 100644
> > --- a/kernel/cpu.c
> > +++ b/kernel/cpu.c
> > @@ -306,7 +306,6 @@ static int __ref _cpu_down(unsigned int cpu, int 
> > tasks_frozen)
> >                                 __func__, cpu);
> >                 goto out_release;
> >         }
> > -       smpboot_park_threads(cpu);
> >  
> >         /*
> > >          * By now we've cleared cpu_active_mask, wait for all preempt-disabled
> > @@ -321,6 +320,8 @@ static int __ref _cpu_down(unsigned int cpu, int 
> > tasks_frozen)
> >  #endif
> >         synchronize_rcu();
> >  
> > +       smpboot_park_threads(cpu);
> > +
> >         /*
> >          * So now all preempt/rcu users must observe !cpu_active().
> >          */
> 
> Good thinking... Wu, did this cure stuff?

Yes, it fixed the problem.

Tested-by: Fengguang Wu <fengguang...@intel.com>
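
For anyone following along, the ordering problem Michael describes can be
modelled in a few lines of user-space C. This is only a toy sketch, not
kernel code: struct toy_cpu, toy_park(), toy_synchronize() and the two
toy_cpu_down_*() helpers are hypothetical names made up for illustration,
and the "hang" is reported as a false return value instead of blocking.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model (hypothetical, not kernel code) of one CPU being taken down. */
struct toy_cpu {
	bool helper_parked;     /* stands in for smpboot_park_threads() having run */
	bool callbacks_pending; /* work that only the per-CPU helper can finish */
};

/* Stand-in for parking the per-CPU helper thread. */
static void toy_park(struct toy_cpu *c)
{
	c->helper_parked = true;
}

/*
 * Stand-in for synchronize_rcu(): returns true if the wait can complete,
 * false if it would block forever because the helper is already parked.
 */
static bool toy_synchronize(struct toy_cpu *c)
{
	if (c->callbacks_pending && c->helper_parked)
		return false;             /* nobody left to run the callbacks */
	c->callbacks_pending = false;     /* helper ran and drained them */
	return true;
}

/* Order before the patch: park first, then wait. */
static bool toy_cpu_down_old(void)
{
	struct toy_cpu c = { .helper_parked = false, .callbacks_pending = true };

	toy_park(&c);
	return toy_synchronize(&c);
}

/* Order after the patch: wait first, then park. */
static bool toy_cpu_down_new(void)
{
	struct toy_cpu c = { .helper_parked = false, .callbacks_pending = true };
	bool ok = toy_synchronize(&c);

	toy_park(&c);
	return ok;
}
```

Calling assert(!toy_cpu_down_old()) and assert(toy_cpu_down_new()) from a
main() shows the difference: in this model only the reordered sequence lets
the wait complete.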


/kernel/i386-randconfig-j3-11101308/484f4e66a6a1102edf02407479f6f7632aade0f3

+--------------------------------------------------+--------------+--------------+
|                                                  | e5137b50a064 | 484f4e66a6a1 |
+--------------------------------------------------+--------------+--------------+
| boot_successes                                   | 42           | 100          |
| boot_failures                                    | 58           |              |
| INFO:task_blocked_for_more_than_seconds          | 58           |              |
| Kernel_panic-not_syncing:hung_task:blocked_tasks | 58           |              |
+--------------------------------------------------+--------------+--------------+

/kernel/x86_64-randconfig-x4-1108/484f4e66a6a1102edf02407479f6f7632aade0f3

+------------------------------------------------------------------------------------+-----------+--------------+--------------+
|                                                                                    | v3.12-rc7 | e5137b50a064 | 484f4e66a6a1 |
+------------------------------------------------------------------------------------+-----------+--------------+--------------+
| boot_successes                                                                     | 59        | 34           | 100          |
| has_kernel_error_warning                                                           | 4         |              |              |
| BUG:kernel_early_hang_without_any_printk_output                                    | 4         |              |              |
| boot_failures                                                                      | 0         | 66           |              |
| INFO:task_blocked_for_more_than_seconds                                            | 0         | 66           |              |
| INFO:NMI_handler(arch_trigger_all_cpu_backtrace_handler)took_too_long_to_run:msecs | 0         | 55           |              |
| Kernel_panic-not_syncing:hung_task:blocked_tasks                                   | 0         | 66           |              |
+------------------------------------------------------------------------------------+-----------+--------------+--------------+
