On Mon, Nov 11, 2013 at 05:20:22PM +0100, Peter Zijlstra wrote:
> On Mon, Nov 11, 2013 at 03:47:11PM +0800, Michael wang wrote:
> > Hi, Fengguang
> >
> > On 11/10/2013 06:16 PM, Fengguang Wu wrote:
> > > Greetings,
> > >
> > > I got the below dmesg and the first bad commit is
> >
> > I guess this will disappear when '!CONFIG_RCU_BOOST'...
> >
> > AFAIK, if the rsp was in boost mode, we count on smpboot-thread
> > 'rcu_cpu_thread_spec' to finish the callback, which will be
> > parked before do sync-rcu inside _cpu_down(), if that was true,
> > then the sync will never finish...
> >
> > May be some brainless fix like this?
> >
> >
> >
> > diff --git a/kernel/cpu.c b/kernel/cpu.c
> > index 63aa50d..aa24338 100644
> > --- a/kernel/cpu.c
> > +++ b/kernel/cpu.c
> > @@ -306,7 +306,6 @@ static int __ref _cpu_down(unsigned int cpu, int tasks_frozen)
> >                          __func__, cpu);
> >                  goto out_release;
> >          }
> > -        smpboot_park_threads(cpu);
> >
> >          /*
> >           * By now we've cleared cpu_active_mask, wait for all preempt-disabled
> > @@ -321,6 +320,8 @@ static int __ref _cpu_down(unsigned int cpu, int tasks_frozen)
> >  #endif
> >          synchronize_rcu();
> >
> > +        smpboot_park_threads(cpu);
> > +
> >          /*
> >           * So now all preempt/rcu users must observe !cpu_active().
> >           */
>
> Good thinking.. Wu did this cure stuff?

Yes, it fixed the problem.

Tested-by: Fengguang Wu <fengguang...@intel.com>

/kernel/i386-randconfig-j3-11101308/484f4e66a6a1102edf02407479f6f7632aade0f3

+--------------------------------------------------+--------------+--------------+
|                                                  | e5137b50a064 | 484f4e66a6a1 |
+--------------------------------------------------+--------------+--------------+
| boot_successes                                   | 42           | 100          |
| boot_failures                                    | 58           |              |
| INFO:task_blocked_for_more_than_seconds          | 58           |              |
| Kernel_panic-not_syncing:hung_task:blocked_tasks | 58           |              |
+--------------------------------------------------+--------------+--------------+

/kernel/x86_64-randconfig-x4-1108/484f4e66a6a1102edf02407479f6f7632aade0f3

+------------------------------------------------------------------------------------+-----------+--------------+--------------+
|                                                                                    | v3.12-rc7 | e5137b50a064 | 484f4e66a6a1 |
+------------------------------------------------------------------------------------+-----------+--------------+--------------+
| boot_successes                                                                     | 59        | 34           | 100          |
| has_kernel_error_warning                                                           | 4         |              |              |
| BUG:kernel_early_hang_without_any_printk_output                                    | 4         |              |              |
| boot_failures                                                                      | 0         | 66           |              |
| INFO:task_blocked_for_more_than_seconds                                            | 0         | 66           |              |
| INFO:NMI_handler(arch_trigger_all_cpu_backtrace_handler)took_too_long_to_run:msecs | 0         | 55           |              |
| Kernel_panic-not_syncing:hung_task:blocked_tasks                                   | 0         | 66           |              |
+------------------------------------------------------------------------------------+-----------+--------------+--------------+