On Wed, Apr 10, 2013 at 01:16:20PM +0200, Ingo Molnar wrote:
> 
> * Robin Holt <h...@sgi.com> wrote:
> 
> > On Mon, Apr 08, 2013 at 09:11:06AM -0700, H. Peter Anvin wrote:
> > > On 04/08/2013 08:57 AM, Ingo Molnar wrote:
> > > > 
> > > > I think the original commit:
> > > > 
> > > >   f96972f2dc63 kernel/sys.c: call disable_nonboot_cpus() in kernel_restart()
> > > > 
> > > > actually regressed your 1024 CPU systems, and should possibly be reverted or fixed
> > > > in some other fashion - such as by migrating to the primary CPU (on architectures
> > > > that require that), instead of hotplug offlining every secondary CPU on every
> > > > architecture!
> > > > 
> > > > Alternatively, disable_nonboot_cpus() could perhaps be improved to down CPUs in
> > > > parallel: issue the CPU-down requests to every CPU, then wait for them to complete
> > > > - instead of the loop over every CPU?
> > > > 
> > > > This would be the conceptual counter part to parallel boot up of CPUs - something
> > > > SGI might be interested in as well?
> > > > 
> > > 
> > > Migrating to the boot processor and then calling stop_machine() to
> > > defang any other processors should be sufficient, no?
> > > 
> > > I don't know if there is any reason to deschedule all tasks?
> > 
> > My reading of the original commit indicated that some architecture's
> > firmware needs the boot cpu to be the one initiating reboot.
> > 
> > If that is correct, then I can not see why a stop_machine() implementation
> > will not work.
> > 
> > Since this is in generic kernel code, how can I proceed?
> 
> I think rebooting on the same CPU where we booted up is something worth having in
> general, as a firmware robustness feature. (assuming the CPU in question is still
> online)
> 
> We have similar constraints in the suspend code for example - some x86 firmware
> breaks if suspend related ACPI calls are not done on the boot CPU ...
> 
> So how about restoring the old "just reboot, don't shut down the others" behavior,
> extended with a "reboot on the CPU that booted up" reboot affinity logic?

Just want to be sure I am going in the right direction, but in the shutdown and
reboot case, you would support something like:

diff --git a/kernel/sys.c b/kernel/sys.c
index 39c9c4a..35845c5 100644
--- a/kernel/sys.c
+++ b/kernel/sys.c
@@ -358,6 +358,18 @@ int unregister_reboot_notifier(struct notifier_block *nb)
 }
 EXPORT_SYMBOL(unregister_reboot_notifier);
 
+void migrate_to_boot_cpu(void)
+{
+       cpumask_t *shutdown_cpu_mask;
+
+       shutdown_cpu_mask = kzalloc(sizeof(cpumask_t), GFP_KERNEL);
+       if (shutdown_cpu_mask) {
+               cpumask_set_cpu(0, shutdown_cpu_mask);
+               cpumask_and(shutdown_cpu_mask, shutdown_cpu_mask, cpu_online_mask);
+               set_cpus_allowed_ptr(current, shutdown_cpu_mask);
+       }
+}
+
 /**
  *     kernel_restart - reboot the system
  *     @cmd: pointer to buffer containing command to execute for restart
@@ -369,7 +381,7 @@ EXPORT_SYMBOL(unregister_reboot_notifier);
 void kernel_restart(char *cmd)
 {
        kernel_restart_prepare(cmd);
-       disable_nonboot_cpus();
+       migrate_to_boot_cpu();
        if (!cmd)
                printk(KERN_EMERG "Restarting system.\n");
        else
@@ -413,7 +425,7 @@ void kernel_power_off(void)
        kernel_shutdown_prepare(SYSTEM_POWER_OFF);
        if (pm_power_off_prepare)
                pm_power_off_prepare();
-       disable_nonboot_cpus();
+       migrate_to_boot_cpu();
        syscore_shutdown();
        printk(KERN_EMERG "Power down.\n");
        kmsg_dump(KMSG_DUMP_POWEROFF);
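
For anyone who wants to experiment with the "reboot affinity" idea outside the
kernel, here is a rough userspace analogue. It is not the patch above and not
kernel code: it uses the Linux sched_getaffinity()/sched_setaffinity() calls
in place of set_cpus_allowed_ptr(), and the helper name pin_to_boot_cpu() is
made up for illustration. It also assumes that "boot CPU" means CPU 0, and it
falls back to the lowest-numbered allowed CPU when CPU 0 is not available,
which is one possible policy rather than something settled in this thread.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>
#include <stdio.h>

/*
 * Hypothetical helper (userspace analogy only): restrict the calling
 * thread to CPU 0 if it is in the current affinity mask, otherwise to
 * the lowest-numbered CPU that is.  Returns the chosen CPU, or -1 on
 * failure.  Systems with more CPUs than CPU_SETSIZE would need the
 * dynamically sized CPU_ALLOC() variants instead.
 */
int pin_to_boot_cpu(void)
{
	cpu_set_t allowed, target;
	int cpu, chosen = -1;

	/* Userspace stand-in for consulting cpu_online_mask. */
	if (sched_getaffinity(0, sizeof(allowed), &allowed) != 0)
		return -1;

	if (CPU_ISSET(0, &allowed)) {
		chosen = 0;
	} else {
		for (cpu = 0; cpu < CPU_SETSIZE; cpu++) {
			if (CPU_ISSET(cpu, &allowed)) {
				chosen = cpu;
				break;
			}
		}
	}
	if (chosen < 0)
		return -1;

	/* Build a single-CPU mask and migrate the calling thread to it. */
	CPU_ZERO(&target);
	CPU_SET(chosen, &target);
	if (sched_setaffinity(0, sizeof(target), &target) != 0)
		return -1;

	return chosen;
}
```

Calling pin_to_boot_cpu() right before the final shutdown step mirrors where
migrate_to_boot_cpu() sits in the diff. Unlike the diff, the mask here lives
on the stack, so nothing has to be allocated or freed.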
