Hi,

When shutting down on Tegra20/Tegra30, we need to read and write PMIC
registers over I2C to perform the power-off sequence. Unfortunately, shutdown
sometimes fails on Tegra20 because of an I2C timeout. The cause is that the
CPU the I2C controller IRQ is affined to can be taken offline without the IRQs
affined to it being migrated, so the following I2C transactions fail (no CPU
handles that interrupt from then on).

A snippet of the shutdown code:

void kernel_power_off(void)
{
        kernel_shutdown_prepare(SYSTEM_POWER_OFF);
        :
        disable_nonboot_cpus();
        :
        machine_power_off();
}

void machine_power_off(void)
{
        machine_shutdown();
        if (pm_power_off)
                /* this is where we send the I2C write to shut down */
                pm_power_off();
}

void machine_shutdown(void)
{
#ifdef CONFIG_SMP
        smp_send_stop();
#endif
}

"smp_send_stop()" sends "IPI_CPU_STOP" to take every CPU offline except the
current one (smp_processor_id()). However, the current CPU is not always cpu0,
at least on Tegra20. For example, cpu1 might be the current CPU, in which case
cpu0 is taken offline, and that is the case in which the I2C transaction
times out.
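
To make the failure concrete, here is a small user-space model of it. This is
only a sketch, not kernel code: the "sim_" names and the two-CPU setup are
made up for illustration, and it assumes the I2C IRQ is affined to cpu0.

```c
#include <assert.h>
#include <stdbool.h>

#define SIM_NR_CPUS 2

/* Hypothetical model state; none of this exists in the kernel. */
struct sim_state {
	bool online[SIM_NR_CPUS];  /* which CPUs are still online */
	unsigned int i2c_irq_cpu;  /* CPU the I2C IRQ is affined to (assumed) */
};

/*
 * Models the effect of smp_send_stop() as described above: every CPU
 * except 'self' goes offline and IRQ affinity is left untouched.
 */
static void sim_send_stop(struct sim_state *s, unsigned int self)
{
	assert(self < SIM_NR_CPUS);
	for (unsigned int cpu = 0; cpu < SIM_NR_CPUS; cpu++)
		if (cpu != self)
			s->online[cpu] = false;
}

/* An I2C transfer completes only if the IRQ's target CPU is online. */
static bool sim_i2c_completes(const struct sim_state *s)
{
	return s->online[s->i2c_irq_cpu];
}

/*
 * Returns true if the power-off I2C write would complete when 'self'
 * is the CPU that calls smp_send_stop().
 */
static bool sim_power_off_ok(unsigned int self)
{
	struct sim_state s = { .online = { true, true }, .i2c_irq_cpu = 0 };

	sim_send_stop(&s, self);
	return sim_i2c_completes(&s);
}
```

In this model sim_power_off_ok(0) is true and sim_power_off_ok(1) is false,
which matches what we see: shutdown only works when cpu0 is the current CPU.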

In the normal case, "disable_nonboot_cpus()" disables all CPUs except cpu0,
so we would not hit the problem described here: cpu0 would always be the
current CPU when "smp_send_stop()" is called. But "disable_nonboot_cpus()"
only does this when "CONFIG_PM_SLEEP_SMP" is enabled, which is not the case
for Tegra20/Tegra30. We don't support suspend yet, so it can't be enabled.

There are two known fixes for this. The first is to enable suspend
(ARCH_SUSPEND_POSSIBLE) so that cpu0 is the only online CPU when
"machine_shutdown()" runs. The second is to add a call to "migrate_irqs()" in
"ipi_cpu_stop()" so that all IRQs are migrated to the CPU that stays active.
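
The intent of the second fix can be sketched with a user-space model. Again
this is not kernel code and not a patch: the "fix_" names, the IRQ count and
the initial affinities are made up, and fix_migrate_irqs() only models what
I expect "migrate_irqs()" to achieve, not how it is implemented.

```c
#include <assert.h>
#include <stdbool.h>

#define FIX_NR_CPUS 2
#define FIX_NR_IRQS 4

/* Hypothetical model state; none of this exists in the kernel. */
struct fix_state {
	bool online[FIX_NR_CPUS];
	unsigned int affinity[FIX_NR_IRQS];  /* target CPU of each IRQ */
};

/*
 * Retarget every IRQ whose CPU is offline to the first online CPU.
 * Returns the number of IRQs moved.
 */
static unsigned int fix_migrate_irqs(struct fix_state *s)
{
	unsigned int target = FIX_NR_CPUS, moved = 0;

	for (unsigned int cpu = 0; cpu < FIX_NR_CPUS; cpu++) {
		if (s->online[cpu]) {
			target = cpu;
			break;
		}
	}
	if (target == FIX_NR_CPUS)
		return 0;  /* no online CPU left to take the IRQs */

	for (unsigned int irq = 0; irq < FIX_NR_IRQS; irq++) {
		if (!s->online[s->affinity[irq]]) {
			s->affinity[irq] = target;
			moved++;
		}
	}
	return moved;
}

/* Models ipi_cpu_stop() with the proposed change: go offline, then migrate. */
static void fix_ipi_cpu_stop(struct fix_state *s, unsigned int cpu)
{
	assert(cpu < FIX_NR_CPUS);
	s->online[cpu] = false;
	fix_migrate_irqs(s);
}

/*
 * Returns true if every IRQ still targets an online CPU after 'self'
 * has stopped all other CPUs.
 */
static bool fix_shutdown_from(unsigned int self)
{
	struct fix_state s = {
		.online = { true, true },
		.affinity = { 0, 0, 1, 0 },
	};

	for (unsigned int cpu = 0; cpu < FIX_NR_CPUS; cpu++)
		if (cpu != self)
			fix_ipi_cpu_stop(&s, cpu);

	for (unsigned int irq = 0; irq < FIX_NR_IRQS; irq++)
		if (!s.online[s.affinity[irq]])
			return false;
	return true;
}
```

With the migration step in place, fix_shutdown_from() is true for both cpu0
and cpu1, so in the model it no longer matters which CPU ends up calling
"smp_send_stop()".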

Could someone familiar with the ARM SMP design help to answer my two questions?

1. Does it make sense that "smp_processor_id()" can be something other than
   cpu0 in "smp_send_stop()"? On Tegra30 it is always cpu0, but on Tegra20
   it is about 50-50, and I just can't find the magic that decides it.

2. If the current CPU is not necessarily cpu0 in "smp_send_stop()", does it
   make sense to add "migrate_irqs()" to "ipi_cpu_stop()"? Or is there
   another fix that makes more sense?

Thanks,
Bill