On Mon, Aug 19, 2013 at 10:17 PM, Neil Zhang <zhan...@marvell.com> wrote:
> There is a corner case when no peripheral irqs are routed to the
> secondary cores.
> Let's take a dual-core system as an example; the sequence is as follows:
>
>                 Core 0                          Core1
> 1.                                 set waiting bit and enter waiting loop
> 2. set waiting bit and poke core1
> 3.                                 clear poke in irq and enter safe state
> 4. set ready bit and enter ready loop
>
> Since no peripheral irq is routed to core 1, it will stay in the
> safe state forever, and core 0 will loop forever in the following code:
>         while (!cpuidle_coupled_cpus_ready(coupled)) {
>                 /* Check if any other cpus bailed out of idle. */
>                 if (!cpuidle_coupled_cpus_waiting(coupled))
>         }
>
> The solution is to keep the secondary core from entering the safe state
> once it has already handled the poke interrupt.
>
> Signed-off-by: Neil Zhang <zhan...@marvell.com>
> Reviewed-by: Fangsuo Wu <f...@marvell.com>
> ---
>  drivers/cpuidle/coupled.c |    7 +++++++
>  1 files changed, 7 insertions(+), 0 deletions(-)
>
> diff --git a/drivers/cpuidle/coupled.c b/drivers/cpuidle/coupled.c
> index 2a297f8..a37c718 100644
> --- a/drivers/cpuidle/coupled.c
> +++ b/drivers/cpuidle/coupled.c
> @@ -119,6 +119,7 @@ struct cpuidle_coupled {
>  #define CPUIDLE_COUPLED_NOT_IDLE       (-1)
>
>  static DEFINE_MUTEX(cpuidle_coupled_lock);
> +static DEFINE_PER_CPU(bool, poke_sync);
>  static DEFINE_PER_CPU(struct call_single_data, cpuidle_coupled_poke_cb);
>
>  /*
> @@ -295,6 +296,7 @@ static void cpuidle_coupled_poked(void *info)
>  {
>         int cpu = (unsigned long)info;
>         cpumask_clear_cpu(cpu, &cpuidle_coupled_poked_mask);
> +       __this_cpu_write(poke_sync, true);
>  }
>
>  /**
> @@ -473,6 +475,7 @@ retry:
>          * allowed for a single cpu.
>          */
>         while (!cpuidle_coupled_cpus_waiting(coupled)) {
> +               __this_cpu_write(poke_sync, false);
>                 if (cpuidle_coupled_clear_pokes(dev->cpu)) {
>                         cpuidle_coupled_set_not_waiting(dev->cpu, coupled);
>                         goto out;
> @@ -483,6 +486,10 @@ retry:
>                         goto out;
>                 }
>
> +               if (cpuidle_coupled_cpus_waiting(coupled)
> +                       && __this_cpu_read(poke_sync))
> +                       break;
> +
>                 entered_state = cpuidle_enter_state(dev, drv,
>                         dev->safe_state_index);
>         }
> --
> 1.7.4.1
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/

I have a similar patch that avoids adding another check for
cpuidle_coupled_cpus_waiting, and uses the return value from
cpuidle_coupled_clear_pokes instead of adding a percpu bool.  I will
post it shortly.
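
The idea of acting on the return value of the poke-clearing step can be
illustrated with a small userspace model. This is only a sketch and not the
patch referred to above: struct sim, clear_pokes() and wait_loop() are
hypothetical stand-ins, not kernel APIs, and the model assumes that a poke is
the only thing that can wake core 1 (no peripheral irqs). It replays steps
1-3 from the quoted sequence by letting core 0 "arrive" between the loop
condition check and the poke check.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of core 1's view of the coupled state. */
struct sim {
	int waiting;             /* cpus that have set their waiting bit */
	int ncpus;               /* cpus in the coupled set */
	bool poke_pending;       /* a poke is pending for core 1 */
	bool core0_arrives_next; /* inject step 2 before the next poke check */
	int safe_entries;        /* times core 1 entered the safe state */
};

static bool all_waiting(const struct sim *s)
{
	return s->waiting == s->ncpus;
}

/* Handle a pending poke, if any. Returns true if a poke was handled. */
static bool clear_pokes(struct sim *s)
{
	bool handled;

	if (s->core0_arrives_next) {
		/* Step 2: core 0 sets its waiting bit and pokes core 1. */
		s->core0_arrives_next = false;
		s->waiting++;
		s->poke_pending = true;
	}
	handled = s->poke_pending; /* step 3: poke cleared "in irq" */
	s->poke_pending = false;
	return handled;
}

/*
 * Core 1's waiting loop. Returns 0 if the loop exits because all cpus
 * are waiting, or -1 if core 1 would sleep in the safe state with no
 * wakeup source left (the hang described in the quoted message).
 */
static int wait_loop(struct sim *s, bool recheck_after_poke)
{
	assert(s->ncpus > 0);

	while (!all_waiting(s)) {
		bool poked = clear_pokes(s);

		/* A handled poke means state may have changed: re-check. */
		if (poked && recheck_after_poke)
			continue;

		s->safe_entries++; /* stand-in for entering the safe state */
		if (!s->poke_pending)
			return -1; /* nothing left to wake this cpu */
	}
	return 0;
}
```

With waiting = 1, ncpus = 2 and core0_arrives_next = true, calling
wait_loop() with recheck_after_poke = false models the reported hang, while
recheck_after_poke = true leaves the loop without entering the safe state.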

Do you have a test case that can reproduce this easily?