> -----Original Message-----
> From: Colin Cross [mailto:ccr...@google.com]
> Sent: August 23, 2013 5:08
> To: Neil Zhang
> Cc: Rafael J. Wysocki; Daniel Lezcano; Linux PM list; lkml
> Subject: Re: [PATCH] cpuidle: coupled: fix dead loop corner case
>
> On Mon, Aug 19, 2013 at 10:17 PM, Neil Zhang <zhan...@marvell.com> wrote:
> > There is a corner case when no peripheral irqs are routed to the
> > secondary cores.
> > Taking a dual core system as an example, the sequence is as follows:
> >
> >     Core 0                              Core1
> > 1.                                      set waiting bit and enter waiting loop
> > 2. set waiting bit and poke core1
> > 3.                                      clear poke in irq and enter safe state
> > 4. set ready bit and enter ready loop
> >
> > Since no peripheral irq is routed to core 1, it will stay in the safe
> > state forever, and core 0 will dead loop in the following code.
> > while (!cpuidle_coupled_cpus_ready(coupled)) {
> > 	/* Check if any other cpus bailed out of idle. */
> > 	if (!cpuidle_coupled_cpus_waiting(coupled))
> > }
> >
> > The solution is to not let the secondary core enter the safe state when
> > it has already handled the poke interrupt.
> >
> > Signed-off-by: Neil Zhang <zhan...@marvell.com>
> > Reviewed-by: Fangsuo Wu <f...@marvell.com>
> > ---
> >  drivers/cpuidle/coupled.c |    7 +++++++
> >  1 files changed, 7 insertions(+), 0 deletions(-)
> >
> > diff --git a/drivers/cpuidle/coupled.c b/drivers/cpuidle/coupled.c
> > index 2a297f8..a37c718 100644
> > --- a/drivers/cpuidle/coupled.c
> > +++ b/drivers/cpuidle/coupled.c
> > @@ -119,6 +119,7 @@ struct cpuidle_coupled {
> >  #define CPUIDLE_COUPLED_NOT_IDLE	(-1)
> >
> >  static DEFINE_MUTEX(cpuidle_coupled_lock);
> > +static DEFINE_PER_CPU(bool, poke_sync);
> >  static DEFINE_PER_CPU(struct call_single_data, cpuidle_coupled_poke_cb);
> >
> >  /*
> > @@ -295,6 +296,7 @@ static void cpuidle_coupled_poked(void *info)
> >  {
> >  	int cpu = (unsigned long)info;
> >  	cpumask_clear_cpu(cpu, &cpuidle_coupled_poked_mask);
> > +	__this_cpu_write(poke_sync, true);
> >  }
> >
> >  /**
> > @@ -473,6 +475,7 @@ retry:
> >  	 * allowed for a single cpu.
> >  	 */
> >  	while (!cpuidle_coupled_cpus_waiting(coupled)) {
> > +		__this_cpu_write(poke_sync, false);
> >  		if (cpuidle_coupled_clear_pokes(dev->cpu)) {
> >  			cpuidle_coupled_set_not_waiting(dev->cpu, coupled);
> >  			goto out;
> > @@ -483,6 +486,10 @@ retry:
> >  			goto out;
> >  		}
> >
> > +		if (cpuidle_coupled_cpus_waiting(coupled)
> > +				&& __this_cpu_read(poke_sync))
> > +			break;
> > +
> >  		entered_state = cpuidle_enter_state(dev, drv,
> >  			dev->safe_state_index);
> >  	}
> > --
> > 1.7.4.1
>
> I have a similar patch that avoids adding another check for
> cpuidle_coupled_cpus_waiting, and uses the return value from
> cpuidle_coupled_clear_pokes instead of adding a percpu bool.  I will post it
> shortly.
>
> Do you have a test case that can reproduce this easily?

It's not easy to reproduce. We have only caught it once so far.

Best Regards,
Neil Zhang
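
Since the race is hard to hit on hardware, a minimal user-space sketch in C may help make the sequence from the patch description concrete. It is a toy model, not kernel code: `struct coupled_model`, `core1_gets_stuck()` and the single-threaded replay of the four steps are assumptions made for illustration. Only the `poke_sync` flag and the overall idea are taken from the patch.

```c
#include <stdbool.h>

/* Toy model of two coupled cores. Hypothetical; not kernel code. */
struct coupled_model {
	int  waiting_count; /* cores that have set their waiting bit */
	bool poke_pending;  /* poke IPI sent to core 1 but not yet handled */
	bool poke_sync;     /* models the per-cpu flag added by the patch */
};

/*
 * Replays the four-step sequence from the patch description on one thread.
 * Returns true if core 1 ends up in the safe state with no wakeup pending.
 */
bool core1_gets_stuck(bool with_fix)
{
	struct coupled_model m = { 0, false, false };

	/* Step 1: core 1 sets its waiting bit and enters the waiting loop. */
	m.waiting_count++;
	if (m.waiting_count == 2)
		return false;	/* loop condition already satisfied */
	m.poke_sync = false;	/* the patched loop clears the flag first */

	/* Step 2: core 0 sets its waiting bit and pokes core 1. */
	m.waiting_count++;
	m.poke_pending = true;

	/* Step 3: core 1 handles the poke interrupt while clearing pokes. */
	if (m.poke_pending) {
		m.poke_pending = false;
		m.poke_sync = true;
	}

	/* Patched check: all cores are waiting and the poke was seen. */
	if (with_fix && m.waiting_count == 2 && m.poke_sync)
		return false;	/* break out instead of entering safe state */

	/* Core 1 enters the safe state; in step 4 core 0 spins waiting for it. */
	return !m.poke_pending;
}
```

Calling `core1_gets_stuck(false)` models the unpatched loop and `core1_gets_stuck(true)` the patched one. The real race depends on timing and irq routing that this replay does not capture.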