Re: [PATCH V2] cpuidle: Handle tick_broadcast_enter() failure gracefully

2015-05-07 Thread Rafael J. Wysocki
On Thursday, May 07, 2015 11:17:21 PM Preeti U Murthy wrote:
> When a CPU has to enter an idle state where tick stops, it makes a call
> to tick_broadcast_enter(). The call will fail if this CPU is the
> broadcast CPU. Today, under such a circumstance, the arch cpuidle code
> handles this CPU.  This is not convincing because not only are we not
> aware what the arch cpuidle code does, but we also do not account for
> the idle state residency time and usage of such a CPU.
> 
> This scenario can be handled better by simply asking the cpuidle
> governor to choose an idle state where in ticks do not stop. To
> accommodate this change move the setting of runqueue idle state from the
> core to the cpuidle driver, else the rq->idle_state will be set wrong.
> 
> Signed-off-by: Preeti U Murthy 
> ---
> Changes from V1: https://lkml.org/lkml/2015/5/7/24
> Rebased on the latest linux-pm/bleeding-edge
> 
>  drivers/cpuidle/cpuidle.c  |   21 +
>  drivers/cpuidle/governors/ladder.c |   13 ++---
>  drivers/cpuidle/governors/menu.c   |6 +-
>  include/linux/cpuidle.h|6 +++---
>  include/linux/sched.h  |   16 
>  kernel/sched/core.c|   17 +
>  kernel/sched/fair.c|2 +-
>  kernel/sched/idle.c|8 +---
>  kernel/sched/sched.h   |   24 
>  9 files changed, 70 insertions(+), 43 deletions(-)
> 
> diff --git a/drivers/cpuidle/cpuidle.c b/drivers/cpuidle/cpuidle.c
> index 8c24f95..b7e86f4 100644
> --- a/drivers/cpuidle/cpuidle.c
> +++ b/drivers/cpuidle/cpuidle.c
> @@ -21,6 +21,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>  
>  #include "cpuidle.h"
> @@ -168,10 +169,17 @@ int cpuidle_enter_state(struct cpuidle_device *dev, 
> struct cpuidle_driver *drv,
>* CPU as a broadcast timer, this call may fail if it is not available.
>*/
>   if (broadcast && tick_broadcast_enter()) {
> - default_idle_call();
> - return -EBUSY;
> + index = cpuidle_select(drv, dev, !broadcast);

No, you can't do that.

This code path may be used by suspend-to-idle and that should not call
cpuidle_select().

What's needed here seems to be a fallback mechanism like "choose the
deepest state shallower than X and such that it won't stop the tick".
You don't really need to run a full governor for that.


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH v3 4/6] cpufreq: powernv: Call throttle_check() on receiving OCC_THROTTLE

2015-05-07 Thread Rafael J. Wysocki
On Thursday, May 07, 2015 05:49:22 PM Preeti U Murthy wrote:
> On 05/05/2015 02:11 PM, Preeti U Murthy wrote:
> > On 05/05/2015 12:03 PM, Shilpasri G Bhat wrote:
> >> Hi Preeti,
> >>
> >> On 05/05/2015 09:30 AM, Preeti U Murthy wrote:
> >>> Hi Shilpa,
> >>>
> >>> On 05/04/2015 02:24 PM, Shilpasri G Bhat wrote:
> >>>> Re-evaluate the chip's throttled state on recieving OCC_THROTTLE
> >>>> notification by executing *throttle_check() on any one of the cpu on
> >>>> the chip. This is a sanity check to verify if we were indeed
> >>>> throttled/unthrottled after receiving OCC_THROTTLE notification.
> >>>>
> >>>> We cannot call *throttle_check() directly from the notification
> >>>> handler because we could be handling chip1's notification in chip2. So
> >>>> initiate an smp_call to execute *throttle_check(). We are irq-disabled
> >>>> in the notification handler, so use a worker thread to smp_call
> >>>> throttle_check() on any of the cpu in the chipmask.
> >>>
> >>> I see that the first patch takes care of reporting *per-chip* throttling
> >>> for pmax capping condition. But where are we taking care of reporting
> >>> "pstate set to safe" and "freq control disabled" scenarios per-chip ?
> >>>
> >>
> >> IMO let us not have "psafe" and "freq control disabled" states managed 
> >> per-chip.
> >> Because when the above two conditions occur it is likely to happen across 
> >> all
> >> chips during an OCC reset cycle. So I am setting 'throttled' to false on
> >> OCC_ACTIVE and re-verifying if it actually is the case by invoking
> >> *throttle_check().
> > 
> > Alright like I pointed in the previous reply, a comment to indicate that
> > psafe and freq control disabled conditions will fail when occ is
> > inactive and that all chips face the consequence of this will help.
> 
> From your explanation on the thread of the first patch of this series,
> this will not be required.
> 
> So,
> Reviewed-by: Preeti U Murthy 

OK, so is the whole series reviewed now?


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH v3 4/6] cpufreq: powernv: Call throttle_check() on receiving OCC_THROTTLE

2015-05-08 Thread Rafael J. Wysocki
On Friday, May 08, 2015 09:16:44 AM Preeti U Murthy wrote:
> On 05/08/2015 02:29 AM, Rafael J. Wysocki wrote:
> > On Thursday, May 07, 2015 05:49:22 PM Preeti U Murthy wrote:
> >> On 05/05/2015 02:11 PM, Preeti U Murthy wrote:
> >>> On 05/05/2015 12:03 PM, Shilpasri G Bhat wrote:
> >>>> Hi Preeti,
> >>>>
> >>>> On 05/05/2015 09:30 AM, Preeti U Murthy wrote:
> >>>>> Hi Shilpa,
> >>>>>
> >>>>> On 05/04/2015 02:24 PM, Shilpasri G Bhat wrote:
> >>>>>> Re-evaluate the chip's throttled state on recieving OCC_THROTTLE
> >>>>>> notification by executing *throttle_check() on any one of the cpu on
> >>>>>> the chip. This is a sanity check to verify if we were indeed
> >>>>>> throttled/unthrottled after receiving OCC_THROTTLE notification.
> >>>>>>
> >>>>>> We cannot call *throttle_check() directly from the notification
> >>>>>> handler because we could be handling chip1's notification in chip2. So
> >>>>>> initiate an smp_call to execute *throttle_check(). We are irq-disabled
> >>>>>> in the notification handler, so use a worker thread to smp_call
> >>>>>> throttle_check() on any of the cpu in the chipmask.
> >>>>>
> >>>>> I see that the first patch takes care of reporting *per-chip* throttling
> >>>>> for pmax capping condition. But where are we taking care of reporting
> >>>>> "pstate set to safe" and "freq control disabled" scenarios per-chip ?
> >>>>>
> >>>>
> >>>> IMO let us not have "psafe" and "freq control disabled" states managed 
> >>>> per-chip.
> >>>> Because when the above two conditions occur it is likely to happen 
> >>>> across all
> >>>> chips during an OCC reset cycle. So I am setting 'throttled' to false on
> >>>> OCC_ACTIVE and re-verifying if it actually is the case by invoking
> >>>> *throttle_check().
> >>>
> >>> Alright like I pointed in the previous reply, a comment to indicate that
> >>> psafe and freq control disabled conditions will fail when occ is
> >>> inactive and that all chips face the consequence of this will help.
> >>
> >> From your explanation on the thread of the first patch of this series,
> >> this will not be required.
> >>
> >> So,
> >> Reviewed-by: Preeti U Murthy 
> > 
> > OK, so is the whole series reviewed now?
> 
> Yes the whole series has been reviewed.

OK, I'll queue it up for 4.2, then, thanks!


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH V3] cpuidle: Handle tick_broadcast_enter() failure gracefully

2015-05-08 Thread Rafael J. Wysocki
On Friday, May 08, 2015 01:05:32 PM Preeti U Murthy wrote:
> When a CPU has to enter an idle state where tick stops, it makes a call
> to tick_broadcast_enter(). The call will fail if this CPU is the
> broadcast CPU. Today, under such a circumstance, the arch cpuidle code
> handles this CPU.  This is not convincing because not only do we not
> know what the arch cpuidle code does, but we also do not account for the
> idle state residency time and usage of such a CPU.
> 
> This scenario can be handled better by simply choosing an idle state
> where in ticks do not stop. To accommodate this change move the setting
> of runqueue idle state from the core to the cpuidle driver, else the
> rq->idle_state will be set wrong.
> 
> Signed-off-by: Preeti U Murthy 
> ---
> Changes from V2: https://lkml.org/lkml/2015/5/7/78
> Introduce a function in cpuidle core to select an idle state where ticks do 
> not
> stop rather than going through the governors.
> 
> Changes from V1: https://lkml.org/lkml/2015/5/7/24
> Rebased on the latest linux-pm/bleeding-edge branch
> 
>  drivers/cpuidle/cpuidle.c |   45 
> +++--
>  include/linux/sched.h |   16 
>  kernel/sched/core.c   |   17 +
>  kernel/sched/fair.c   |2 +-
>  kernel/sched/idle.c   |6 --
>  kernel/sched/sched.h  |   24 
>  6 files changed, 77 insertions(+), 33 deletions(-)
> 
> diff --git a/drivers/cpuidle/cpuidle.c b/drivers/cpuidle/cpuidle.c
> index 8c24f95..d1af760 100644
> --- a/drivers/cpuidle/cpuidle.c
> +++ b/drivers/cpuidle/cpuidle.c
> @@ -21,6 +21,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>  
>  #include "cpuidle.h"
> @@ -146,6 +147,36 @@ int cpuidle_enter_freeze(struct cpuidle_driver *drv, 
> struct cpuidle_device *dev)
>   return index;
>  }
>  
> +/*
> + * find_tick_valid_state - select a state where tick does not stop
> + * @dev: cpuidle device for this cpu
> + * @drv: cpuidle driver for this cpu
> + */
> +static int find_tick_valid_state(struct cpuidle_device *dev,
> + struct cpuidle_driver *drv)
> +{
> + int i, ret = -1;
> +
> + for (i = CPUIDLE_DRIVER_STATE_START; i < drv->state_count; i++) {
> + struct cpuidle_state *s = &drv->states[i];
> + struct cpuidle_state_usage *su = &dev->states_usage[i];
> +
> + /*
> +  * We do not explicitly check for latency requirement
> +  * since it is safe to assume that only shallower idle
> +  * states will have the CPUIDLE_FLAG_TIMER_STOP bit
> +  * cleared and they will invariably meet the latency
> +  * requirement.
> +  */
> + if (s->disabled || su->disable ||
> + (s->flags & CPUIDLE_FLAG_TIMER_STOP))
> + continue;
> +
> + ret = i;
> + }
> + return ret;
> +}
> +
>  /**
>   * cpuidle_enter_state - enter the state and update stats
>   * @dev: cpuidle device for this cpu
> @@ -168,10 +199,17 @@ int cpuidle_enter_state(struct cpuidle_device *dev, 
> struct cpuidle_driver *drv,
>* CPU as a broadcast timer, this call may fail if it is not available.
>*/
>   if (broadcast && tick_broadcast_enter()) {
> - default_idle_call();
> - return -EBUSY;
> + index = find_tick_valid_state(dev, drv);

Well, the new state needs to be deeper than the old one or you may violate the
governor's choice and this doesn't guarantee that.

Also I don't quite see a reason to duplicate the find_deepest_state() 
functionality
here.

> + if (index < 0) {
> + default_idle_call();
> + return -EBUSY;
> + }
> + target_state = &drv->states[index];
>   }
>  
> + /* Take note of the planned idle state. */
> + idle_set_state(smp_processor_id(), target_state);

And I wouldn't do this either.

The behavior here is pretty much as though the driver demoted the state chosen
by the governor and we don't call idle_set_state() again in those cases.

> +
>   trace_cpu_idle_rcuidle(index, dev->cpu);
>   time_start = ktime_get();

Overall, something like the patch below (untested) should work I suppose?

---
 drivers/cpuidle/cpuidle.c |   21 ++---
 1 file changed, 14 insertions(+), 7 deletions(-)

Index: linux-pm/drivers/cpuidle/cpuidle.c
===
--- linux-pm.orig/drivers/cpuidle/cpuidle.c
+++ linux-pm/drivers/cpuidle/cpuidle.c
@@ -73,17 +73,19 @@ int cpuidle_play_dead(void)
 }
 
 static int find_deepest_state(struct cpuidle_driver *drv,
- struct cpuidle_device *dev, bool freeze)
+ struct cpuidle_device *dev, bool freeze,
+ int limit, unsigned int flags_to_avoid)
 {
unsigned int latency

Re: [PATCH V3] cpuidle: Handle tick_broadcast_enter() failure gracefully

2015-05-08 Thread Rafael J. Wysocki
On Friday, May 08, 2015 04:18:02 PM Rafael J. Wysocki wrote:
> On Friday, May 08, 2015 01:05:32 PM Preeti U Murthy wrote:
> > When a CPU has to enter an idle state where tick stops, it makes a call
> > to tick_broadcast_enter(). The call will fail if this CPU is the
> > broadcast CPU. Today, under such a circumstance, the arch cpuidle code
> > handles this CPU.  This is not convincing because not only do we not
> > know what the arch cpuidle code does, but we also do not account for the
> > idle state residency time and usage of such a CPU.
> > 
> > This scenario can be handled better by simply choosing an idle state
> > where in ticks do not stop. To accommodate this change move the setting
> > of runqueue idle state from the core to the cpuidle driver, else the
> > rq->idle_state will be set wrong.
> > 
> > Signed-off-by: Preeti U Murthy 
> > ---
> > Changes from V2: https://lkml.org/lkml/2015/5/7/78
> > Introduce a function in cpuidle core to select an idle state where ticks do 
> > not
> > stop rather than going through the governors.
> > 
> > Changes from V1: https://lkml.org/lkml/2015/5/7/24
> > Rebased on the latest linux-pm/bleeding-edge branch
> > 
> >  drivers/cpuidle/cpuidle.c |   45 
> > +++--
> >  include/linux/sched.h |   16 
> >  kernel/sched/core.c   |   17 +
> >  kernel/sched/fair.c   |2 +-
> >  kernel/sched/idle.c   |6 --
> >  kernel/sched/sched.h  |   24 
> >  6 files changed, 77 insertions(+), 33 deletions(-)
> > 
> > diff --git a/drivers/cpuidle/cpuidle.c b/drivers/cpuidle/cpuidle.c
> > index 8c24f95..d1af760 100644
> > --- a/drivers/cpuidle/cpuidle.c
> > +++ b/drivers/cpuidle/cpuidle.c
> > @@ -21,6 +21,7 @@
> >  #include 
> >  #include 
> >  #include 
> > +#include 
> >  #include 
> >  
> >  #include "cpuidle.h"
> > @@ -146,6 +147,36 @@ int cpuidle_enter_freeze(struct cpuidle_driver *drv, 
> > struct cpuidle_device *dev)
> > return index;
> >  }
> >  
> > +/*
> > + * find_tick_valid_state - select a state where tick does not stop
> > + * @dev: cpuidle device for this cpu
> > + * @drv: cpuidle driver for this cpu
> > + */
> > +static int find_tick_valid_state(struct cpuidle_device *dev,
> > +   struct cpuidle_driver *drv)
> > +{
> > +   int i, ret = -1;
> > +
> > +   for (i = CPUIDLE_DRIVER_STATE_START; i < drv->state_count; i++) {
> > +   struct cpuidle_state *s = &drv->states[i];
> > +   struct cpuidle_state_usage *su = &dev->states_usage[i];
> > +
> > +   /*
> > +* We do not explicitly check for latency requirement
> > +* since it is safe to assume that only shallower idle
> > +* states will have the CPUIDLE_FLAG_TIMER_STOP bit
> > +* cleared and they will invariably meet the latency
> > +* requirement.
> > +*/
> > +   if (s->disabled || su->disable ||
> > +   (s->flags & CPUIDLE_FLAG_TIMER_STOP))
> > +   continue;
> > +
> > +   ret = i;
> > +   }
> > +   return ret;
> > +}
> > +
> >  /**
> >   * cpuidle_enter_state - enter the state and update stats
> >   * @dev: cpuidle device for this cpu
> > @@ -168,10 +199,17 @@ int cpuidle_enter_state(struct cpuidle_device *dev, 
> > struct cpuidle_driver *drv,
> >  * CPU as a broadcast timer, this call may fail if it is not available.
> >  */
> > if (broadcast && tick_broadcast_enter()) {
> > -   default_idle_call();
> > -   return -EBUSY;
> > +   index = find_tick_valid_state(dev, drv);
> 
> Well, the new state needs to be deeper

I should have said "shallower", sorry about that.

The state chosen by the governor satisfies certain latency requirements and we
can't violate those by choosing a deeper state here.

But the patch I sent actually did the right thing. :-)

> than the old one or you may violate the governor's choice and this doesn't
> guarantee that.
> 
> Also I don't quite see a reason to duplicate the find_deepest_state() 
> functionality
> here.
> 
> > +   if (index < 0) {
> > +   default_idle_call();
> > +   return -EBUSY;
> > +   }
> > +   target_stat

Re: [PATCH V3] cpuidle: Handle tick_broadcast_enter() failure gracefully

2015-05-09 Thread Rafael J. Wysocki
On Saturday, May 09, 2015 11:19:16 AM Preeti U Murthy wrote:
> Hi Rafael,
> 
> On 05/08/2015 07:48 PM, Rafael J. Wysocki wrote:
> >> +/*
> >> + * find_tick_valid_state - select a state where tick does not stop
> >> + * @dev: cpuidle device for this cpu
> >> + * @drv: cpuidle driver for this cpu
> >> + */
> >> +static int find_tick_valid_state(struct cpuidle_device *dev,
> >> +  struct cpuidle_driver *drv)
> >> +{
> >> +  int i, ret = -1;
> >> +
> >> +  for (i = CPUIDLE_DRIVER_STATE_START; i < drv->state_count; i++) {
> >> +  struct cpuidle_state *s = &drv->states[i];
> >> +  struct cpuidle_state_usage *su = &dev->states_usage[i];
> >> +
> >> +  /*
> >> +   * We do not explicitly check for latency requirement
> >> +   * since it is safe to assume that only shallower idle
> >> +   * states will have the CPUIDLE_FLAG_TIMER_STOP bit
> >> +   * cleared and they will invariably meet the latency
> >> +   * requirement.
> >> +   */
> >> +  if (s->disabled || su->disable ||
> >> +  (s->flags & CPUIDLE_FLAG_TIMER_STOP))
> >> +  continue;
> >> +
> >> +  ret = i;
> >> +  }
> >> +  return ret;
> >> +}
> >> +
> >>  /**
> >>   * cpuidle_enter_state - enter the state and update stats
> >>   * @dev: cpuidle device for this cpu
> >> @@ -168,10 +199,17 @@ int cpuidle_enter_state(struct cpuidle_device *dev, 
> >> struct cpuidle_driver *drv,
> >> * CPU as a broadcast timer, this call may fail if it is not available.
> >> */
> >>if (broadcast && tick_broadcast_enter()) {
> >> -  default_idle_call();
> >> -  return -EBUSY;
> >> +  index = find_tick_valid_state(dev, drv);
> > 
> > Well, the new state needs to be deeper than the old one or you may violate 
> > the
> > governor's choice and this doesn't guarantee that.
> 
> The comment above in find_tick_valid_state() explains why we are bound
> to choose a shallow idle state. I think its safe to assume that any
> state deeper than this one, would have the CPUIDLE_FLAG_TIMER_STOP flag
> set and hence would be skipped.
> 
> Your patch relies on the assumption that the idle states are arranged in
> the increasing order of exit_latency/in the order of shallow to deep.
> This is not guaranteed, is it?

No, it isn't, which is a good point.  There's no reason to rely on that
assumption, so appended is an updated version of the patch using a latency
limit instead of an index limit.

> 
> > 
> > Also I don't quite see a reason to duplicate the find_deepest_state() 
> > functionality
> > here.
> 
> Agreed. We could club them like in your patch.
> 
> > 
> >> +  if (index < 0) {
> >> +  default_idle_call();
> >> +  return -EBUSY;
> >> +  }
> >> +  target_state = &drv->states[index];
> >>}
> >>  
> >> +  /* Take note of the planned idle state. */
> >> +  idle_set_state(smp_processor_id(), target_state);
> > 
> > And I wouldn't do this either.
> > 
> > The behavior here is pretty much as though the driver demoted the state 
> > chosen
> > by the governor and we don't call idle_set_state() again in those cases.
> 
> Why is this wrong?

Because it is inconsistent, but let me reply to this in a separate message.

Anyway, it is a different problem and should be addressed by a separate
patch IMO.


___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH V3] cpuidle: Handle tick_broadcast_enter() failure gracefully

2015-05-09 Thread Rafael J. Wysocki
On Saturday, May 09, 2015 08:46:20 PM Rafael J. Wysocki wrote:
> On Saturday, May 09, 2015 11:19:16 AM Preeti U Murthy wrote:
> > Hi Rafael,
> > 
> > On 05/08/2015 07:48 PM, Rafael J. Wysocki wrote:
> > >> +/*
> > >> + * find_tick_valid_state - select a state where tick does not stop
> > >> + * @dev: cpuidle device for this cpu
> > >> + * @drv: cpuidle driver for this cpu
> > >> + */
> > >> +static int find_tick_valid_state(struct cpuidle_device *dev,
> > >> +struct cpuidle_driver *drv)
> > >> +{
> > >> +int i, ret = -1;
> > >> +
> > >> +for (i = CPUIDLE_DRIVER_STATE_START; i < drv->state_count; i++) 
> > >> {
> > >> +struct cpuidle_state *s = &drv->states[i];
> > >> +struct cpuidle_state_usage *su = &dev->states_usage[i];
> > >> +
> > >> +/*
> > >> + * We do not explicitly check for latency requirement
> > >> + * since it is safe to assume that only shallower idle
> > >> + * states will have the CPUIDLE_FLAG_TIMER_STOP bit
> > >> + * cleared and they will invariably meet the latency
> > >> + * requirement.
> > >> + */
> > >> +if (s->disabled || su->disable ||
> > >> +(s->flags & CPUIDLE_FLAG_TIMER_STOP))
> > >> +continue;
> > >> +
> > >> +ret = i;
> > >> +}
> > >> +return ret;
> > >> +}
> > >> +
> > >>  /**
> > >>   * cpuidle_enter_state - enter the state and update stats
> > >>   * @dev: cpuidle device for this cpu
> > >> @@ -168,10 +199,17 @@ int cpuidle_enter_state(struct cpuidle_device 
> > >> *dev, struct cpuidle_driver *drv,
> > >>   * CPU as a broadcast timer, this call may fail if it is not 
> > >> available.
> > >>   */
> > >>  if (broadcast && tick_broadcast_enter()) {
> > >> -default_idle_call();
> > >> -return -EBUSY;
> > >> +index = find_tick_valid_state(dev, drv);
> > > 
> > > Well, the new state needs to be deeper than the old one or you may 
> > > violate the
> > > governor's choice and this doesn't guarantee that.
> > 
> > The comment above in find_tick_valid_state() explains why we are bound
> > to choose a shallow idle state. I think its safe to assume that any
> > state deeper than this one, would have the CPUIDLE_FLAG_TIMER_STOP flag
> > set and hence would be skipped.
> > 
> > Your patch relies on the assumption that the idle states are arranged in
> > the increasing order of exit_latency/in the order of shallow to deep.
> > This is not guaranteed, is it?
> 
> No, it isn't, which is a good point.  There's no reason to rely on that
> assumption, so appended is an updated version of the patch using a latency
> limit instead of an index limit.

And the patch *is* actually appended this time, sorry.


---
 drivers/cpuidle/cpuidle.c |   20 +++-
 1 file changed, 15 insertions(+), 5 deletions(-)

Index: linux-pm/drivers/cpuidle/cpuidle.c
===
--- linux-pm.orig/drivers/cpuidle/cpuidle.c
+++ linux-pm/drivers/cpuidle/cpuidle.c
@@ -73,7 +73,10 @@ int cpuidle_play_dead(void)
 }
 
 static int find_deepest_state(struct cpuidle_driver *drv,
- struct cpuidle_device *dev, bool freeze)
+ struct cpuidle_device *dev,
+ unsigned int max_latency,
+ unsigned int forbidden_flags,
+ bool freeze)
 {
unsigned int latency_req = 0;
int i, ret = freeze ? -1 : CPUIDLE_DRIVER_STATE_START - 1;
@@ -83,6 +86,8 @@ static int find_deepest_state(struct cpu
struct cpuidle_state_usage *su = &dev->states_usage[i];
 
if (s->disabled || su->disable || s->exit_latency <= latency_req
+   || s->exit_latency > max_latency
+   || (s->flags & forbidden_flags)
|| (freeze && !s->enter_freeze))
continue;
 
@@ -100,7 +105,7 @@ static int find_deepest_state(struct cpu
 int cpuidle_find_deepest_state(struct cpuidle_dri

Re: [PATCH V3] cpuidle: Handle tick_broadcast_enter() failure gracefully

2015-05-09 Thread Rafael J. Wysocki
On Saturday, May 09, 2015 11:19:16 AM Preeti U Murthy wrote:
> Hi Rafael,
> 
> On 05/08/2015 07:48 PM, Rafael J. Wysocki wrote:

[cut]

> >>  
> >> +  /* Take note of the planned idle state. */
> >> +  idle_set_state(smp_processor_id(), target_state);
> > 
> > And I wouldn't do this either.
> > 
> > The behavior here is pretty much as though the driver demoted the state 
> > chosen
> > by the governor and we don't call idle_set_state() again in those cases.
> 
> Why is this wrong?

It is not "wrong", but incomplete, because demotions done by the cpuidle driver
should also be taken into account in the same way.

But I'm seeing that the recent patch of mine that made cpuidle_enter_state()
call default_idle_call() was a mistake, because it might confuse 
find_idlest_cpu()
significantly as to what state the CPU is in.  I'll drop that one for now.


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH V3] cpuidle: Handle tick_broadcast_enter() failure gracefully

2015-05-09 Thread Rafael J. Wysocki
On Saturday, May 09, 2015 10:11:41 PM Rafael J. Wysocki wrote:
> On Saturday, May 09, 2015 11:19:16 AM Preeti U Murthy wrote:
> > Hi Rafael,
> > 
> > On 05/08/2015 07:48 PM, Rafael J. Wysocki wrote:
> 
> [cut]
> 
> > >>  
> > >> +/* Take note of the planned idle state. */
> > >> +idle_set_state(smp_processor_id(), target_state);
> > > 
> > > And I wouldn't do this either.
> > > 
> > > The behavior here is pretty much as though the driver demoted the state 
> > > chosen
> > > by the governor and we don't call idle_set_state() again in those cases.
> > 
> > Why is this wrong?
> 
> It is not "wrong", but incomplete, because demotions done by the cpuidle 
> driver
> should also be taken into account in the same way.
> 
> But I'm seeing that the recent patch of mine that made cpuidle_enter_state()
> call default_idle_call() was a mistake, because it might confuse 
> find_idlest_cpu()
> significantly as to what state the CPU is in.  I'll drop that one for now.

OK, done.

So after I've dropped it I think we need to do three things:
(1) Move the idle_set_state() calls to cpuidle_enter_state().
(2) Make cpuidle_enter_state() call default_idle_call() again, but this time
do that *before* it has called idle_set_state() for target_state.
(3) Introduce demotion as per my last patch.

Let me cut patches for that.


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

[PATCH 3/3] cpuidle: Select a different state on tick_broadcast_enter() failures

2015-05-09 Thread Rafael J. Wysocki
From: Rafael J. Wysocki 

If tick_broadcast_enter() fails in cpuidle_enter_state(),
try to find another idle state to enter instead of invoking
default_idle_call() immediately and returning -EBUSY which
should increase the chances of saving some energy in those
cases.

Signed-off-by: Rafael J. Wysocki 
---
 drivers/cpuidle/cpuidle.c |   20 +++-
 1 file changed, 15 insertions(+), 5 deletions(-)

Index: linux-pm/drivers/cpuidle/cpuidle.c
===
--- linux-pm.orig/drivers/cpuidle/cpuidle.c
+++ linux-pm/drivers/cpuidle/cpuidle.c
@@ -73,7 +73,10 @@ int cpuidle_play_dead(void)
 }
 
 static int find_deepest_state(struct cpuidle_driver *drv,
- struct cpuidle_device *dev, bool freeze)
+ struct cpuidle_device *dev,
+ unsigned int max_latency,
+ unsigned int forbidden_flags,
+ bool freeze)
 {
unsigned int latency_req = 0;
int i, ret = freeze ? -1 : CPUIDLE_DRIVER_STATE_START - 1;
@@ -83,6 +86,8 @@ static int find_deepest_state(struct cpu
struct cpuidle_state_usage *su = &dev->states_usage[i];
 
if (s->disabled || su->disable || s->exit_latency <= latency_req
+   || s->exit_latency > max_latency
+   || (s->flags & forbidden_flags)
|| (freeze && !s->enter_freeze))
continue;
 
@@ -100,7 +105,7 @@ static int find_deepest_state(struct cpu
 int cpuidle_find_deepest_state(struct cpuidle_driver *drv,
   struct cpuidle_device *dev)
 {
-   return find_deepest_state(drv, dev, false);
+   return find_deepest_state(drv, dev, UINT_MAX, 0, false);
 }
 
 static void enter_freeze_proper(struct cpuidle_driver *drv,
@@ -139,7 +144,7 @@ int cpuidle_enter_freeze(struct cpuidle_
 * that interrupts won't be enabled when it exits and allows the tick to
 * be frozen safely.
 */
-   index = find_deepest_state(drv, dev, true);
+   index = find_deepest_state(drv, dev, UINT_MAX, 0, true);
if (index >= 0)
enter_freeze_proper(drv, dev, index);
 
@@ -168,8 +173,13 @@ int cpuidle_enter_state(struct cpuidle_d
 * CPU as a broadcast timer, this call may fail if it is not available.
 */
if (broadcast && tick_broadcast_enter()) {
-   default_idle_call();
-   return -EBUSY;
+   index = find_deepest_state(drv, dev, target_state->exit_latency,
+  CPUIDLE_FLAG_TIMER_STOP, false);
+   if (index < 0) {
+   default_idle_call();
+   return -EBUSY;
+   }
+   target_state = &drv->states[index];
}
 
/* Take note of the planned idle state. */
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

[PATCH 1/3] sched / idle: Call idle_set_state() from cpuidle_enter_state()

2015-05-09 Thread Rafael J. Wysocki
From: Rafael J. Wysocki 

Introduce a wrapper function around idle_set_state() called
sched_idle_set_state() that will pass this_rq() to it as the
first argument and make cpuidle_enter_state() call the new
function before and after entering the target state.

At the same time, remove direct invocations of idle_set_state()
from call_cpuidle().

This will allow the invocation of default_idle_call() to be
moved from call_cpuidle() to cpuidle_enter_state() safely
and call_cpuidle() to be simplified a bit as a result.

Signed-off-by: Rafael J. Wysocki 
---
 drivers/cpuidle/cpuidle.c |6 ++
 include/linux/cpuidle.h   |3 +++
 kernel/sched/idle.c   |   15 +--
 3 files changed, 18 insertions(+), 6 deletions(-)

Index: linux-pm/kernel/sched/idle.c
===
--- linux-pm.orig/kernel/sched/idle.c
+++ linux-pm/kernel/sched/idle.c
@@ -15,6 +15,15 @@
 
 #include "sched.h"
 
+/**
+ * sched_idle_set_state - Record idle state for the current CPU.
+ * @idle_state: State to record.
+ */
+void sched_idle_set_state(struct cpuidle_state *idle_state)
+{
+   idle_set_state(this_rq(), idle_state);
+}
+
 static int __read_mostly cpu_idle_force_poll;
 
 void cpu_idle_poll_ctrl(bool enable)
@@ -100,9 +109,6 @@ static int call_cpuidle(struct cpuidle_d
return -EBUSY;
}
 
-   /* Take note of the planned idle state. */
-   idle_set_state(this_rq(), &drv->states[next_state]);
-
/*
 * Enter the idle state previously returned by the governor decision.
 * This function will block until an interrupt occurs and will take
@@ -110,9 +116,6 @@ static int call_cpuidle(struct cpuidle_d
 */
entered_state = cpuidle_enter(drv, dev, next_state);
 
-   /* The cpu is no longer idle or about to enter idle. */
-   idle_set_state(this_rq(), NULL);
-
if (entered_state == -EBUSY)
default_idle_call();
 
Index: linux-pm/drivers/cpuidle/cpuidle.c
===
--- linux-pm.orig/drivers/cpuidle/cpuidle.c
+++ linux-pm/drivers/cpuidle/cpuidle.c
@@ -170,6 +170,9 @@ int cpuidle_enter_state(struct cpuidle_d
if (broadcast && tick_broadcast_enter())
return -EBUSY;
 
+   /* Take note of the planned idle state. */
+   sched_idle_set_state(target_state);
+
trace_cpu_idle_rcuidle(index, dev->cpu);
time_start = ktime_get();
 
@@ -178,6 +181,9 @@ int cpuidle_enter_state(struct cpuidle_d
time_end = ktime_get();
trace_cpu_idle_rcuidle(PWR_EVENT_EXIT, dev->cpu);
 
+   /* The cpu is no longer idle or about to enter idle. */
+   sched_idle_set_state(NULL);
+
if (broadcast) {
if (WARN_ON_ONCE(!irqs_disabled()))
local_irq_disable();
Index: linux-pm/include/linux/cpuidle.h
===
--- linux-pm.orig/include/linux/cpuidle.h
+++ linux-pm/include/linux/cpuidle.h
@@ -200,6 +200,9 @@ static inline struct cpuidle_driver *cpu
struct cpuidle_device *dev) {return NULL; }
 #endif
 
+/* kernel/sched/idle.c */
+extern void sched_idle_set_state(struct cpuidle_state *idle_state);
+
 #ifdef CONFIG_ARCH_NEEDS_CPU_IDLE_COUPLED
 void cpuidle_coupled_parallel_barrier(struct cpuidle_device *dev, atomic_t *a);
 #else
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

[PATCH 2/3] sched / idle: Call default_idle_call() from cpuidle_enter_state()

2015-05-09 Thread Rafael J. Wysocki
From: Rafael J. Wysocki 

The check of the cpuidle_enter() return value against -EBUSY
made in call_cpuidle() will not be necessary any more if
cpuidle_enter_state() calls default_idle_call() directly when it
is about to return -EBUSY, so make that happen and eliminate the
check.

Signed-off-by: Rafael J. Wysocki 
---
 drivers/cpuidle/cpuidle.c |4 +++-
 include/linux/cpuidle.h   |1 +
 kernel/sched/idle.c   |   20 +++-
 3 files changed, 11 insertions(+), 14 deletions(-)

Index: linux-pm/drivers/cpuidle/cpuidle.c
===
--- linux-pm.orig/drivers/cpuidle/cpuidle.c
+++ linux-pm/drivers/cpuidle/cpuidle.c
@@ -167,8 +167,10 @@ int cpuidle_enter_state(struct cpuidle_d
 * local timer will be shut down.  If a local timer is used from another
 * CPU as a broadcast timer, this call may fail if it is not available.
 */
-   if (broadcast && tick_broadcast_enter())
+   if (broadcast && tick_broadcast_enter()) {
+   default_idle_call();
return -EBUSY;
+   }
 
/* Take note of the planned idle state. */
sched_idle_set_state(target_state);
Index: linux-pm/kernel/sched/idle.c
===
--- linux-pm.orig/kernel/sched/idle.c
+++ linux-pm/kernel/sched/idle.c
@@ -76,12 +76,13 @@ void __weak arch_cpu_idle(void)
local_irq_enable();
 }
 
-static void default_idle_call(void)
+/**
+ * default_idle_call - Default CPU idle routine.
+ *
+ * To use when the cpuidle framework cannot be used.
+ */
+void default_idle_call(void)
 {
-   /*
-* We can't use the cpuidle framework, let's use the default idle
-* routine.
-*/
if (current_clr_polling_and_test())
local_irq_enable();
else
@@ -91,8 +92,6 @@ static void default_idle_call(void)
 static int call_cpuidle(struct cpuidle_driver *drv, struct cpuidle_device *dev,
  int next_state)
 {
-   int entered_state;
-
/* Fall back to the default arch idle method on errors. */
if (next_state < 0) {
default_idle_call();
@@ -114,12 +113,7 @@ static int call_cpuidle(struct cpuidle_d
 * This function will block until an interrupt occurs and will take
 * care of re-enabling the local interrupts
 */
-   entered_state = cpuidle_enter(drv, dev, next_state);
-
-   if (entered_state == -EBUSY)
-   default_idle_call();
-
-   return entered_state;
+   return cpuidle_enter(drv, dev, next_state);
 }
 
 /**
Index: linux-pm/include/linux/cpuidle.h
===
--- linux-pm.orig/include/linux/cpuidle.h
+++ linux-pm/include/linux/cpuidle.h
@@ -202,6 +202,7 @@ static inline struct cpuidle_driver *cpu
 
 /* kernel/sched/idle.c */
 extern void sched_idle_set_state(struct cpuidle_state *idle_state);
+extern void default_idle_call(void);
 
 #ifdef CONFIG_ARCH_NEEDS_CPU_IDLE_COUPLED
 void cpuidle_coupled_parallel_barrier(struct cpuidle_device *dev, atomic_t *a);

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

[PATCH 0/3] cpuidle: updates related to tick_broadcast_enter() failures

2015-05-09 Thread Rafael J. Wysocki
On Saturday, May 09, 2015 10:33:05 PM Rafael J. Wysocki wrote:
> On Saturday, May 09, 2015 10:11:41 PM Rafael J. Wysocki wrote:
> > On Saturday, May 09, 2015 11:19:16 AM Preeti U Murthy wrote:
> > > Hi Rafael,
> > > 
> > > On 05/08/2015 07:48 PM, Rafael J. Wysocki wrote:
> > 
> > [cut]
> > 
> > > >>  
> > > >> +  /* Take note of the planned idle state. */
> > > >> +  idle_set_state(smp_processor_id(), target_state);
> > > > 
> > > > And I wouldn't do this either.
> > > > 
> > > > The behavior here is pretty much as though the driver demoted the state 
> > > > chosen
> > > > by the governor and we don't call idle_set_state() again in those cases.
> > > 
> > > Why is this wrong?
> > 
> > It is not "wrong", but incomplete, because demotions done by the cpuidle 
> > driver
> > should also be taken into account in the same way.
> > 
> > But I'm seeing that the recent patch of mine that made cpuidle_enter_state()
> > call default_idle_call() was a mistake, because it might confuse 
> > find_idlest_cpu()
> > significantly as to what state the CPU is in.  I'll drop that one for now.
> 
> OK, done.
> 
> So after I've dropped it I think we need to do three things:
> (1) Move the idle_set_state() calls to cpuidle_enter_state().
> (2) Make cpuidle_enter_state() call default_idle_call() again, but this time
> do that *before* it has called idle_set_state() for target_state.
> (3) Introduce demotion as per my last patch.
> 
> Let me cut patches for that.

Done as per the above and the patches follow in replies to this messge.

All on top of the current linux-next branch of the linux-pm.git tree.


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH 0/3] cpuidle: updates related to tick_broadcast_enter() failures

2015-05-11 Thread Rafael J. Wysocki
On Monday, May 11, 2015 10:51:02 AM Preeti U Murthy wrote:
> On 05/10/2015 04:45 AM, Rafael J. Wysocki wrote:
> > On Saturday, May 09, 2015 10:33:05 PM Rafael J. Wysocki wrote:
> >> On Saturday, May 09, 2015 10:11:41 PM Rafael J. Wysocki wrote:
> >>> On Saturday, May 09, 2015 11:19:16 AM Preeti U Murthy wrote:
> >>>> Hi Rafael,
> >>>>
> >>>> On 05/08/2015 07:48 PM, Rafael J. Wysocki wrote:
> >>>
> >>> [cut]
> >>>
> >>>>>>  
> >>>>>> +  /* Take note of the planned idle state. */
> >>>>>> +  idle_set_state(smp_processor_id(), target_state);
> >>>>>
> >>>>> And I wouldn't do this either.
> >>>>>
> >>>>> The behavior here is pretty much as though the driver demoted the state 
> >>>>> chosen
> >>>>> by the governor and we don't call idle_set_state() again in those cases.
> >>>>
> >>>> Why is this wrong?
> >>>
> >>> It is not "wrong", but incomplete, because demotions done by the cpuidle 
> >>> driver
> >>> should also be taken into account in the same way.
> >>>
> >>> But I'm seeing that the recent patch of mine that made 
> >>> cpuidle_enter_state()
> >>> call default_idle_call() was a mistake, because it might confuse 
> >>> find_idlest_cpu()
> >>> significantly as to what state the CPU is in.  I'll drop that one for now.
> >>
> >> OK, done.
> >>
> >> So after I've dropped it I think we need to do three things:
> >> (1) Move the idle_set_state() calls to cpuidle_enter_state().
> >> (2) Make cpuidle_enter_state() call default_idle_call() again, but this 
> >> time
> >> do that *before* it has called idle_set_state() for target_state.
> >> (3) Introduce demotion as per my last patch.
> >>
> >> Let me cut patches for that.
> > 
> > Done as per the above and the patches follow in replies to this messge.
> > 
> > All on top of the current linux-next branch of the linux-pm.git tree.
> 
> The patches look good. Based and tested these patches on top of
> linux-pm/linux-next (They are not yet in the branch as far as I can see.)

They aren't in the tree yet.  I'll put them in there later today.

> All patches in this series
> Reviewed and Tested-by: Preeti U Murthy 

Thanks!

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH 0/3] cpuidle: updates related to tick_broadcast_enter() failures

2015-05-11 Thread Rafael J. Wysocki
On Monday, May 11, 2015 04:13:37 PM Sudeep Holla wrote:
> 
> On 10/05/15 00:15, Rafael J. Wysocki wrote:
> > On Saturday, May 09, 2015 10:33:05 PM Rafael J. Wysocki wrote:
> >> On Saturday, May 09, 2015 10:11:41 PM Rafael J. Wysocki wrote:
> >>> On Saturday, May 09, 2015 11:19:16 AM Preeti U Murthy wrote:
> >>>> Hi Rafael,
> >>>>
> >>>> On 05/08/2015 07:48 PM, Rafael J. Wysocki wrote:
> >>>
> >>> [cut]
> >>>
> >>>>>>
> >>>>>> +  /* Take note of the planned idle state. */
> >>>>>> +  idle_set_state(smp_processor_id(), target_state);
> >>>>>
> >>>>> And I wouldn't do this either.
> >>>>>
> >>>>> The behavior here is pretty much as though the driver demoted the state 
> >>>>> chosen
> >>>>> by the governor and we don't call idle_set_state() again in those cases.
> >>>>
> >>>> Why is this wrong?
> >>>
> >>> It is not "wrong", but incomplete, because demotions done by the cpuidle 
> >>> driver
> >>> should also be taken into account in the same way.
> >>>
> >>> But I'm seeing that the recent patch of mine that made 
> >>> cpuidle_enter_state()
> >>> call default_idle_call() was a mistake, because it might confuse 
> >>> find_idlest_cpu()
> >>> significantly as to what state the CPU is in.  I'll drop that one for now.
> >>
> >> OK, done.
> >>
> >> So after I've dropped it I think we need to do three things:
> >> (1) Move the idle_set_state() calls to cpuidle_enter_state().
> >> (2) Make cpuidle_enter_state() call default_idle_call() again, but this 
> >> time
> >>  do that *before* it has called idle_set_state() for target_state.
> >> (3) Introduce demotion as per my last patch.
> >>
> >> Let me cut patches for that.
> >
> > Done as per the above and the patches follow in replies to this messge.
> >
> > All on top of the current linux-next branch of the linux-pm.git tree.
> >
> 
> Tested on ARM Vexpress platforms with one of the CPU in broadcast mode
> and also with broadcast timer. So, you can add:
> 
> Tested-by: Sudeep Holla 

Thanks!

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH 0/3] cpuidle: updates related to tick_broadcast_enter() failures

2015-05-11 Thread Rafael J. Wysocki
On Monday, May 11, 2015 07:40:41 PM Daniel Lezcano wrote:
> On 05/10/2015 01:15 AM, Rafael J. Wysocki wrote:
> > On Saturday, May 09, 2015 10:33:05 PM Rafael J. Wysocki wrote:
> >> On Saturday, May 09, 2015 10:11:41 PM Rafael J. Wysocki wrote:
> >>> On Saturday, May 09, 2015 11:19:16 AM Preeti U Murthy wrote:
> >>>> Hi Rafael,
> >>>>
> >>>> On 05/08/2015 07:48 PM, Rafael J. Wysocki wrote:
> >>>
> >>> [cut]
> >>>
> >>>>>>
> >>>>>> +  /* Take note of the planned idle state. */
> >>>>>> +  idle_set_state(smp_processor_id(), target_state);
> >>>>>
> >>>>> And I wouldn't do this either.
> >>>>>
> >>>>> The behavior here is pretty much as though the driver demoted the state 
> >>>>> chosen
> >>>>> by the governor and we don't call idle_set_state() again in those cases.
> >>>>
> >>>> Why is this wrong?
> >>>
> >>> It is not "wrong", but incomplete, because demotions done by the cpuidle 
> >>> driver
> >>> should also be taken into account in the same way.
> >>>
> >>> But I'm seeing that the recent patch of mine that made 
> >>> cpuidle_enter_state()
> >>> call default_idle_call() was a mistake, because it might confuse 
> >>> find_idlest_cpu()
> >>> significantly as to what state the CPU is in.  I'll drop that one for now.
> >>
> >> OK, done.
> >>
> >> So after I've dropped it I think we need to do three things:
> >> (1) Move the idle_set_state() calls to cpuidle_enter_state().
> >> (2) Make cpuidle_enter_state() call default_idle_call() again, but this 
> >> time
> >>  do that *before* it has called idle_set_state() for target_state.
> >> (3) Introduce demotion as per my last patch.
> >>
> >> Let me cut patches for that.
> >
> > Done as per the above and the patches follow in replies to this messge.
> >
> > All on top of the current linux-next branch of the linux-pm.git tree.
> 
> IMO the resulting code is more and more confusing.

Why is it confusing?

What part of it is confusing?

Patches [1-2/3] simply replace https://patchwork.kernel.org/patch/6326761/
and I'm not sure why that would be confusing.

Patch [3/3] simply causes cpuidle_enter_state() to pick up a more suitable
state if tick_broadcast_enter() fails instead of returning an error code
in that case.  What exactly is confusing in that?

> Except I miss something, the tick_broadcast_enter can fail only if the 
> local timer of the current cpu is used as a broadcast timer (which is 
> the case today for PPC only).

well, why does this matter?

> The correct fix would be to tie this local timer with the cpu power 
> domain and disable the idle state powering down this domain like it was 
> done for the renesas cpuidle driver.
> 
> IOW, the cpu power domain is in use (because of its local timer), so we 
> shouldn't shut it down.
> 
> No ?

Sorry, I'm not sure what you're talking about.

The problem at hand is that tick_broadcast_enter() can fail and we need to
handle that.  If we can prevent it from ever failing, that would be awesome,
but quite honestly I don't see how to do that ATM.

> I am aware this is not easily fixable because the genpd framework is 
> incomplete and has some restrictions but I believe it is worth to have a 
> discussion. Add Kevin and Ulf in Cc.

So I'm going to queue up these patches for 4.2 and we can have a discussion
just fine regardless.


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH 0/3] cpuidle: updates related to tick_broadcast_enter() failures

2015-05-12 Thread Rafael J. Wysocki
On Tuesday, May 12, 2015 10:41:35 AM Daniel Lezcano wrote:
> On 05/12/2015 01:31 AM, Rafael J. Wysocki wrote:
> > On Monday, May 11, 2015 07:40:41 PM Daniel Lezcano wrote:
> >> On 05/10/2015 01:15 AM, Rafael J. Wysocki wrote:
> >>> On Saturday, May 09, 2015 10:33:05 PM Rafael J. Wysocki wrote:
> >>>> On Saturday, May 09, 2015 10:11:41 PM Rafael J. Wysocki wrote:
> >>>>> On Saturday, May 09, 2015 11:19:16 AM Preeti U Murthy wrote:
> >>>>>> Hi Rafael,
> >>>>>>
> >>>>>> On 05/08/2015 07:48 PM, Rafael J. Wysocki wrote:
> >>>>>
> >>>>> [cut]
> >>>>>
> >>>>>>>>
> >>>>>>>> +/* Take note of the planned idle state. */
> >>>>>>>> +idle_set_state(smp_processor_id(), target_state);
> >>>>>>>
> >>>>>>> And I wouldn't do this either.
> >>>>>>>
> >>>>>>> The behavior here is pretty much as though the driver demoted the 
> >>>>>>> state chosen
> >>>>>>> by the governor and we don't call idle_set_state() again in those 
> >>>>>>> cases.
> >>>>>>
> >>>>>> Why is this wrong?
> >>>>>
> >>>>> It is not "wrong", but incomplete, because demotions done by the 
> >>>>> cpuidle driver
> >>>>> should also be taken into account in the same way.
> >>>>>
> >>>>> But I'm seeing that the recent patch of mine that made 
> >>>>> cpuidle_enter_state()
> >>>>> call default_idle_call() was a mistake, because it might confuse 
> >>>>> find_idlest_cpu()
> >>>>> significantly as to what state the CPU is in.  I'll drop that one for 
> >>>>> now.
> >>>>
> >>>> OK, done.
> >>>>
> >>>> So after I've dropped it I think we need to do three things:
> >>>> (1) Move the idle_set_state() calls to cpuidle_enter_state().
> >>>> (2) Make cpuidle_enter_state() call default_idle_call() again, but this 
> >>>> time
> >>>>   do that *before* it has called idle_set_state() for target_state.
> >>>> (3) Introduce demotion as per my last patch.
> >>>>
> >>>> Let me cut patches for that.
> >>>
> >>> Done as per the above and the patches follow in replies to this messge.
> >>>
> >>> All on top of the current linux-next branch of the linux-pm.git tree.
> >>
> >> IMO the resulting code is more and more confusing.
> >
> > Why is it confusing?
> >
> > What part of it is confusing?
> >
> > Patches [1-2/3] simply replace https://patchwork.kernel.org/patch/6326761/
> > and I'm not sure why that would be confusing.
> >
> > Patch [3/3] simply causes cpuidle_enter_state() to pick up a more suitable
> > state if tick_broadcast_enter() fails instead of returning an error code
> > in that case.  What exactly is confusing in that?
> >
> >> Except I miss something, the tick_broadcast_enter can fail only if the
> >> local timer of the current cpu is used as a broadcast timer (which is
> >> the case today for PPC only).
> >
> > well, why does this matter?
> >
> >> The correct fix would be to tie this local timer with the cpu power
> >> domain and disable the idle state powering down this domain like it was
> >> done for the renesas cpuidle driver.
> >>
> >> IOW, the cpu power domain is in use (because of its local timer), so we
> >> shouldn't shut it down.
> >>
> >> No ?
> >
> > Sorry, I'm not sure what you're talking about.
> >
> > The problem at hand is that tick_broadcast_enter() can fail and we need to
> > handle that.  If we can prevent it from ever failing, that would be awesome,
> > but quite honestly I don't see how to do that ATM.
> 
> Ok, sorry. Let me clarify.
> 
> You did a mechanism two years ago with pm_genpd_attach_cpuidle and 
> power_on/off. That disables a cpuidle state when a power domain is in use.
> 
> The idea I was proposing is to reuse this approach.
> 
> The logic is:
> 
> "The local timer is in use, this idle state power downs this timer, then 
> disable it".

I'm not sure it's about powering down.  Stopping rather (which may 

Re: [PATCH 0/3] cpuidle: updates related to tick_broadcast_enter() failures

2015-05-13 Thread Rafael J. Wysocki
On Wednesday, May 13, 2015 03:59:55 PM Kevin Hilman wrote:
> "Rafael J. Wysocki"  writes:
> 
> [...]
> 
> > Second, quite honestly, I don't see a connection to genpd here.
> 
> The connection with genpd is because the *reason* the timer was
> shutdown/stopped is because it shares power with the CPU, which is why
> the timer stops when the CPU hits ceratin low power states.  IOW, it's
> in the same power domain as the CPU.

Well, what if you don't have genpd on that system?  Is the problem at hand not
relevant then magically?

Rafael

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH 0/3] cpuidle: updates related to tick_broadcast_enter() failures

2015-05-13 Thread Rafael J. Wysocki
On Wednesday, May 13, 2015 05:13:27 PM Kevin Hilman wrote:
> On Wed, May 13, 2015 at 5:16 PM, Rafael J. Wysocki  wrote:
> > On Wednesday, May 13, 2015 03:59:55 PM Kevin Hilman wrote:
> >> "Rafael J. Wysocki"  writes:
> >>
> >> [...]
> >>
> >> > Second, quite honestly, I don't see a connection to genpd here.
> >>
> >> The connection with genpd is because the *reason* the timer was
> >> shutdown/stopped is because it shares power with the CPU, which is why
> >> the timer stops when the CPU hits ceratin low power states.  IOW, it's
> >> in the same power domain as the CPU.
> >
> > Well, what if you don't have genpd on that system?  Is the problem at hand 
> > not
> > relevant then magically?
> 
> Well, if you're not using genpd to model hardware power domain
> dependencies, then yes you'll definitely need a different solution.
> 
> And, as we discussed on IRC.  If you only care about timers, and genpd
> is not in use, then $SUBJECT series is a fine approach, and I have no
> objections.  But for SoCs where there are several other things that
> share power with CPU, we need a more generic, genpd based solution,
> which it seems we're in agreement on.  And since the two approaches
> are not mutually exclusive, then I have real objections to applying
> this series.

I guess a "no" is missing in the last sentence. ;-)

> Acked-by: Kevin Hilman 

Thanks!

Rafael

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH 2/2] PCI: pciehp: Convert pciehp to be builtin only, not modular

2015-05-27 Thread Rafael J. Wysocki
On Wednesday, May 27, 2015 02:31:49 PM Bjorn Helgaas wrote:
> [updated Rafael's email addr; not sure if sisk.pl still works or not]
> 
> On Wed, May 27, 2015 at 11:31:21AM -0700, Yinghai Lu wrote:
> > On Fri, Jul 26, 2013 at 5:43 AM, Yinghai Lu  wrote:
> > > On Thu, Jul 25, 2013 at 10:57 AM, Bjorn Helgaas  
> > > wrote:
> > >> Convert pciehp to be builtin only, with no module option.
> > >>
> > >> Signed-off-by: Bjorn Helgaas 
> > >> Acked-by: Rafael J. Wysocki 
> > >> ---
> > >>  drivers/pci/pcie/Kconfig |5 +
> > >>  1 file changed, 1 insertion(+), 4 deletions(-)
> > >>
> > >> diff --git a/drivers/pci/pcie/Kconfig b/drivers/pci/pcie/Kconfig
> > >> index 569f82f..3b94cfc 100644
> > >> --- a/drivers/pci/pcie/Kconfig
> > >> +++ b/drivers/pci/pcie/Kconfig
> > >> @@ -14,15 +14,12 @@ config PCIEPORTBUS
> > >>  # Include service Kconfig here
> > >>  #
> > >>  config HOTPLUG_PCI_PCIE
> > >> -   tristate "PCI Express Hotplug driver"
> > >> +   bool "PCI Express Hotplug driver"
> > >> depends on HOTPLUG_PCI && PCIEPORTBUS
> > >> help
> > >>   Say Y here if you have a motherboard that supports PCI Express 
> > >> Native
> > >>   Hotplug
> > >>
> > >> - To compile this driver as a module, choose M here: the
> > >> - module will be called pciehp.
> > >> -
> > >>   When in doubt, say N.
> > >>
> > >>  source "drivers/pci/pcie/aer/Kconfig"
> > >>
> > >
> > > Acked-by: Yinghai Lu 
> > 
> > Hi Bjorn,
> > 
> > Looks like we lose the option to disable pciehp after we make it as 
> > built-in.
> > 
> > Before acpiphp and pciehp could be compiled as modules, and user could
> > blacklist to disable them.
> > 
> > Now they are all built-in, but only acpiphp has acpiphp.disable to
> > disable acpiphp.
> > we don't have pciehp.disable yet.
> > 
> > Do you think if we should add pciehp.disable ?
> 
> Did you find a situation that would require pciehp.disable?  I hesitate to
> add it because if there's a problem and pciehp.disable fixes it, people
> tend to think the solution is "boot with pciehp.disable."  But the *real*
> solution is to fix whatever is broken in the kernel, so no parameter is
> needed at all.

Agreed.

For debug you can always use pcie_ports=compat and that will disable
pciehp too.

> > BTW we don't have any description for acpiphp.disable anywhere.
> 
> True.  I'll give you my opinion; Rafael may have a different one.
> 
> I don't know whether it's a good idea to add a description or not, for the
> same reason as above.  I think we should actively discourage people from
> using kernel parameters, except for debugging purposes and for some legacy
> issues where there's no way for the kernel to figure things out by itself.
> But in my opinion, acpiphp isn't in any of those categories, so I'm content
> to have the parameter present but undocumented.  It seems more likely that
> we'll hear about issues then, and we might be able to do something about
> them.

Agreed again.

Rafael

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH V2 0/2] cpufreq/powernv: Set core pstate to a minimum just before hotplugging it out

2014-09-08 Thread Rafael J. Wysocki
On Friday, September 05, 2014 01:11:22 PM Viresh Kumar wrote:
> On 5 September 2014 13:09, Preeti U Murthy  wrote:
> > Today cpus go to winkle when they are offlined. Since it is the deepest
> > idle state that we have, it is expected to save good amount of power as 
> > compared
> > to online state, where cores can enter nap/fastsleep only which are
> > shallower idle states.
> > However we observed no powersavings with winkle as compared to nap/fastsleep
> > and traced the problem to the pstate of the core being kept at a high even
> > when the core is offline. This can keep the socket pstate high, thus burning
> > power unnecessarily. This patchset fixes this issue.
> >
> > Changes in V2: Changed smp_call_function_any() to 
> > smp_call_function_single() in Patch[2/2]
> 
> Acked-by: Viresh Kumar 

I've queued up the two patches for 3.18, thanks!

-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH v3] cpufreq: powernv: Set the cpus to nominal frequency during reboot/kexec

2014-09-25 Thread Rafael J. Wysocki
On Thursday, September 11, 2014 05:03:22 PM Viresh Kumar wrote:
> On 11 September 2014 15:43, Shilpasri G Bhat
>  wrote:
> > This patch ensures the cpus to kexec/reboot at nominal frequency.
> > Nominal frequency is the highest cpu frequency on PowerPC at
> > which the cores can run without getting throttled.
> >
> > If the host kernel had set the cpus to a low pstate and then it
> > kexecs/reboots to a cpufreq disabled kernel it would cause the target
> > kernel to perform poorly. It will also increase the boot up time of
> > the target kernel. So set the cpus to high pstate, in this case to
> > nominal frequency before rebooting to avoid such scenarios.
> >
> > The reboot notifier will set the cpus to nominal frequncy.
> >
> > Signed-off-by: Shilpasri G Bhat 
> > Suggested-by: Viresh Kumar 
> > Reviewed-by: Preeti U Murthy 
> > ---
> > Changes v2->v3:
> > We return EBUSY when cpufreq governor tries to change the frequency
> > after rebooting is set to true. This results in console being flushed
> > with error messages indicating failed attempts to change the
> > frequency. So instead of returning EBUSY we return 0 to stop the
> > governor from changing the frequency without alerting a failure to
> > do the same on reboot, as this  is not an errorneaos condition.
> 
> Acked-by: Viresh Kumar 

Queued up for 3.18, thanks!

-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH 0/9] powerpc/powernv: Support for fastsleep and winkle

2014-09-29 Thread Rafael J. Wysocki
On Monday, September 29, 2014 03:53:06 PM Shreyas B Prabhu wrote:
> Hi,
> Any updates on this patch series?

I have a couple of patches from there in my tree it seems.  Please have a look
at linux-pm.git/linux-next and please let me know if that's the case.


> On Thursday 18 September 2014 08:41 AM, Shreyas B Prabhu wrote:
> > Hi,
> > 
> > In this patch series we use winkle for offlined cores. I successfully
> > tested the working of this with subcore functionality.
> > 
> > Test scenario was as follows:
> > 1. Set SMT mode to 1, Set subores-per-core to 1
> > 2. Offline a core, in this case cpu 32 (sending it to winkle)
> > 3. Set subcores-per-core to 4
> > 4. Online the core
> > 5. Start a guest (Topology 1 core 2 threads) on a subcore, in this case
> > on cpu 36
> > 
> > This works without any glitch.
> > 
> > Thanks,
> > Shreyas
> > 
> > On Monday 25 August 2014 11:31 PM, Shreyas B. Prabhu wrote:
> >> Fast sleep is an idle state, where the core and the L1 and L2
> >> caches are brought down to a threshold voltage. This also means that
> >> the communication between L2 and L3 caches have to be fenced. However
> >> the current P8 chips have a bug wherein this fencing between L2 and
> >> L3 caches get delayed by a cpu cycle. This can delay L3 response to
> >> the other cpus if they request for data during this time. Thus they
> >> would fetch the same data from the memory which could lead to data
> >> corruption if L3 cache is not flushed.
> >> Patch 4 adds support to work around this.
> >>
> >> 'Deep Winkle' is a deeper idle state where core and private L2 are powered
> >> off. While it offers higher power savings, it is at the cost of losing
> >> hypervisor register state and higher latency.
> >> Patch 5-9 adds support for winkle and uses it for offline cpus.
> >>
> >> Patch 1 - Moves parameters required discover idle states to a location 
> >> common to both cpuidle driver and powernv core code
> >> Patch 2 - Populates idle state details from device tree
> >> Patch 3 - Enables cpus to run guest after waking up from fastsleep/winkle
> >>
> >>
> >> Cc: Benjamin Herrenschmidt 
> >> Cc: Paul Mackerras 
> >> Cc: Michael Ellerman 
> >> Cc: Rafael J. Wysocki 
> >> Cc: Srivatsa S. Bhat 
> >> Cc: Preeti U. Murthy 
> >> Cc: Vaidyanathan Srinivasan 
> >> Cc: Rob Herring 
> >> Cc: Grant Likely 
> >> Cc: devicet...@vger.kernel.org
> >> Cc: linux...@vger.kernel.org
> >> Cc: linuxppc-dev@lists.ozlabs.org
> >>
> >> Preeti U Murthy (2):
> >>   cpuidle/powernv: Populate cpuidle state details by querying the
> >> device-tree
> >>   powerpc/powernv/cpuidle: Add workaround to enable fastsleep
> >>
> >> Shreyas B. Prabhu (6):
> >>   powerpc/kvm/book3s_hv: Enable CPUs to run guest after waking up from
> >> fast-sleep
> >>   powerpc/powernv: Add OPAL call to save and restore
> >>   powerpc: Adding macro for accessing Thread Switch Control Register
> >>   powerpc/powernv: Add winkle infrastructure
> >>   powerpc/powernv: Discover and enable winkle
> >>   powerpc/powernv: Enter deepest supported idle state in offline
> >>
> >> Srivatsa S. Bhat (1):
> >>   powerpc/powernv: Enable Offline CPUs to enter deep idle states
> >>
> >>  arch/powerpc/include/asm/machdep.h |   4 +
> >>  arch/powerpc/include/asm/opal.h|  10 ++
> >>  arch/powerpc/include/asm/paca.h|   3 +
> >>  arch/powerpc/include/asm/ppc-opcode.h  |   2 +
> >>  arch/powerpc/include/asm/processor.h   |   6 +-
> >>  arch/powerpc/include/asm/reg.h |   1 +
> >>  arch/powerpc/kernel/asm-offsets.c  |   1 +
> >>  arch/powerpc/kernel/exceptions-64s.S   |  37 ++---
> >>  arch/powerpc/kernel/idle.c |  30 
> >>  arch/powerpc/kernel/idle_power7.S  |  83 +-
> >>  arch/powerpc/platforms/powernv/opal-wrappers.S |   2 +
> >>  arch/powerpc/platforms/powernv/powernv.h   |   8 +
> >>  arch/powerpc/platforms/powernv/setup.c | 217 
> >> +
> >>  arch/powerpc/platforms/powernv/smp.c   |  13 +-
> >>  arch/powerpc/platforms/powernv/subcore.c   |  15 ++
> >>  drivers/cpuidle/cpuidle-powernv.c  |  40 -
> >>  16 files changed, 439 insertions(+), 33 deletions(-)
> >>
> 

-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH 0/9] powerpc/powernv: Support for fastsleep and winkle

2014-09-30 Thread Rafael J. Wysocki
On Tuesday, September 30, 2014 01:42:05 PM Shreyas B Prabhu wrote:
> Hi Rafael,
> 
> On Tuesday 30 September 2014 04:58 AM, Rafael J. Wysocki wrote:
> > On Monday, September 29, 2014 03:53:06 PM Shreyas B Prabhu wrote:
> >> Hi,
> >> Any updates on this patch series?
> > 
> > I have a couple of patches from there in my tree it seems.  Please have a 
> > look
> > at linux-pm.git/linux-next and please let me know if that's the case.
> > 
> 
> I checked linux-pm.git/linux-net (Last commit 067c17382165). None of the 
> patches
> in this series are present in the tree. 

OK, thanks for checking.

Would you mind resending it with any ACKs or Reviewed-by tags received so far?


> >> On Thursday 18 September 2014 08:41 AM, Shreyas B Prabhu wrote:
> >>> Hi,
> >>>
> >>> In this patch series we use winkle for offlined cores. I successfully
> >>> tested the working of this with subcore functionality.
> >>>
> >>> Test scenario was as follows:
> >>> 1. Set SMT mode to 1, Set subores-per-core to 1
> >>> 2. Offline a core, in this case cpu 32 (sending it to winkle)
> >>> 3. Set subcores-per-core to 4
> >>> 4. Online the core
> >>> 5. Start a guest (Topology 1 core 2 threads) on a subcore, in this case
> >>> on cpu 36
> >>>
> >>> This works without any glitch.
> >>>
> >>> Thanks,
> >>> Shreyas
> >>>
> >>> On Monday 25 August 2014 11:31 PM, Shreyas B. Prabhu wrote:
> >>>> Fast sleep is an idle state, where the core and the L1 and L2
> >>>> caches are brought down to a threshold voltage. This also means that
> >>>> the communication between L2 and L3 caches have to be fenced. However
> >>>> the current P8 chips have a bug wherein this fencing between L2 and
> >>>> L3 caches get delayed by a cpu cycle. This can delay L3 response to
> >>>> the other cpus if they request for data during this time. Thus they
> >>>> would fetch the same data from the memory which could lead to data
> >>>> corruption if L3 cache is not flushed.
> >>>> Patch 4 adds support to work around this.
> >>>>
> >>>> 'Deep Winkle' is a deeper idle state where core and private L2 are 
> >>>> powered
> >>>> off. While it offers higher power savings, it is at the cost of losing
> >>>> hypervisor register state and higher latency.
> >>>> Patch 5-9 adds support for winkle and uses it for offline cpus.
> >>>>
> >>>> Patch 1 - Moves parameters required discover idle states to a location 
> >>>> common to both cpuidle driver and powernv core code
> >>>> Patch 2 - Populates idle state details from device tree
> >>>> Patch 3 - Enables cpus to run guest after waking up from fastsleep/winkle
> >>>>
> >>>>
> >>>> Cc: Benjamin Herrenschmidt 
> >>>> Cc: Paul Mackerras 
> >>>> Cc: Michael Ellerman 
> >>>> Cc: Rafael J. Wysocki 
> >>>> Cc: Srivatsa S. Bhat 
> >>>> Cc: Preeti U. Murthy 
> >>>> Cc: Vaidyanathan Srinivasan 
> >>>> Cc: Rob Herring 
> >>>> Cc: Grant Likely 
> >>>> Cc: devicet...@vger.kernel.org
> >>>> Cc: linux...@vger.kernel.org
> >>>> Cc: linuxppc-dev@lists.ozlabs.org
> >>>>
> >>>> Preeti U Murthy (2):
> >>>>   cpuidle/powernv: Populate cpuidle state details by querying the
> >>>> device-tree
> >>>>   powerpc/powernv/cpuidle: Add workaround to enable fastsleep
> >>>>
> >>>> Shreyas B. Prabhu (6):
> >>>>   powerpc/kvm/book3s_hv: Enable CPUs to run guest after waking up from
> >>>> fast-sleep
> >>>>   powerpc/powernv: Add OPAL call to save and restore
> >>>>   powerpc: Adding macro for accessing Thread Switch Control Register
> >>>>   powerpc/powernv: Add winkle infrastructure
> >>>>   powerpc/powernv: Discover and enable winkle
> >>>>   powerpc/powernv: Enter deepest supported idle state in offline
> >>>>
> >>>> Srivatsa S. Bhat (1):
> >>>>   powerpc/powernv: Enable Offline CPUs to enter deep idle states
> >>>>
> >>>>  arch/powerpc/include/asm/machdep.h |   4 +
> >>>>  arch/powerpc/include/asm/opal.h|  10

Re: [PATCH v2 0/3] powernv/cpuidle: Fastsleep workaround and fixes

2014-10-01 Thread Rafael J. Wysocki
On Wednesday, October 01, 2014 01:15:57 PM Shreyas B. Prabhu wrote:
> Fast sleep is an idle state, where the core and the L1 and L2
> caches are brought down to a threshold voltage. This also means that
> the communication between L2 and L3 caches have to be fenced. However
> the current P8 chips have a bug wherein this fencing between L2 and
> L3 caches get delayed by a cpu cycle. This can delay L3 response to
> the other cpus if they request for data during this time. Thus they
> would fetch the same data from the memory which could lead to data
> corruption if L3 cache is not flushed. 
> 
> This series overcomes above problem in kernel.
> 
> Cc: Benjamin Herrenschmidt 
> Cc: Paul Mackerras 
> Cc: Michael Ellerman 
> Cc: Rafael J. Wysocki 
> Cc: linux...@vger.kernel.org
> Cc: linuxppc-dev@lists.ozlabs.org
> Cc: Srivatsa S. Bhat 
> Cc: Preeti U. Murthy 
> Cc: Vaidyanathan Srinivasan 
> 
> v2:
> Rebased on 3.17-rc7
> Split from 'powerpc/powernv: Support for fastsleep and winkle'
> 
> v1:
> https://lkml.org/lkml/2014/8/25/446
> 
> Preeti U Murthy (1):
>   powerpc/powernv/cpuidle: Add workaround to enable fastsleep
> 
> Shreyas B. Prabhu (1):
>   powerpc/kvm/book3s_hv: Enable CPUs to run guest after waking up from
> fast-sleep
> 
> Srivatsa S. Bhat (1):
>   powerpc/powernv: Enable Offline CPUs to enter deep idle states
> 
>  arch/powerpc/include/asm/machdep.h |   3 +
>  arch/powerpc/include/asm/opal.h|   7 ++
>  arch/powerpc/include/asm/processor.h   |   4 +-
>  arch/powerpc/kernel/exceptions-64s.S   |  35 
>  arch/powerpc/kernel/idle.c |  19 
>  arch/powerpc/kernel/idle_power7.S  |   2 +-
>  arch/powerpc/platforms/powernv/opal-wrappers.S |   1 +
>  arch/powerpc/platforms/powernv/powernv.h   |   7 ++
>  arch/powerpc/platforms/powernv/setup.c | 118 
> +
>  arch/powerpc/platforms/powernv/smp.c   |  11 ++-
>  drivers/cpuidle/cpuidle-powernv.c  |  13 ++-
>  11 files changed, 194 insertions(+), 26 deletions(-)

[2/3] seems to be missig from the series.

Also, since that mostly modifies arch/powerpc, I think it should go through
that tree.  I'm fine with the cpuidle-powernv changes in [1/3] and [3/3].

-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH 03/44] hibernate: Call have_kernel_poweroff instead of checking pm_power_off

2014-10-07 Thread Rafael J. Wysocki
On Monday, October 06, 2014 10:28:05 PM Guenter Roeck wrote:
> Poweroff handlers may now be installed with register_poweroff_handler.
> Use the new API function have_kernel_poweroff to determine if a poweroff
> handler has been installed.
> 
> Cc: Rafael J. Wysocki 
> Cc: Pavel Machek 
> Cc: Len Brown 
> Signed-off-by: Guenter Roeck 

ACK

> ---
>  kernel/power/hibernate.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/kernel/power/hibernate.c b/kernel/power/hibernate.c
> index a9dfa79..20353c5 100644
> --- a/kernel/power/hibernate.c
> +++ b/kernel/power/hibernate.c
> @@ -602,7 +602,7 @@ static void power_down(void)
>   case HIBERNATION_PLATFORM:
>   hibernation_platform_enter();
>   case HIBERNATION_SHUTDOWN:
> - if (pm_power_off)
> + if (have_kernel_poweroff())
>   kernel_power_off();
>   break;
>  #ifdef CONFIG_SUSPEND
> 

-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH 44/44] kernel: Remove pm_power_off

2014-10-07 Thread Rafael J. Wysocki
On Monday, October 06, 2014 10:28:46 PM Guenter Roeck wrote:
> No users of pm_power_off are left, so it is safe to remove the function.
> 
> Cc: Rafael J. Wysocki 
> Cc: Pavel Machek 
> Cc: Len Brown 
> Signed-off-by: Guenter Roeck 

ACK

> ---
>  include/linux/pm.h  |  1 -
>  kernel/power/poweroff_handler.c | 10 +-
>  2 files changed, 1 insertion(+), 10 deletions(-)
> 
> diff --git a/include/linux/pm.h b/include/linux/pm.h
> index 45271b5..fce7645 100644
> --- a/include/linux/pm.h
> +++ b/include/linux/pm.h
> @@ -31,7 +31,6 @@
>  /*
>   * Callbacks for platform drivers to implement.
>   */
> -extern void (*pm_power_off)(void);
>  extern void (*pm_power_off_prepare)(void);
>  
>  /*
> diff --git a/kernel/power/poweroff_handler.c b/kernel/power/poweroff_handler.c
> index 96f59ef..01a3a39 100644
> --- a/kernel/power/poweroff_handler.c
> +++ b/kernel/power/poweroff_handler.c
> @@ -20,12 +20,6 @@
>  #include 
>  
>  /*
> - * If set, calling this function will power off the system immediately.
> - */
> -void (*pm_power_off)(void);
> -EXPORT_SYMBOL(pm_power_off);
> -
> -/*
>   *   Notifier list for kernel code which wants to be called
>   *   to power off the system.
>   */
> @@ -163,8 +157,6 @@ int register_poweroff_handler_simple(void 
> (*handler)(void), int priority)
>   */
>  void do_kernel_poweroff(void)
>  {
> - if (pm_power_off)
> - pm_power_off();
>   atomic_notifier_call_chain(&poweroff_handler_list, 0, NULL);
>  }
>  
> @@ -175,6 +167,6 @@ void do_kernel_poweroff(void)
>   */
>  bool have_kernel_poweroff(void)
>  {
> - return pm_power_off != NULL || poweroff_handler_list.head != NULL;
> + return poweroff_handler_list.head != NULL;
>  }
>  EXPORT_SYMBOL(have_kernel_poweroff);
> 

-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH 08/44] kernel: Move pm_power_off to common code

2014-10-07 Thread Rafael J. Wysocki
On Monday, October 06, 2014 10:28:10 PM Guenter Roeck wrote:
> pm_power_off is defined for all architectures. Move it to common code.
> 
> Have all architectures call do_kernel_poweroff instead of pm_power_off.
> Some architectures point pm_power_off to machine_power_off. For those,
> call do_kernel_poweroff from machine_power_off instead.

ACK

-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH] cpuidle/powernv: Populate cpuidle state details by querying the device-tree

2014-10-21 Thread Rafael J. Wysocki
On Tuesday, October 14, 2014 01:23:00 PM Preeti U Murthy wrote:
> We hard code the metrics relevant for cpuidle states in the kernel today.
> Instead pick them up from the device tree so that they remain relevant
> and updated for the system that the kernel is running on.
> 
> Cc: linux...@vger.kernel.org
> Cc: Rafael J. Wysocki 
> Cc: devicet...@vger.kernel.org
> Cc: linuxppc-dev@lists.ozlabs.org
> Cc: Michael Ellerman 
> Cc: Benjamin Herrenschmidt 
> Signed-off-by: Preeti U. Murthy 
> Signed-off-by: Shreyas B. Prabhu 

Applied, thanks!

> ---
> 
>  drivers/cpuidle/cpuidle-powernv.c |   27 ++-
>  1 file changed, 22 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/cpuidle/cpuidle-powernv.c 
> b/drivers/cpuidle/cpuidle-powernv.c
> index fa79392..b57681d 100644
> --- a/drivers/cpuidle/cpuidle-powernv.c
> +++ b/drivers/cpuidle/cpuidle-powernv.c
> @@ -165,7 +165,8 @@ static int powernv_add_idle_states(void)
>   int nr_idle_states = 1; /* Snooze */
>   int dt_idle_states;
>   const __be32 *idle_state_flags;
> - u32 len_flags, flags;
> + const __be32 *idle_state_latency;
> + u32 len_flags, flags, latency_ns;
>   int i;
>  
>   /* Currently we have snooze statically defined */
> @@ -182,18 +183,32 @@ static int powernv_add_idle_states(void)
>   return nr_idle_states;
>   }
>  
> + idle_state_latency = of_get_property(power_mgt,
> + "ibm,cpu-idle-state-latencies-ns", NULL);
> + if (!idle_state_latency) {
> + pr_warn("DT-PowerMgmt: missing 
> ibm,cpu-idle-state-latencies-ns\n");
> + return nr_idle_states;
> + }
> +
>   dt_idle_states = len_flags / sizeof(u32);
>  
>   for (i = 0; i < dt_idle_states; i++) {
>  
>   flags = be32_to_cpu(idle_state_flags[i]);
> +
> + /* Cpuidle accepts exit_latency in us and we estimate
> +  * target residency to be 10x exit_latency
> +  */
> + latency_ns = be32_to_cpu(idle_state_latency[i]);
>   if (flags & IDLE_USE_INST_NAP) {
>   /* Add NAP state */
>   strcpy(powernv_states[nr_idle_states].name, "Nap");
>   strcpy(powernv_states[nr_idle_states].desc, "Nap");
>   powernv_states[nr_idle_states].flags = 
> CPUIDLE_FLAG_TIME_VALID;
> - powernv_states[nr_idle_states].exit_latency = 10;
> - powernv_states[nr_idle_states].target_residency = 100;
> + powernv_states[nr_idle_states].exit_latency =
> + ((unsigned int)latency_ns) / 1000;
> + powernv_states[nr_idle_states].target_residency =
> + ((unsigned int)latency_ns / 100);
>   powernv_states[nr_idle_states].enter = &nap_loop;
>   nr_idle_states++;
>   }
> @@ -204,8 +219,10 @@ static int powernv_add_idle_states(void)
>   strcpy(powernv_states[nr_idle_states].desc, 
> "FastSleep");
>   powernv_states[nr_idle_states].flags =
>   CPUIDLE_FLAG_TIME_VALID | 
> CPUIDLE_FLAG_TIMER_STOP;
> - powernv_states[nr_idle_states].exit_latency = 300;
> - powernv_states[nr_idle_states].target_residency = 
> 100;
> + powernv_states[nr_idle_states].exit_latency =
> + ((unsigned int)latency_ns) / 1000;
> + powernv_states[nr_idle_states].target_residency =
> + ((unsigned int)latency_ns / 100);
>   powernv_states[nr_idle_states].enter = &fastsleep_loop;
>   nr_idle_states++;
>   }
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-pm" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH v5] cpufreq: powerpc: Add cpufreq driver for Freescale e500mc SoCs

2013-06-05 Thread Rafael J. Wysocki
On Wednesday, June 05, 2013 04:06:29 PM Viresh Kumar wrote:
> On 5 June 2013 15:00,   wrote:
> > From: Tang Yuantian 
> >
> > Add cpufreq driver for Freescale e500mc, e5500 and e6500 SoCs
> > which are capable of changing the CPU frequency dynamically
> >
> > Signed-off-by: Tang Yuantian 
> > Signed-off-by: Li Yang 
> > ---
> > v5:
> > - enhance the CPU hotplug case
> > - mask the disallowed CPU frequencies
> > - remove freqs.cpu = policy->cpu;
> > - refine the code style
> > v4:
> > - rebase on bleeding-edge branch of Rafael's linux-pm.git
> > - #define pr_fmt() for better debug prints
> > - use newest cpufreq_notify_transition()
> > - support CPU hotplug
> > - remove table[i].index as it is not used
> > - remove cpus_per_cluster
> > v3:
> > - change sizeof(struct name).. to sizeof(*p)
> > - remove the struct cpufreq_data, use global variable instead
> > - resolve setting policy->cpus incorrectly
> > - add CPUFREQ_POSTCHANGE notifier when setting frequency error
> > v2:
> > - add depends on OF and COMMON_CLK in Kconfig
> > - use clk.h instead of clk-provider.h
> > - change per_cpu variable from struct to pointer
> >
> >  drivers/cpufreq/Kconfig.powerpc   |  10 +
> >  drivers/cpufreq/Makefile  |   1 +
> >  drivers/cpufreq/ppc-corenet-cpufreq.c | 380 
> > ++
> >  3 files changed, 391 insertions(+)
> >  create mode 100644 drivers/cpufreq/ppc-corenet-cpufreq.c
> 
> I haven't gone for a very deep review this time and it looked okay.
> 
> Acked-by: Viresh Kumar 

OK, queued up for 3.11.

Thanks,
Rafael


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


Re: [PATCH 17/18] cpufreq: powerpc: move cpufreq driver to drivers/cpufreq

2013-06-07 Thread Rafael J. Wysocki
On Friday, June 07, 2013 10:48:21 AM Viresh Kumar wrote:
> On 31 May 2013 16:20, Viresh Kumar  wrote:
> > On 20 May 2013 10:10, Viresh Kumar  wrote:
> >> On 13 May 2013 11:34, Viresh Kumar  wrote:
> >>> On 22 April 2013 12:19, Viresh Kumar  wrote:
> >>>> On 9 April 2013 14:05, Viresh Kumar  wrote:
> >>>>> On 5 April 2013 12:16, Viresh Kumar  wrote:
> >>>>>> On 4 April 2013 18:24, Viresh Kumar  wrote:
> >>>>>>> This patch moves cpufreq driver of powerpc platform to 
> >>>>>>> drivers/cpufreq.
> >>>>>>>
> >>>>>>> Cc: Benjamin Herrenschmidt 
> >>>>>>> Cc: Paul Mackerras 
> >>>>>>> Cc: Olof Johansson 
> >>>>>>> Cc: linuxppc-dev@lists.ozlabs.org
> >>>>>>> Signed-off-by: Viresh Kumar 
> >>>>>>> ---
> >>>>>>> Compile Tested only.
> >>>>>>>
> >>>>>>>  arch/powerpc/platforms/Kconfig | 31 
> >>>>>>> --
> >>>>>>>  arch/powerpc/platforms/pasemi/Makefile |  1 -
> >>>>>>>  arch/powerpc/platforms/powermac/Makefile   |  2 --
> >>>>>>>  drivers/cpufreq/Kconfig.powerpc| 26 
> >>>>>>> ++
> >>>>>>>  drivers/cpufreq/Makefile   |  3 +++
> >>>>>>>  .../cpufreq.c => drivers/cpufreq/pasemi-cpufreq.c  |  0
> >>>>>>>  .../cpufreq/pmac32-cpufreq.c   |  0
> >>>>>>>  .../cpufreq/pmac64-cpufreq.c   |  0
> >>>>>>>  8 files changed, 29 insertions(+), 34 deletions(-)
> >>>>>>>  rename arch/powerpc/platforms/pasemi/cpufreq.c => 
> >>>>>>> drivers/cpufreq/pasemi-cpufreq.c (100%)
> >>>>>>>  rename arch/powerpc/platforms/powermac/cpufreq_32.c => 
> >>>>>>> drivers/cpufreq/pmac32-cpufreq.c (100%)
> >>>>>>>  rename arch/powerpc/platforms/powermac/cpufreq_64.c => 
> >>>>>>> drivers/cpufreq/pmac64-cpufreq.c (100%)
> >>>>>>
> >>>>>> Hi Deepthi,
> >>>>>>
> >>>>>> Can you help testing this please?
> >>>>>
> >>>>> Ping!!
> >>>>
> >>>> Ping!!
> >>>
> >>> Hi Benjamin,
> >>>
> >>> Hope you are back from your vacations. Can you give it a try now?
> >>
> >> Ping!!
> >
> > Ping!!
> 
> Hi Rafael,
> 
> Its been more than 2 months now that this patch was first posted.
> And the response from Maintainers isn't so great, irrespective of
> how many times I pinged them.
> 
> This is what I think:
> - It looked functionally correct to Benjamin but he wanted somebody
> to actually test it.
> - Arnd gave his Ack (So it looked functionally correct to him too)
> - We can probably push this into linux-next now and see if somebody
> complains of any breakage it has done. If not we can get it pushed for
> 3.11.

I agree, applied to bleeding-edge.

Thanks,
Rafael


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


Re: [PATCH 1/3] cpufreq: pmac64: speed up frequency switch

2013-07-23 Thread Rafael J. Wysocki
On Tuesday, July 23, 2013 11:24:37 PM Aaro Koskinen wrote:
> Some functions on switch path use msleep() which is inaccurate, and
> depends on HZ. With HZ=100 msleep(1) takes actually over ten times longer.
> Using usleep_range() we get more accurate sleeps.
> 
> I measured the "pfunc_slewing_done" polling to take 300us at max (on
> 2.3GHz dual-processor Xserve G5), so using 500us sleep there should
> be fine.
> 
> With the patch, g5_switch_freq() duration drops from ~50ms to ~10ms on
> Xserve with HZ=100.
> 
> Signed-off-by: Aaro Koskinen 

All looks good in the patchset from 1 feet (or more), but I need Ben to
speak here.

Thanks,
Rafael


> ---
>  drivers/cpufreq/pmac64-cpufreq.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/cpufreq/pmac64-cpufreq.c 
> b/drivers/cpufreq/pmac64-cpufreq.c
> index 7ba4234..674807d 100644
> --- a/drivers/cpufreq/pmac64-cpufreq.c
> +++ b/drivers/cpufreq/pmac64-cpufreq.c
> @@ -141,7 +141,7 @@ static void g5_vdnap_switch_volt(int speed_mode)
>   pmf_call_one(pfunc_vdnap0_complete, &args);
>   if (done)
>   break;
> - msleep(1);
> + usleep_range(1000, 1000);
>   }
>   if (done == 0)
>   printk(KERN_WARNING "cpufreq: Timeout in clock slewing !\n");
> @@ -240,7 +240,7 @@ static void g5_pfunc_switch_volt(int speed_mode)
>   if (pfunc_cpu1_volt_low)
>   pmf_call_one(pfunc_cpu1_volt_low, NULL);
>   }
> - msleep(10); /* should be faster , to fix */
> + usleep_range(1, 1); /* should be faster , to fix */
>  }
>  
>  /*
> @@ -285,7 +285,7 @@ static int g5_pfunc_switch_freq(int speed_mode)
>   pmf_call_one(pfunc_slewing_done, &args);
>   if (done)
>   break;
> - msleep(1);
> + usleep_range(500, 500);
>   }
>   if (done == 0)
>   printk(KERN_WARNING "cpufreq: Timeout in clock slewing !\n");
> 
-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


Re: [PATCH 2/2] cpuidle: export cpuidle_idle_call symbol

2013-07-23 Thread Rafael J. Wysocki
On Tuesday, July 23, 2013 05:28:01 PM Dongsheng Wang wrote:
> From: Wang Dongsheng 
> 
> Export cpuidle_idle_call symbol, make this function can be invoked
> in the module.

Why?

Rafael


> Signed-off-by: Wang Dongsheng 
> ---
> Branch: pm-cpuidle
> 
>  drivers/cpuidle/cpuidle.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/cpuidle/cpuidle.c b/drivers/cpuidle/cpuidle.c
> index 534320a..d0a61d6 100644
> --- a/drivers/cpuidle/cpuidle.c
> +++ b/drivers/cpuidle/cpuidle.c
> @@ -168,6 +168,7 @@ int cpuidle_idle_call(void)
>  
>   return 0;
>  }
> +EXPORT_SYMBOL_GPL(cpuidle_idle_call);
>  
>  /**
>   * cpuidle_install_idle_handler - installs the cpuidle idle loop handler
> 
-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


Re: [PATCH 1/2] cpuidle: fix cpu idle driver as a module can not remove

2013-07-23 Thread Rafael J. Wysocki
On Tuesday, July 23, 2013 05:28:00 PM Dongsheng Wang wrote:
> From: Wang Dongsheng 
> 
> The module can not be removed when execute "rmmod". rmmod not use
> "--force".
> 
> Log:
> root:~# rmmod cpuidle-e500
> incs[9], decs[1]
> rmmod: can't unload 'cpuidle_e500': Resource temporarily unavailable
> 
> Signed-off-by: Wang Dongsheng 

Can you please check the current linux-next branch of the linux-pm.git tree
and see if that doesn't conflict with the material in there?

Also please explain in the changelog how your changes help to fix the problem.

Thanks,
Rafael


> ---
> Branch: pm-cpuidle
> 
>  drivers/cpuidle/cpuidle.c | 7 +++
>  1 file changed, 3 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/cpuidle/cpuidle.c b/drivers/cpuidle/cpuidle.c
> index fdc432f..534320a 100644
> --- a/drivers/cpuidle/cpuidle.c
> +++ b/drivers/cpuidle/cpuidle.c
> @@ -386,6 +386,9 @@ static int __cpuidle_register_device(struct 
> cpuidle_device *dev)
>   goto err_coupled;
>  
>   dev->registered = 1;
> +
> + module_put(drv->owner);
> +
>   return 0;
>  
>  err_coupled:
> @@ -432,8 +435,6 @@ EXPORT_SYMBOL_GPL(cpuidle_register_device);
>   */
>  void cpuidle_unregister_device(struct cpuidle_device *dev)
>  {
> - struct cpuidle_driver *drv = cpuidle_get_cpu_driver(dev);
> -
>   if (dev->registered == 0)
>   return;
>  
> @@ -448,8 +449,6 @@ void cpuidle_unregister_device(struct cpuidle_device *dev)
>   cpuidle_coupled_unregister_device(dev);
>  
>   cpuidle_resume_and_unlock();
> -
> - module_put(drv->owner);
>  }
>  
>  EXPORT_SYMBOL_GPL(cpuidle_unregister_device);
> 
-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


Re: [PATCH 1/2] PCI: hotplug: Convert to be builtin only, not modular

2013-07-25 Thread Rafael J. Wysocki
On Thursday, July 25, 2013 11:57:20 AM Bjorn Helgaas wrote:
> Convert CONFIG_HOTPLUG_PCI from tristate to bool.  This only affects
> the hotplug core; several of the hotplug drivers can still be modules.
> 
> Signed-off-by: Bjorn Helgaas 

Acked-by: Rafael J. Wysocki 

> ---
>  arch/ia64/configs/generic_defconfig|2 +-
>  arch/ia64/configs/gensparse_defconfig  |2 +-
>  arch/ia64/configs/tiger_defconfig  |2 +-
>  arch/ia64/configs/xen_domu_defconfig   |2 +-
>  arch/powerpc/configs/ppc64_defconfig   |2 +-
>  arch/powerpc/configs/ppc64e_defconfig  |2 +-
>  arch/powerpc/configs/pseries_defconfig |2 +-
>  arch/sh/configs/sh03_defconfig |2 +-
>  drivers/pci/hotplug/Kconfig|5 +
>  9 files changed, 9 insertions(+), 12 deletions(-)
> 
> diff --git a/arch/ia64/configs/generic_defconfig 
> b/arch/ia64/configs/generic_defconfig
> index 7913695..efbd292 100644
> --- a/arch/ia64/configs/generic_defconfig
> +++ b/arch/ia64/configs/generic_defconfig
> @@ -31,7 +31,7 @@ CONFIG_ACPI_FAN=m
>  CONFIG_ACPI_DOCK=y
>  CONFIG_ACPI_PROCESSOR=m
>  CONFIG_ACPI_CONTAINER=m
> -CONFIG_HOTPLUG_PCI=m
> +CONFIG_HOTPLUG_PCI=y
>  CONFIG_HOTPLUG_PCI_ACPI=m
>  CONFIG_PACKET=y
>  CONFIG_UNIX=y
> diff --git a/arch/ia64/configs/gensparse_defconfig 
> b/arch/ia64/configs/gensparse_defconfig
> index f8e9133..f64980d 100644
> --- a/arch/ia64/configs/gensparse_defconfig
> +++ b/arch/ia64/configs/gensparse_defconfig
> @@ -25,7 +25,7 @@ CONFIG_ACPI_BUTTON=m
>  CONFIG_ACPI_FAN=m
>  CONFIG_ACPI_PROCESSOR=m
>  CONFIG_ACPI_CONTAINER=m
> -CONFIG_HOTPLUG_PCI=m
> +CONFIG_HOTPLUG_PCI=y
>  CONFIG_HOTPLUG_PCI_ACPI=m
>  CONFIG_PACKET=y
>  CONFIG_UNIX=y
> diff --git a/arch/ia64/configs/tiger_defconfig 
> b/arch/ia64/configs/tiger_defconfig
> index a5a9e02..0f4e9e4 100644
> --- a/arch/ia64/configs/tiger_defconfig
> +++ b/arch/ia64/configs/tiger_defconfig
> @@ -31,7 +31,7 @@ CONFIG_ACPI_BUTTON=m
>  CONFIG_ACPI_FAN=m
>  CONFIG_ACPI_PROCESSOR=m
>  CONFIG_ACPI_CONTAINER=m
> -CONFIG_HOTPLUG_PCI=m
> +CONFIG_HOTPLUG_PCI=y
>  CONFIG_HOTPLUG_PCI_ACPI=m
>  CONFIG_PACKET=y
>  CONFIG_UNIX=y
> diff --git a/arch/ia64/configs/xen_domu_defconfig 
> b/arch/ia64/configs/xen_domu_defconfig
> index 37b9b42..b025acf 100644
> --- a/arch/ia64/configs/xen_domu_defconfig
> +++ b/arch/ia64/configs/xen_domu_defconfig
> @@ -32,7 +32,7 @@ CONFIG_ACPI_BUTTON=m
>  CONFIG_ACPI_FAN=m
>  CONFIG_ACPI_PROCESSOR=m
>  CONFIG_ACPI_CONTAINER=m
> -CONFIG_HOTPLUG_PCI=m
> +CONFIG_HOTPLUG_PCI=y
>  CONFIG_HOTPLUG_PCI_ACPI=m
>  CONFIG_PACKET=y
>  CONFIG_UNIX=y
> diff --git a/arch/powerpc/configs/ppc64_defconfig 
> b/arch/powerpc/configs/ppc64_defconfig
> index c86fcb9..0e8cfd0 100644
> --- a/arch/powerpc/configs/ppc64_defconfig
> +++ b/arch/powerpc/configs/ppc64_defconfig
> @@ -58,7 +58,7 @@ CONFIG_SCHED_SMT=y
>  CONFIG_PPC_DENORMALISATION=y
>  CONFIG_PCCARD=y
>  CONFIG_ELECTRA_CF=y
> -CONFIG_HOTPLUG_PCI=m
> +CONFIG_HOTPLUG_PCI=y
>  CONFIG_HOTPLUG_PCI_RPA=m
>  CONFIG_HOTPLUG_PCI_RPA_DLPAR=m
>  CONFIG_PACKET=y
> diff --git a/arch/powerpc/configs/ppc64e_defconfig 
> b/arch/powerpc/configs/ppc64e_defconfig
> index 4b20f76..0085dc4 100644
> --- a/arch/powerpc/configs/ppc64e_defconfig
> +++ b/arch/powerpc/configs/ppc64e_defconfig
> @@ -32,7 +32,7 @@ CONFIG_IRQ_ALL_CPUS=y
>  CONFIG_SPARSEMEM_MANUAL=y
>  CONFIG_PCI_MSI=y
>  CONFIG_PCCARD=y
> -CONFIG_HOTPLUG_PCI=m
> +CONFIG_HOTPLUG_PCI=y
>  CONFIG_PACKET=y
>  CONFIG_UNIX=y
>  CONFIG_XFRM_USER=m
> diff --git a/arch/powerpc/configs/pseries_defconfig 
> b/arch/powerpc/configs/pseries_defconfig
> index bea8587..1d4b976 100644
> --- a/arch/powerpc/configs/pseries_defconfig
> +++ b/arch/powerpc/configs/pseries_defconfig
> @@ -53,7 +53,7 @@ CONFIG_PPC_64K_PAGES=y
>  CONFIG_PPC_SUBPAGE_PROT=y
>  CONFIG_SCHED_SMT=y
>  CONFIG_PPC_DENORMALISATION=y
> -CONFIG_HOTPLUG_PCI=m
> +CONFIG_HOTPLUG_PCI=y
>  CONFIG_HOTPLUG_PCI_RPA=m
>  CONFIG_HOTPLUG_PCI_RPA_DLPAR=m
>  CONFIG_PACKET=y
> diff --git a/arch/sh/configs/sh03_defconfig b/arch/sh/configs/sh03_defconfig
> index 2051821..0cf4097 100644
> --- a/arch/sh/configs/sh03_defconfig
> +++ b/arch/sh/configs/sh03_defconfig
> @@ -22,7 +22,7 @@ CONFIG_PREEMPT=y
>  CONFIG_CMDLINE_OVERWRITE=y
>  CONFIG_CMDLINE="console=ttySC1,115200 mem=64M root=/dev/nfs"
>  CONFIG_PCI=y
> -CONFIG_HOTPLUG_PCI=m
> +CONFIG_HOTPLUG_PCI=y
>  CONFIG_BINFMT_MISC=y
>  CONFIG_NET=y
>  CONFIG_PACKET=y
> diff --git a/drivers/pci/hotplug/Kconfig b/drivers/pci/hotplug/Kconfig
> index bb7ebb2..d85009d 100644
> --- a/drivers/pci/hotplug/Kconfig
> +++ b/drivers/pci/hotplug/Kconfig
> @@ -3,16 +3,13 @@
>  #

Re: [PATCH 2/3] include: Convert ethernet mac address declarations to use ETH_ALEN

2013-07-29 Thread Rafael J. Wysocki
.h
> @@ -6,6 +6,7 @@
>  #define __LINUX_MV643XX_ETH_H
>  
>  #include 
> +#include 
>  
>  #define MV643XX_ETH_SHARED_NAME  "mv643xx_eth"
>  #define MV643XX_ETH_NAME "mv643xx_eth_port"
> @@ -48,7 +49,7 @@ struct mv643xx_eth_platform_data {
>* Use this MAC address if it is valid, overriding the
>* address that is already in the hardware.
>*/
> - u8  mac_addr[6];
> + u8  mac_addr[ETH_ALEN];
>  
>   /*
>* If speed is 0, autonegotiation is enabled.
> diff --git a/include/linux/sh_eth.h b/include/linux/sh_eth.h
> index fc30571..6205eeb 100644
> --- a/include/linux/sh_eth.h
> +++ b/include/linux/sh_eth.h
> @@ -2,6 +2,7 @@
>  #define __ASM_SH_ETH_H__
>  
>  #include 
> +#include 
>  
>  enum {EDMAC_LITTLE_ENDIAN, EDMAC_BIG_ENDIAN};
>  enum {
> @@ -18,7 +19,7 @@ struct sh_eth_plat_data {
>   phy_interface_t phy_interface;
>   void (*set_mdio_gate)(void *addr);
>  
> - unsigned char mac_addr[6];
> + unsigned char mac_addr[ETH_ALEN];
>   unsigned no_ether_link:1;
>   unsigned ether_link_active_low:1;
>   unsigned needs_init:1;
> diff --git a/include/linux/smsc911x.h b/include/linux/smsc911x.h
> index 4dde70e..eec3efd 100644
> --- a/include/linux/smsc911x.h
> +++ b/include/linux/smsc911x.h
> @@ -22,6 +22,7 @@
>  #define __LINUX_SMSC911X_H__
>  
>  #include 
> +#include 
>  
>  /* platform_device configuration data, should be assigned to
>   * the platform_device's dev.platform_data */
> @@ -31,7 +32,7 @@ struct smsc911x_platform_config {
>   unsigned int flags;
>   unsigned int shift;
>   phy_interface_t phy_interface;
> - unsigned char mac[6];
> + unsigned char mac[ETH_ALEN];
>  };
>  
>  /* Constants for platform_device irq polarity configuration */
> diff --git a/include/linux/uwb/spec.h b/include/linux/uwb/spec.h
> index b52e44f..0df24bf 100644
> --- a/include/linux/uwb/spec.h
> +++ b/include/linux/uwb/spec.h
> @@ -32,6 +32,7 @@
>  
>  #include 
>  #include 
> +#include 
>  
>  #define i1480_FW 0x0303
>  /* #define i1480_FW 0x0302 */
> @@ -130,7 +131,7 @@ enum { UWB_DRP_BACKOFF_WIN_MAX = 16 };
>   * it is also used to define headers sent down and up the wire/radio).
>   */
>  struct uwb_mac_addr {
> - u8 data[6];
> + u8 data[ETH_ALEN];
>  } __attribute__((packed));
>  
>  
> @@ -568,7 +569,7 @@ struct uwb_rc_evt_confirm {
>  /* Device Address Management event. [WHCI] section 3.1.3.2. */
>  struct uwb_rc_evt_dev_addr_mgmt {
>   struct uwb_rceb rceb;
> - u8 baAddr[6];
> + u8 baAddr[ETH_ALEN];
>   u8 bResultCode;
>  } __attribute__((packed));
>  
> diff --git a/include/media/tveeprom.h b/include/media/tveeprom.h
> index 4a1191a..f7119ee 100644
> --- a/include/media/tveeprom.h
> +++ b/include/media/tveeprom.h
> @@ -12,6 +12,8 @@ enum tveeprom_audio_processor {
>   TVEEPROM_AUDPROC_OTHER,
>  };
>  
> +#include 
> +
>  struct tveeprom {
>   u32 has_radio;
>   /* If has_ir == 0, then it is unknown what the IR capabilities are,
> @@ -40,7 +42,7 @@ struct tveeprom {
>   u32 revision;
>   u32 serial_number;
>   char rev_str[5];
> - u8 MAC_address[6];
> + u8 MAC_address[ETH_ALEN];
>  };
>  
>  void tveeprom_hauppauge_analog(struct i2c_client *c, struct tveeprom *tvee,
> diff --git a/include/net/irda/irlan_common.h b/include/net/irda/irlan_common.h
> index 0af8b8d..550c2d6 100644
> --- a/include/net/irda/irlan_common.h
> +++ b/include/net/irda/irlan_common.h
> @@ -32,6 +32,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  
>  #include 
>  
> @@ -161,7 +162,7 @@ struct irlan_provider_cb {
>   int access_type; /* Access type */
>   __u16 send_arb_val;
>  
> - __u8 mac_address[6]; /* Generated MAC address for peer device */
> + __u8 mac_address[ETH_ALEN]; /* Generated MAC address for peer device */
>  };
>  
>  /*
> 
-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


Re: [PATCH 2/3] include: Convert ethernet mac address declarations to use ETH_ALEN

2013-07-29 Thread Rafael J. Wysocki
On Monday, July 29, 2013 12:34:24 PM Joe Perches wrote:
> On Mon, 2013-07-29 at 13:59 +0200, Rafael J. Wysocki wrote:
> > On Sunday, July 28, 2013 10:29:04 PM Joe Perches wrote:
> > > It's convenient to have ethernet mac addresses use
> > > ETH_ALEN to be able to grep for them a bit easier and
> > > also to ensure that the addresses are __aligned(2).
> []
> > > diff --git a/include/acpi/actbl2.h b/include/acpi/actbl2.h
> []
> > > @@ -44,6 +44,8 @@
> []
> > > +#include 
> > > +
> []
> > > @@ -605,7 +607,7 @@ struct acpi_ibft_nic {
> []
> > > - u8 mac_address[6];
> > > + u8 mac_address[ETH_ALEN];
> 
> > Please don't touch this file.
> > 
> > It comes from a code base outside of the kernel and should be kept in sync 
> > with
> > the upstream.
> 
> Which files in include/acpi have this characteristic?

Generally, all whose names start with "ac" except for acpi_bus.h,
acpi_drivers.h and acpi_numa.h.

> Perhaps an include/acpi/README is appropriate.

Yes, we can add one.

Thanks,
Rafael


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


Re: [PATCH] cpuidle: fix unremovable issue for module driver

2013-07-30 Thread Rafael J. Wysocki
On Tuesday, July 30, 2013 01:19:46 PM Daniel Lezcano wrote:
> On 07/30/2013 12:48 PM, Wang Dongsheng-B40534 wrote:
> > 
> > 
> >> -Original Message-
> >> From: Daniel Lezcano [mailto:daniel.lezc...@linaro.org]
> >> Sent: Tuesday, July 30, 2013 5:28 PM
> >> To: Wang Dongsheng-B40534
> >> Cc: r...@sisk.pl; linux...@vger.kernel.org; linuxppc-dev@lists.ozlabs.org
> >> Subject: Re: [PATCH] cpuidle: fix unremovable issue for module driver
> >>
> >> On 07/30/2013 08:55 AM, Dongsheng Wang wrote:
> >>> From: Wang Dongsheng 
> >>>
> >>> After __cpuidle_register_device, the cpu incs are added up, but decs
> >>> are not, thus the module refcount is not match. So the module "exit"
> >>> function can not be executed when we do remove operation. Move
> >>> module_put into __cpuidle_register_device to fix it.
> >>
> >> Sorry, I still don't get it :/
> >>
> >> register->module_get
> >> unregister->module_put
> >>
> >> you change it by:
> >>
> >> register->module_get
> >> register->module_put
> >> unregister->none
> >>
> >> which is wrong.
> >>
> > module_get->set per cpu incs=1
> > module_put->set per cpu decs=1
> > 
> > module_refcount-> incs - decs;
> > 
> > "unregister" usually call in module->exit function. 
> > But if module_refcount is not zero, the module->exit() cannot be executed.
> 
> Ok, IIUC, the refcount is decremented in the unregister function but
> this one is never called because the refcount is not zero.
> 
> Funny, that means the module format was *never* used at all as it does
> not work.
> 
> I am wondering if we shouldn't just remove the module support for
> cpuidle. Rafael ?

That would be the simplest thing to do and possibly the most correct one too,
but I need to double check how inte_idle/ACPI idle interactions depend on that.

Thanks,
Rafael


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


Re: [PATCH] cpuidle: fix unremovable issue for module driver

2013-07-31 Thread Rafael J. Wysocki
On Tuesday, July 30, 2013 03:33:44 PM Rafael J. Wysocki wrote:
> On Tuesday, July 30, 2013 01:19:46 PM Daniel Lezcano wrote:
> > On 07/30/2013 12:48 PM, Wang Dongsheng-B40534 wrote:
> > > 
> > > 
> > >> -Original Message-
> > >> From: Daniel Lezcano [mailto:daniel.lezc...@linaro.org]
> > >> Sent: Tuesday, July 30, 2013 5:28 PM
> > >> To: Wang Dongsheng-B40534
> > >> Cc: r...@sisk.pl; linux...@vger.kernel.org; linuxppc-dev@lists.ozlabs.org
> > >> Subject: Re: [PATCH] cpuidle: fix unremovable issue for module driver
> > >>
> > >> On 07/30/2013 08:55 AM, Dongsheng Wang wrote:
> > >>> From: Wang Dongsheng 
> > >>>
> > >>> After __cpuidle_register_device, the cpu incs are added up, but decs
> > >>> are not, thus the module refcount is not match. So the module "exit"
> > >>> function can not be executed when we do remove operation. Move
> > >>> module_put into __cpuidle_register_device to fix it.
> > >>
> > >> Sorry, I still don't get it :/
> > >>
> > >> register->module_get
> > >> unregister->module_put
> > >>
> > >> you change it by:
> > >>
> > >> register->module_get
> > >> register->module_put
> > >> unregister->none
> > >>
> > >> which is wrong.
> > >>
> > > module_get->set per cpu incs=1
> > > module_put->set per cpu decs=1
> > > 
> > > module_refcount-> incs - decs;
> > > 
> > > "unregister" usually call in module->exit function. 
> > > But if module_refcount is not zero, the module->exit() cannot be executed.
> > 
> > Ok, IIUC, the refcount is decremented in the unregister function but
> > this one is never called because the refcount is not zero.
> > 
> > Funny, that means the module format was *never* used at all as it does
> > not work.
> > 
> > I am wondering if we shouldn't just remove the module support for
> > cpuidle. Rafael ?
> 
> That would be the simplest thing to do and possibly the most correct one too,
> but I need to double check how inte_idle/ACPI idle interactions depend on 
> that.

As I thought, the ordering between intel_idle and the ACPI processor driver's
idle part depends on the latter being modular.

Also I'm not sure if the core is at fault here.  It evidently expects that
drivers will be registered before devices, so the driver module cannot be
released as long as there are any active devices.  This sounds logical from
the correctness viewpoint.

However, from the usability viewpoint it is not very convenient.  It would be
nicer if the driver could be unregistered and registered while devices are
there, but I'm afraid that would require some code reorganization.

The $subject patch isn't a correct change anyway in my opinion, because it
tries to kind of sidestep the core's assumptions.

Thanks,
Rafael


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


Re: [PATCH] cpufreq: qoriq: optimize the CPU frequency switching time

2015-06-15 Thread Rafael J. Wysocki
On Thursday, June 04, 2015 12:06:36 PM Viresh Kumar wrote:
> On 04-06-15, 14:25, yuantian.t...@freescale.com wrote:
> > From: Tang Yuantian 
> > 
> > Each time the CPU switches its frequency, the clock nodes in
> > DTS are walked through to find proper clock source. This is
> > very time-consuming, for example, it is up to 500+ us on T4240.
> > Besides, switching time varies from clock to clock.
> > To optimize this, each input clock of CPU is buffered, so that
> > it can be picked up instantly when needed.
> > 
> > Since for each CPU each input clock is stored in a pointer
> > which takes 4 or 8 bytes memory and normally there are several
> > input clocks per CPU, that will not take much memory as well.
> 
> Not sure how it got included in this form in the first place. :)
> 
> > Signed-off-by: Tang Yuantian 
> > ---
> >  drivers/cpufreq/qoriq-cpufreq.c | 32 +---
> >  1 file changed, 21 insertions(+), 11 deletions(-)
> 
> Acked-by: Viresh Kumar 

Queued up for 4.2, thanks!


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH] tick/idle/powerpc: Do not register idle states with CPUIDLE_FLAG_TIMER_STOP set in periodic mode

2015-06-24 Thread Rafael J. Wysocki
On Wednesday, June 24, 2015 01:48:01 AM Preeti U Murthy wrote:
> On some archs, the local clockevent device stops in deep cpuidle states.
> The broadcast framework is used to wakeup cpus in these idle states, in
> which either an external clockevent device is used to send wakeup ipis
> or the hrtimer broadcast framework kicks in in the absence of such a
> device. One cpu is nominated as the broadcast cpu and this cpu sends
> wakeup ipis to sleeping cpus at the appropriate time. This is the
> implementation in the oneshot mode of broadcast.
> 
> In periodic mode of broadcast however, the presence of such cpuidle
> states results in the cpuidle driver calling tick_broadcast_enable()
> which shuts down the local clockevent devices of all the cpus and
> appoints the tick broadcast device as the clockevent device for each of
> them. This works on those archs where the tick broadcast device is a
> real clockevent device.  But on archs which depend on the hrtimer mode
> of broadcast, the tick broadcast device hapens to be a pseudo device.
> The consequence is that the local clockevent devices of all cpus are
> shutdown and the kernel hangs at boot time in periodic mode.
> 
> Let us thus not register the cpuidle states which have
> CPUIDLE_FLAG_TIMER_STOP flag set, on archs which depend on the hrtimer
> mode of broadcast in periodic mode. This patch takes care of doing this
> on powerpc. The cpus would not have entered into such deep cpuidle
> states in periodic mode on powerpc anyway. So there is no loss here.
> 
> Signed-off-by: Preeti U Murthy 

4.2 material I suppose?

> ---
>  drivers/cpuidle/cpuidle-powernv.c |   15 ---
>  1 file changed, 12 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/cpuidle/cpuidle-powernv.c 
> b/drivers/cpuidle/cpuidle-powernv.c
> index 5937207..3442764 100644
> --- a/drivers/cpuidle/cpuidle-powernv.c
> +++ b/drivers/cpuidle/cpuidle-powernv.c
> @@ -60,6 +60,8 @@ static int nap_loop(struct cpuidle_device *dev,
>   return index;
>  }
>  
> +/* Register for fastsleep only in oneshot mode of broadcast */
> +#ifdef CONFIG_TICK_ONESHOT
>  static int fastsleep_loop(struct cpuidle_device *dev,
>   struct cpuidle_driver *drv,
>   int index)
> @@ -83,7 +85,7 @@ static int fastsleep_loop(struct cpuidle_device *dev,
>  
>   return index;
>  }
> -
> +#endif
>  /*
>   * States for dedicated partition case.
>   */
> @@ -209,7 +211,14 @@ static int powernv_add_idle_states(void)
>   powernv_states[nr_idle_states].flags = 0;
>   powernv_states[nr_idle_states].target_residency = 100;
>   powernv_states[nr_idle_states].enter = &nap_loop;
> - } else if (flags[i] & OPAL_PM_SLEEP_ENABLED ||
> + }
> +
> + /*
> +  * All cpuidle states with CPUIDLE_FLAG_TIMER_STOP set must come
> +  * within this config dependency check.
> +  */
> +#ifdef CONFIG_TICK_ONESHOT
> + if (flags[i] & OPAL_PM_SLEEP_ENABLED ||
>   flags[i] & OPAL_PM_SLEEP_ENABLED_ER1) {
>   /* Add FASTSLEEP state */
>   strcpy(powernv_states[nr_idle_states].name, 
> "FastSleep");
> @@ -218,7 +227,7 @@ static int powernv_add_idle_states(void)
>   powernv_states[nr_idle_states].target_residency = 
> 30;
>   powernv_states[nr_idle_states].enter = &fastsleep_loop;
>   }
> -
> +#endif
>   powernv_states[nr_idle_states].exit_latency =
>   ((unsigned int)latency_ns[i]) / 1000;
>  
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/

-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH] tick/idle/powerpc: Do not register idle states with CPUIDLE_FLAG_TIMER_STOP set in periodic mode

2015-06-24 Thread Rafael J. Wysocki
On Thu, Jun 25, 2015 at 12:10 AM, Michael Ellerman
 wrote:
>
>
> On 24 June 2015 23:50:40 GMT+10:00, "Rafael J. Wysocki"  
> wrote:
>>On Wednesday, June 24, 2015 01:48:01 AM Preeti U Murthy wrote:
>>> On some archs, the local clockevent device stops in deep cpuidle
>>states.
>>> The broadcast framework is used to wakeup cpus in these idle states,
>>in
>>> which either an external clockevent device is used to send wakeup
>>ipis
>>> or the hrtimer broadcast framework kicks in in the absence of such a
>>> device. One cpu is nominated as the broadcast cpu and this cpu sends
>>> wakeup ipis to sleeping cpus at the appropriate time. This is the
>>> implementation in the oneshot mode of broadcast.
>>>
>>> In periodic mode of broadcast however, the presence of such cpuidle
>>> states results in the cpuidle driver calling tick_broadcast_enable()
>>> which shuts down the local clockevent devices of all the cpus and
>>> appoints the tick broadcast device as the clockevent device for each
>>of
>>> them. This works on those archs where the tick broadcast device is a
>>> real clockevent device.  But on archs which depend on the hrtimer
>>mode
>>> of broadcast, the tick broadcast device hapens to be a pseudo device.
>>> The consequence is that the local clockevent devices of all cpus are
>>> shutdown and the kernel hangs at boot time in periodic mode.
>>>
>>> Let us thus not register the cpuidle states which have
>>> CPUIDLE_FLAG_TIMER_STOP flag set, on archs which depend on the
>>hrtimer
>>> mode of broadcast in periodic mode. This patch takes care of doing
>>this
>>> on powerpc. The cpus would not have entered into such deep cpuidle
>>> states in periodic mode on powerpc anyway. So there is no loss here.
>>>
>>> Signed-off-by: Preeti U Murthy 
>>
>>4.2 material I suppose?
>
> Yes please, in fact it should be marked for stable too.

That can be done.  Which -stable series should it go into?

Rafael
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH] tick/idle/powerpc: Do not register idle states with CPUIDLE_FLAG_TIMER_STOP set in periodic mode

2015-06-24 Thread Rafael J. Wysocki
On Thu, Jun 25, 2015 at 12:06 AM, Benjamin Herrenschmidt
 wrote:
> On Wed, 2015-06-24 at 15:50 +0200, Rafael J. Wysocki wrote:
>> 4.2 material I suppose?
>
> And stable. Without this, if you configure without TICK_ONESHOT, the
> machine will hang.

OK, which -stable?  All of them or any specific series?

Rafael
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH] tick/idle/powerpc: Do not register idle states with CPUIDLE_FLAG_TIMER_STOP set in periodic mode

2015-06-26 Thread Rafael J. Wysocki
On Thursday, June 25, 2015 09:05:49 AM Preeti U Murthy wrote:
> On 06/25/2015 05:36 AM, Rafael J. Wysocki wrote:
> > On Thu, Jun 25, 2015 at 12:06 AM, Benjamin Herrenschmidt
> >  wrote:
> >> On Wed, 2015-06-24 at 15:50 +0200, Rafael J. Wysocki wrote:
> >>> 4.2 material I suppose?
> >>
> >> And stable. Without this, if you configure without TICK_ONESHOT, the
> >> machine will hang.
> > 
> > OK, which -stable?  All of them or any specific series?
> 
> This needs to go into stable/linux-3.19.y,
> stable/linux-4.0.y, stable/linux-4.1.y.

So essentially 3.19+.  OK, thanks!


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [GIT PULL] DT/core: cpu_ofnode updates for v3.12

2013-08-13 Thread Rafael J. Wysocki
On Tuesday, August 13, 2013 01:44:23 PM Rob Herring wrote:
> On Tue, Aug 13, 2013 at 10:40 AM, Sudeep KarkadaNagesha
>  wrote:
> > Adding PowerPC list
> >
> > On 13/08/13 14:00, Rafael J. Wysocki wrote:
> >> On Monday, August 12, 2013 02:27:47 PM Sudeep KarkadaNagesha wrote:
> >>> The following changes since commit
> >>> d4e4ab86bcba5a72779c43dc1459f71fea3d89c8:
> >>>
> >>> Linux 3.11-rc5 (2013-08-11 18:04:20 -0700)
> >>>
> >>> are available in the git repository at:
> >>>
> >>> git://linux-arm.org/linux-skn.git cpu_of_node
> 
> [snip]
> 
> >> All error/warnings:
> >>
> >> warning: (MPC836x_RDK && MTD_NAND_FSL_ELBC && MTD_NAND_FSL_UPM)
> >> selects FSL_LBC which has unmet direct dependencies (FSL_SOC)
> >> warning: (MPC836x_RDK && MTD_NAND_FSL_ELBC && MTD_NAND_FSL_UPM)
> >> selects FSL_LBC which has unmet direct dependencies (FSL_SOC)
> >> In file included from arch/powerpc/include/asm/kvm_para.h:26:0, from
> >> include/uapi/linux/kvm_para.h:26, from include/linux/kvm_para.h:4,
> >> from include/linux/kvm_host.h:30, from
> >> arch/powerpc/kernel/asm-offsets.c:53:
> >> include/linux/of.h:269:28: error: conflicting types for
> >>'of_get_cpu_node'
> >> extern struct device_node *of_get_cpu_node(int cpu); ^ In file
> >> included from include/linux/of.h:139:0, from
> >> arch/powerpc/include/asm/kvm_para.h:26, from
> >> include/uapi/linux/kvm_para.h:26, from include/linux/kvm_para.h:4,
> >> from include/linux/kvm_host.h:30, from
> >> arch/powerpc/kernel/asm-offsets.c:53:
> >> arch/powerpc/include/asm/prom.h:47:21: note: previous declaration
> >> of 'of_get_cpu_node' was here
> >> struct device_node *of_get_cpu_node(int cpu, unsigned int *thread);
> >> ^ make[2]: *** [arch/powerpc/kernel/asm-offsets.s] Error 1 make[2]:
> >> Target `__build' not remade because of errors. make[1]: ***
> >> [prepare0] Error 2 make[1]: Target `prepare' not remade because of
> >> errors. make: *** [sub-make] Error 2
> >>
> >
> > There seems to be conflict in the new function "of_get_cpu_node" added.
> > PowerPC also defines the same function name. Further microblaze and
> > openrisc declares it(can be removed) but doesn't define it.
> > To fix this:
> > 1. I can rename the newly added function to something different like
> >`of_get_cpunode` or
> > 2. If of_* namespace should be used by only OF/FDT and not by any
> >architecture specific code, then the arch specific version can be
> >renamed to some thing like arch_of_get_cpu_node.
> >Also most of the calls to arch specific function can be moved to
> >generic code.
> >
> > Let me know your thoughts.
> 
> It is up to Rafael if he is willing/able to rebase his tree, but I
> would drop this series until this is sorted out.

Yeah, I've just done that.

> I think the new common function should be and can be generalized to work for
> powerpc.
> It would need to make reg property optional and pass in the device
> node to the arch specific function.
> 
> A short term solution would be just to make the function "#ifndef CONFIG_PPC".

I wouldn't do that, it's almost guaranteed to be messy going forward.

I'd go for 1 above personally.

Thanks,
Rafael

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


Re: [PATCH 1/4] hotplug, x86: Fix online state in cpu0 debug interface

2013-08-17 Thread Rafael J. Wysocki
On Saturday, August 17, 2013 01:46:56 PM Toshi Kani wrote:
> _debug_hotplug_cpu() is a debug interface that puts cpu0 offline during
> boot-up when CONFIG_DEBUG_HOTPLUG_CPU0 is set.  After cpu0 is put offline
> in this interface, however, /sys/devices/system/cpu/cpu0/online still
> shows 1 (online).
> 
> This patch fixes _debug_hotplug_cpu() to update dev->offline when CPU
> online/offline operation succeeded.
> 
> Signed-off-by: Toshi Kani 

Acked-by: Rafael J. Wysocki 

> ---
>  arch/x86/kernel/topology.c |7 +--
>  1 file changed, 5 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/x86/kernel/topology.c b/arch/x86/kernel/topology.c
> index 6e60b5f..5823bbd 100644
> --- a/arch/x86/kernel/topology.c
> +++ b/arch/x86/kernel/topology.c
> @@ -72,16 +72,19 @@ int __ref _debug_hotplug_cpu(int cpu, int action)
>   ret = cpu_down(cpu);
>   if (!ret) {
>   pr_info("CPU %u is now offline\n", cpu);
> + dev->offline = true;
>   kobject_uevent(&dev->kobj, KOBJ_OFFLINE);
>   } else
>   pr_debug("Can't offline CPU%d.\n", cpu);
>   break;
>   case 1:
>   ret = cpu_up(cpu);
> - if (!ret)
> + if (!ret) {
> + dev->offline = false;
>   kobject_uevent(&dev->kobj, KOBJ_ONLINE);
> - else
> + } else {
>   pr_debug("Can't online CPU%d.\n", cpu);
> + }
>   break;
>   default:
>   ret = -EINVAL;
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


Re: [PATCH 3/4] hotplug, x86: Disable ARCH_CPU_PROBE_RELEASE on x86

2013-08-17 Thread Rafael J. Wysocki
On Saturday, August 17, 2013 01:46:58 PM Toshi Kani wrote:
> Commit d7c53c9e enabled ARCH_CPU_PROBE_RELEASE on x86 in order to
> serialize CPU online/offline operations.  Although it is the config
> option to enable CPU hotplug test interfaces, probe & release, it is
> also the option to enable cpu_hotplug_driver_lock() as well.  Therefore,
> this option had to be enabled on x86 with dummy arch_cpu_probe() and
> arch_cpu_release().
> 
> Since then, lock_device_hotplug() was introduced to serialize CPU
> online/offline & hotplug operations.  Therefore, this config option
> is no longer required for the serialization.  This patch disables
> this config option on x86 and revert the changes made by commit
> d7c53c9e.
> 
> Signed-off-by: Toshi Kani 

Acked-by: Rafael J. Wysocki 

> ---
>  arch/x86/Kconfig  |4 
>  arch/x86/kernel/smpboot.c |   21 -
>  2 files changed, 25 deletions(-)
> 
> diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
> index b32ebf9..c87e49a 100644
> --- a/arch/x86/Kconfig
> +++ b/arch/x86/Kconfig
> @@ -255,10 +255,6 @@ config ARCH_HWEIGHT_CFLAGS
>   default "-fcall-saved-ecx -fcall-saved-edx" if X86_32
>   default "-fcall-saved-rdi -fcall-saved-rsi -fcall-saved-rdx 
> -fcall-saved-rcx -fcall-saved-r8 -fcall-saved-r9 -fcall-saved-r10 
> -fcall-saved-r11" if X86_64
>  
> -config ARCH_CPU_PROBE_RELEASE
> - def_bool y
> - depends on HOTPLUG_CPU
> -
>  config ARCH_SUPPORTS_UPROBES
>   def_bool y
>  
> diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
> index aecc98a..5b24a9d 100644
> --- a/arch/x86/kernel/smpboot.c
> +++ b/arch/x86/kernel/smpboot.c
> @@ -82,27 +82,6 @@
>  /* State of each CPU */
>  DEFINE_PER_CPU(int, cpu_state) = { 0 };
>  
> -#ifdef CONFIG_HOTPLUG_CPU
> -/*
> - * We need this for trampoline_base protection from concurrent accesses when
> - * off- and onlining cores wildly.
> - */
> -static DEFINE_MUTEX(x86_cpu_hotplug_driver_mutex);
> -
> -void cpu_hotplug_driver_lock(void)
> -{
> - mutex_lock(&x86_cpu_hotplug_driver_mutex);
> -}
> -
> -void cpu_hotplug_driver_unlock(void)
> -{
> - mutex_unlock(&x86_cpu_hotplug_driver_mutex);
> -}
> -
> -ssize_t arch_cpu_probe(const char *buf, size_t count) { return -1; }
> -ssize_t arch_cpu_release(const char *buf, size_t count) { return -1; }
> -#endif
> -
>  /* Number of siblings per CPU package */
>  int smp_num_siblings = 1;
>  EXPORT_SYMBOL(smp_num_siblings);
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


Re: [PATCH 2/4] hotplug, x86: Add hotplug lock to missing places

2013-08-17 Thread Rafael J. Wysocki
On Saturday, August 17, 2013 01:46:57 PM Toshi Kani wrote:
> lock_device_hotplug() serializes CPU & Memory online/offline and hotplug
> operations.  However, this lock is not held in the debug interfaces below
> that initiate CPU online/offline operations.
> 
>  - _debug_hotplug_cpu(), cpu0 hotplug test interface enabled by
>CONFIG_DEBUG_HOTPLUG_CPU0.
>  - cpu_probe_store() and cpu_release_store(), cpu hotplug test interface
>enabled by CONFIG_ARCH_CPU_PROBE_RELEASE.
> 
> This patch changes the above interfaces to hold lock_device_hotplug().
> 
> Signed-off-by: Toshi Kani 

Acked-by: Rafael J. Wysocki 

> ---
>  arch/x86/kernel/topology.c |2 ++
>  drivers/base/cpu.c |   16 ++--
>  2 files changed, 16 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/x86/kernel/topology.c b/arch/x86/kernel/topology.c
> index 5823bbd..a3f35eb 100644
> --- a/arch/x86/kernel/topology.c
> +++ b/arch/x86/kernel/topology.c
> @@ -65,6 +65,7 @@ int __ref _debug_hotplug_cpu(int cpu, int action)
>   if (!cpu_is_hotpluggable(cpu))
>   return -EINVAL;
>  
> + lock_device_hotplug();
>   cpu_hotplug_driver_lock();
>  
>   switch (action) {
> @@ -91,6 +92,7 @@ int __ref _debug_hotplug_cpu(int cpu, int action)
>   }
>  
>   cpu_hotplug_driver_unlock();
> + unlock_device_hotplug();
>  
>   return ret;
>  }
> diff --git a/drivers/base/cpu.c b/drivers/base/cpu.c
> index 4c358bc..4cc6928 100644
> --- a/drivers/base/cpu.c
> +++ b/drivers/base/cpu.c
> @@ -88,7 +88,13 @@ static ssize_t cpu_probe_store(struct device *dev,
>  const char *buf,
>  size_t count)
>  {
> - return arch_cpu_probe(buf, count);
> + ssize_t ret;
> +
> + lock_device_hotplug();
> + ret = arch_cpu_probe(buf, count);
> + unlock_device_hotplug();
> +
> + return ret;
>  }
>  
>  static ssize_t cpu_release_store(struct device *dev,
> @@ -96,7 +102,13 @@ static ssize_t cpu_release_store(struct device *dev,
>const char *buf,
>size_t count)
>  {
> - return arch_cpu_release(buf, count);
> + ssize_t ret;
> +
> + lock_device_hotplug();
> + ret = arch_cpu_release(buf, count);
> + unlock_device_hotplug();
> +
> + return ret;
>  }
>  
>  static DEVICE_ATTR(probe, S_IWUSR, NULL, cpu_probe_store);
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


Re: [PATCH 4/4] hotplug, powerpc, x86: Remove cpu_hotplug_driver_lock()

2013-08-17 Thread Rafael J. Wysocki
On Saturday, August 17, 2013 01:46:59 PM Toshi Kani wrote:
> cpu_hotplug_driver_lock() serializes CPU online/offline operations
> when ARCH_CPU_PROBE_RELEASE is set.  This lock interface is no longer
> necessary with the following reason:
> 
>  - lock_device_hotplug() now protects CPU online/offline operations,
>including the probe & release interfaces enabled by
>ARCH_CPU_PROBE_RELEASE.  The use of cpu_hotplug_driver_lock() is
>redundant.
>  - cpu_hotplug_driver_lock() is only valid when ARCH_CPU_PROBE_RELEASE
>is defined, which is misleading and is only enabled on powerpc.
> 
> This patch removes the cpu_hotplug_driver_lock() interface.  As
> a result, ARCH_CPU_PROBE_RELEASE only enables / disables the cpu
> probe & release interface as intended.  There is no functional change
> in this patch.
> 
> Signed-off-by: Toshi Kani 

Acked-by: Rafael J. Wysocki 

> ---
> Performed build test only on powerpc.
> ---
>  arch/powerpc/kernel/smp.c  |   12 --
>  arch/powerpc/platforms/pseries/dlpar.c |   40 
> 
>  arch/x86/kernel/topology.c |2 --
>  drivers/base/cpu.c |   10 +---
>  include/linux/cpu.h|   13 --
>  5 files changed, 16 insertions(+), 61 deletions(-)
> 
> diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
> index 38b0ba6..1667269 100644
> --- a/arch/powerpc/kernel/smp.c
> +++ b/arch/powerpc/kernel/smp.c
> @@ -763,18 +763,6 @@ void __cpu_die(unsigned int cpu)
>   smp_ops->cpu_die(cpu);
>  }
>  
> -static DEFINE_MUTEX(powerpc_cpu_hotplug_driver_mutex);
> -
> -void cpu_hotplug_driver_lock()
> -{
> - mutex_lock(&powerpc_cpu_hotplug_driver_mutex);
> -}
> -
> -void cpu_hotplug_driver_unlock()
> -{
> - mutex_unlock(&powerpc_cpu_hotplug_driver_mutex);
> -}
> -
>  void cpu_die(void)
>  {
>   if (ppc_md.cpu_die)
> diff --git a/arch/powerpc/platforms/pseries/dlpar.c 
> b/arch/powerpc/platforms/pseries/dlpar.c
> index a1a7b9a..e39325d 100644
> --- a/arch/powerpc/platforms/pseries/dlpar.c
> +++ b/arch/powerpc/platforms/pseries/dlpar.c
> @@ -387,18 +387,13 @@ static ssize_t dlpar_cpu_probe(const char *buf, size_t 
> count)
>   char *cpu_name;
>   int rc;
>  
> - cpu_hotplug_driver_lock();
>   rc = strict_strtoul(buf, 0, &drc_index);
> - if (rc) {
> - rc = -EINVAL;
> - goto out;
> - }
> + if (rc)
> + return -EINVAL;
>  
>   dn = dlpar_configure_connector(drc_index);
> - if (!dn) {
> - rc = -EINVAL;
> - goto out;
> - }
> + if (!dn)
> + return -EINVAL;
>  
>   /* configure-connector reports cpus as living in the base
>* directory of the device tree.  CPUs actually live in the
> @@ -407,8 +402,7 @@ static ssize_t dlpar_cpu_probe(const char *buf, size_t 
> count)
>   cpu_name = kasprintf(GFP_KERNEL, "/cpus%s", dn->full_name);
>   if (!cpu_name) {
>   dlpar_free_cc_nodes(dn);
> - rc = -ENOMEM;
> - goto out;
> + return -ENOMEM;
>   }
>  
>   kfree(dn->full_name);
> @@ -417,22 +411,21 @@ static ssize_t dlpar_cpu_probe(const char *buf, size_t 
> count)
>   rc = dlpar_acquire_drc(drc_index);
>   if (rc) {
>   dlpar_free_cc_nodes(dn);
> - rc = -EINVAL;
> - goto out;
> + return -EINVAL;
>   }
>  
>   rc = dlpar_attach_node(dn);
>   if (rc) {
>   dlpar_release_drc(drc_index);
>   dlpar_free_cc_nodes(dn);
> - goto out;
> + return rc;
>   }
>  
>   rc = dlpar_online_cpu(dn);
> -out:
> - cpu_hotplug_driver_unlock();
> + if (rc)
> + return rc;
>  
> - return rc ? rc : count;
> + return count;
>  }
>  
>  static int dlpar_offline_cpu(struct device_node *dn)
> @@ -505,30 +498,27 @@ static ssize_t dlpar_cpu_release(const char *buf, 
> size_t count)
>   return -EINVAL;
>   }
>  
> - cpu_hotplug_driver_lock();
>   rc = dlpar_offline_cpu(dn);
>   if (rc) {
>   of_node_put(dn);
> - rc = -EINVAL;
> - goto out;
> + return -EINVAL;
>   }
>  
>   rc = dlpar_release_drc(*drc_index);
>   if (rc) {
>   of_node_put(dn);
> - goto out;
> + return rc;
>   }
>  
>   rc = dlpar_detach_node(dn);
>   if (rc) {
>   dlpar_acquir

Re: [PATCH 0/4] Unify CPU hotplug lock interface

2013-08-17 Thread Rafael J. Wysocki
On Saturday, August 17, 2013 01:46:55 PM Toshi Kani wrote:
> lock_device_hotplug() was recently introduced to serialize CPU & Memory
> online/offline and hotplug operations, along with sysfs online interface
> restructure (commit 4f3549d7).  With this new locking scheme,
> cpu_hotplug_driver_lock() is redundant and is no longer necessary.
> 
> This patchset makes sure that lock_device_hotplug() covers all CPU online/
> offline interfaces, and then removes cpu_hotplug_driver_lock().
> 
> The patchset is based on Linus's tree, 3.11.0-rc5.

Nice series, thanks a lot for taking care of this!

Rafael


> ---
> Toshi Kani (4):
>   hotplug, x86: Fix online state in cpu0 debug interface
>   hotplug, x86: Add hotplug lock to missing places
>   hotplug, x86: Disable ARCH_CPU_PROBE_RELEASE on x86
>   hotplug, powerpc, x86: Remove cpu_hotplug_driver_lock()
> 
> ---
>  arch/powerpc/kernel/smp.c  | 12 --
>  arch/powerpc/platforms/pseries/dlpar.c | 40 
> +-
>  arch/x86/Kconfig   |  4 
>  arch/x86/kernel/smpboot.c  | 21 --
>  arch/x86/kernel/topology.c | 11 ++
>  drivers/base/cpu.c | 26 --
>  include/linux/cpu.h| 13 ---
>  7 files changed, 37 insertions(+), 90 deletions(-)
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


Re: [PATCH] i2c: move of helpers into the core

2013-08-19 Thread Rafael J. Wysocki
config OF_I2C
> - def_tristate I2C
> - depends on I2C
> - help
> -   OpenFirmware I2C accessors
> -
>  config OF_NET
>   depends on NETDEVICES
>   def_bool y
> diff --git a/drivers/of/Makefile b/drivers/of/Makefile
> index 1f9c0c4..efd0510 100644
> --- a/drivers/of/Makefile
> +++ b/drivers/of/Makefile
> @@ -3,7 +3,6 @@ obj-$(CONFIG_OF_FLATTREE) += fdt.o
>  obj-$(CONFIG_OF_PROMTREE) += pdt.o
>  obj-$(CONFIG_OF_ADDRESS)  += address.o
>  obj-$(CONFIG_OF_IRQ)+= irq.o
> -obj-$(CONFIG_OF_I2C) += of_i2c.o
>  obj-$(CONFIG_OF_NET) += of_net.o
>  obj-$(CONFIG_OF_SELFTEST) += selftest.o
>  obj-$(CONFIG_OF_MDIO)+= of_mdio.o
> diff --git a/drivers/of/of_i2c.c b/drivers/of/of_i2c.c
> deleted file mode 100644
> index b667264..000
> --- a/drivers/of/of_i2c.c
> +++ /dev/null
> @@ -1,114 +0,0 @@
> -/*
> - * OF helpers for the I2C API
> - *
> - * Copyright (c) 2008 Jochen Friedrich 
> - *
> - * Based on a previous patch from Jon Smirl 
> - *
> - * This program is free software; you can redistribute it and/or modify
> - * it under the terms of the GNU General Public License as published by
> - * the Free Software Foundation; either version 2 of the License, or
> - * (at your option) any later version.
> - */
> -
> -#include 
> -#include 
> -#include 
> -#include 
> -#include 
> -#include 
> -
> -void of_i2c_register_devices(struct i2c_adapter *adap)
> -{
> - void *result;
> - struct device_node *node;
> -
> - /* Only register child devices if the adapter has a node pointer set */
> - if (!adap->dev.of_node)
> - return;
> -
> - dev_dbg(&adap->dev, "of_i2c: walking child nodes\n");
> -
> - for_each_available_child_of_node(adap->dev.of_node, node) {
> - struct i2c_board_info info = {};
> - struct dev_archdata dev_ad = {};
> - const __be32 *addr;
> - int len;
> -
> - dev_dbg(&adap->dev, "of_i2c: register %s\n", node->full_name);
> -
> - if (of_modalias_node(node, info.type, sizeof(info.type)) < 0) {
> - dev_err(&adap->dev, "of_i2c: modalias failure on %s\n",
> - node->full_name);
> - continue;
> - }
> -
> - addr = of_get_property(node, "reg", &len);
> - if (!addr || (len < sizeof(int))) {
> - dev_err(&adap->dev, "of_i2c: invalid reg on %s\n",
> - node->full_name);
> - continue;
> - }
> -
> - info.addr = be32_to_cpup(addr);
> - if (info.addr > (1 << 10) - 1) {
> - dev_err(&adap->dev, "of_i2c: invalid addr=%x on %s\n",
> - info.addr, node->full_name);
> - continue;
> - }
> -
> - info.irq = irq_of_parse_and_map(node, 0);
> - info.of_node = of_node_get(node);
> - info.archdata = &dev_ad;
> -
> - if (of_get_property(node, "wakeup-source", NULL))
> - info.flags |= I2C_CLIENT_WAKE;
> -
> - request_module("%s%s", I2C_MODULE_PREFIX, info.type);
> -
> - result = i2c_new_device(adap, &info);
> - if (result == NULL) {
> - dev_err(&adap->dev, "of_i2c: Failure registering %s\n",
> - node->full_name);
> - of_node_put(node);
> - irq_dispose_mapping(info.irq);
> - continue;
> - }
> - }
> -}
> -EXPORT_SYMBOL(of_i2c_register_devices);
> -
> -static int of_dev_node_match(struct device *dev, void *data)
> -{
> -return dev->of_node == data;
> -}
> -
> -/* must call put_device() when done with returned i2c_client device */
> -struct i2c_client *of_find_i2c_device_by_node(struct device_node *node)
> -{
> - struct device *dev;
> -
> - dev = bus_find_device(&i2c_bus_type, NULL, node,
> -  of_dev_node_match);
> - if (!dev)
> - return NULL;
> -
> - return i2c_verify_client(dev);
> -}
> -EXPORT_SYMBOL(of_find_i2c_device_by_node);
> -
> -/* must call put_device() when done with returned i2c_adapter device */
> -struct i2c_adapter *of_find_i2c_adapter_by_node(struct device_node *node)
> -{
> - struct device *dev;
> -
> - dev = bus_find_device(&i2c_bus_type, NULL, node,
> -  of_dev_node_match);
> - if (!dev)
> - return NULL;
> -
> - return i2c_verify_adapter(dev);
> -}
> -EXPORT_SYMBOL(of_find_i2c_adapter_by_node);
> -
> -MODULE_LICENSE("GPL");
> diff --git a/include/linux/i2c.h b/include/linux/i2c.h
> index e988fa9..2189189 100644
> --- a/include/linux/i2c.h
> +++ b/include/linux/i2c.h
> @@ -542,6 +542,26 @@ static inline int i2c_adapter_id(struct i2c_adapter 
> *adap)
>  
>  #endif /* I2C */
>  
> +#if IS_ENABLED(CONFIG_OF)
> +/* must call put_device() when done with returned i2c_client device */
> +extern struct i2c_client *of_find_i2c_device_by_node(struct device_node 
> *node);
> +
> +/* must call put_device() when done with returned i2c_adapter device */
> +extern struct i2c_adapter *of_find_i2c_adapter_by_node(struct device_node 
> *node);
> +
> +#else
> +
> +static inline struct i2c_client *of_find_i2c_device_by_node(struct 
> device_node *node)
> +{
> + return NULL;
> +}
> +
> +static inline struct i2c_adapter *of_find_i2c_adapter_by_node(struct 
> device_node *node)
> +{
> + return NULL;
> +}
> +#endif /* CONFIG_OF */
> +
>  #if IS_ENABLED(CONFIG_ACPI_I2C)
>  extern void acpi_i2c_register_devices(struct i2c_adapter *adap);
>  #else
> diff --git a/include/linux/of_i2c.h b/include/linux/of_i2c.h
> deleted file mode 100644
> index cfb545c..000
> --- a/include/linux/of_i2c.h
> +++ /dev/null
> @@ -1,46 +0,0 @@
> -/*
> - * Generic I2C API implementation for PowerPC.
> - *
> - * Copyright (c) 2008 Jochen Friedrich 
> - *
> - * This program is free software; you can redistribute it and/or modify
> - * it under the terms of the GNU General Public License as published by
> - * the Free Software Foundation; either version 2 of the License, or
> - * (at your option) any later version.
> - */
> -
> -#ifndef __LINUX_OF_I2C_H
> -#define __LINUX_OF_I2C_H
> -
> -#if defined(CONFIG_OF_I2C) || defined(CONFIG_OF_I2C_MODULE)
> -#include 
> -
> -extern void of_i2c_register_devices(struct i2c_adapter *adap);
> -
> -/* must call put_device() when done with returned i2c_client device */
> -extern struct i2c_client *of_find_i2c_device_by_node(struct device_node 
> *node);
> -
> -/* must call put_device() when done with returned i2c_adapter device */
> -extern struct i2c_adapter *of_find_i2c_adapter_by_node(
> - struct device_node *node);
> -
> -#else
> -static inline void of_i2c_register_devices(struct i2c_adapter *adap)
> -{
> - return;
> -}
> -
> -static inline struct i2c_client *of_find_i2c_device_by_node(struct 
> device_node *node)
> -{
> - return NULL;
> -}
> -
> -/* must call put_device() when done with returned i2c_adapter device */
> -static inline struct i2c_adapter *of_find_i2c_adapter_by_node(
> - struct device_node *node)
> -{
> - return NULL;
> -}
> -#endif /* CONFIG_OF_I2C */
> -
> -#endif /* __LINUX_OF_I2C_H */
> 
-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


Re: [PATCH v4 03/19] powerpc: refactor of_get_cpu_node to support other architectures

2013-08-20 Thread Rafael J. Wysocki
   if (thread)
> - *thread = t;
> -             return np;
> - }
> - }
> - }
> + if (__of_find_n_match_cpu_property(cpun,
> + "ibm,ppc-interrupt-server#s", cpu, thread))
> + return cpun;
> +
> + if (__of_find_n_match_cpu_property(cpun, "reg", cpu, thread))
> + return cpun;
>   }
>   return NULL;
>  }
> 
-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


Re: [PATCH v4 06/19] driver/core: cpu: initialize of_node in cpu's device struture

2013-08-20 Thread Rafael J. Wysocki
On Tuesday, August 20, 2013 10:30:08 AM Sudeep KarkadaNagesha wrote:
> From: Sudeep KarkadaNagesha 
> 
> CPUs are also registered as devices but the of_node in these cpu
> devices are not initialized. Currently different drivers requiring
> to access cpu device node are parsing the nodes themselves and
> initialising the of_node in cpu device.
> 
> The of_node in all the cpu devices needs to be initialized properly
> and at one place. The best place to update this is CPU subsystem
> driver when registering the cpu devices.
> 
> The OF/DT core library now provides of_get_cpu_node to retrieve a cpu
> device node for a given logical index by abstracting the architecture
> specific details.
> 
> This patch uses of_get_cpu_node to assign of_node when registering the
> cpu devices.
> 
> Cc: Greg Kroah-Hartman 
> Acked-by: Rob Herring 
> Signed-off-by: Sudeep KarkadaNagesha 

Hi Greg,

I this one fine with you?

Rafael


> ---
>  drivers/base/cpu.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/drivers/base/cpu.c b/drivers/base/cpu.c
> index 4c358bc..4cf0717 100644
> --- a/drivers/base/cpu.c
> +++ b/drivers/base/cpu.c
> @@ -14,6 +14,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  
>  #include "base.h"
>  
> @@ -289,6 +290,7 @@ int register_cpu(struct cpu *cpu, int num)
>   cpu->dev.release = cpu_device_release;
>   cpu->dev.offline_disabled = !cpu->hotpluggable;
>   cpu->dev.offline = !cpu_online(num);
> + cpu->dev.of_node = of_get_cpu_node(num, NULL);
>  #ifdef CONFIG_ARCH_HAS_CPU_AUTOPROBE
>   cpu->dev.bus->uevent = arch_cpu_uevent;
>  #endif
> 
-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


Re: [GIT PULL v2] DT/core: cpu_ofnode updates for v3.12

2013-08-22 Thread Rafael J. Wysocki
On Thursday, August 22, 2013 02:57:56 PM Sudeep KarkadaNagesha wrote:
> Hi Rafael,
> 
> Here is the v2 of the pull request for cpu of_node updates for v3.12
> It includes ACK for all the new changes since v1(mainly from Ben for
> PPC). Currently there's trivial conflict with today's linux-next in 3
> files. Let me know if you need me to rebase this on any particular
> branch if needed.
> 
> Regards,
> Sudeep
> 
> The following changes since commit b36f4be3de1b123d8601de062e7dbfc904f305fb:
> 
>   Linux 3.11-rc6 (2013-08-18 14:36:53 -0700)
> 
> are available in the git repository at:
> 
>   git://linux-arm.org/linux-skn.git cpu_of_node
> 
> for you to fetch changes up to 1037b2752345cc5666e90b711a913ab2ae6c5920:
> 
>   cpufreq: pmac32-cpufreq: remove device tree parsing for cpu nodes
> (2013-08-21 10:29:56 +0100)
> 
> 
> Sudeep KarkadaNagesha (19):
>microblaze: remove undefined of_get_cpu_node declaration
>openrisc: remove undefined of_get_cpu_node declaration
>powerpc: refactor of_get_cpu_node to support other architectures
>of: move of_get_cpu_node implementation to DT core library
>ARM: DT/kernel: define ARM specific arch_match_cpu_phys_id
>driver/core: cpu: initialize of_node in cpu's device struture
>of/device: add helper to get cpu device node from logical cpu index
>ARM: topology: remove hwid/MPIDR dependency from cpu_capacity
>ARM: mvebu: remove device tree parsing for cpu nodes
>drivers/bus: arm-cci: avoid parsing DT for cpu device nodes
>cpufreq: imx6q-cpufreq: remove device tree parsing for cpu nodes
>cpufreq: cpufreq-cpu0: remove device tree parsing for cpu nodes
>cpufreq: highbank-cpufreq: remove device tree parsing for cpu nodes
>cpufreq: spear-cpufreq: remove device tree parsing for cpu nodes
>cpufreq: kirkwood-cpufreq: remove device tree parsing for cpu nodes
>cpufreq: arm_big_little: remove device tree parsing for cpu nodes
>cpufreq: maple-cpufreq: remove device tree parsing for cpu nodes
>cpufreq: pmac64-cpufreq: remove device tree parsing for cpu nodes
>cpufreq: pmac32-cpufreq: remove device tree parsing for cpu nodes
> 
>  arch/arm/kernel/devtree.c   |  5 ++
>  arch/arm/kernel/topology.c  | 61 +---
>  arch/arm/mach-imx/mach-imx6q.c  |  3 +-
>  arch/arm/mach-mvebu/platsmp.c   | 51 ++---
>  arch/microblaze/include/asm/prom.h  |  3 -
>  arch/openrisc/include/asm/prom.h|  3 -
>  arch/powerpc/include/asm/prom.h |  3 -
>  arch/powerpc/kernel/prom.c  | 43 +---
>  drivers/base/cpu.c  |  2 +
>  drivers/bus/arm-cci.c   | 28 ++--
>  drivers/cpufreq/arm_big_little_dt.c | 40 ---
>  drivers/cpufreq/cpufreq-cpu0.c  | 23 +-
>  drivers/cpufreq/highbank-cpufreq.c  | 18 ++---
>  drivers/cpufreq/imx6q-cpufreq.c |  4 +-
>  drivers/cpufreq/kirkwood-cpufreq.c  |  8 ++-
>  drivers/cpufreq/maple-cpufreq.c | 23 +-
>  drivers/cpufreq/pmac32-cpufreq.c|  5 +-
>  drivers/cpufreq/pmac64-cpufreq.c| 47 +++-
>  drivers/cpufreq/spear-cpufreq.c |  4 +-
>  drivers/of/base.c   | 95 
>  include/linux/cpu.h |  1 +
>  include/linux/of.h  |  7 ++
>  include/linux/of_device.h   | 15 
>  23 files changed, 226 insertions(+), 266 deletions(-)

Pulled, thanks!

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


Re: [PATCH 0/4] Unify CPU hotplug lock interface

2013-08-29 Thread Rafael J. Wysocki
On Thursday, August 29, 2013 11:15:10 AM Toshi Kani wrote:
> On Sun, 2013-08-18 at 03:02 +0200, Rafael J. Wysocki wrote:
> > On Saturday, August 17, 2013 01:46:55 PM Toshi Kani wrote:
> > > lock_device_hotplug() was recently introduced to serialize CPU & Memory
> > > online/offline and hotplug operations, along with sysfs online interface
> > > restructure (commit 4f3549d7).  With this new locking scheme,
> > > cpu_hotplug_driver_lock() is redundant and is no longer necessary.
> > > 
> > > This patchset makes sure that lock_device_hotplug() covers all CPU online/
> > > offline interfaces, and then removes cpu_hotplug_driver_lock().
> > > 
> > > The patchset is based on Linus's tree, 3.11.0-rc5.
> > 
> > Nice series, thanks a lot for taking care of this!
> 
> Hi Rafael,
> 
> Per the recent your changes in lock_device_hotplug(), do you think it
> makes sense to integrate this patchset into your tree?  I am also
> considering to add one more patch to use lock_device_hotplug_sysfs() in
> cpu_probe_store().  I will rebase to your tree and send them today if it
> makes sense to you.

Yes, it does to me.

Thanks,
Rafael


> > > ---
> > > Toshi Kani (4):
> > >   hotplug, x86: Fix online state in cpu0 debug interface
> > >   hotplug, x86: Add hotplug lock to missing places
> > >   hotplug, x86: Disable ARCH_CPU_PROBE_RELEASE on x86
> > >   hotplug, powerpc, x86: Remove cpu_hotplug_driver_lock()
> > > 
> > > ---
> > >  arch/powerpc/kernel/smp.c  | 12 --
> > >  arch/powerpc/platforms/pseries/dlpar.c | 40 
> > > +-
> > >  arch/x86/Kconfig   |  4 
> > >  arch/x86/kernel/smpboot.c  | 21 --
> > >  arch/x86/kernel/topology.c | 11 ++
> > >  drivers/base/cpu.c | 26 --
> > >  include/linux/cpu.h| 13 ---
> > >  7 files changed, 37 insertions(+), 90 deletions(-)
> > > --
> > > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> > > the body of a message to majord...@vger.kernel.org
> > > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> > > Please read the FAQ at  http://www.tux.org/lkml/
> 
> 
-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


Re: [PATCH v2 0/4] Unify CPU hotplug lock interface

2013-08-30 Thread Rafael J. Wysocki
On Friday, August 30, 2013 11:44:06 AM Yasuaki Ishimatsu wrote:
> (2013/08/30 9:22), Toshi Kani wrote:
> > lock_device_hotplug() was recently introduced to serialize CPU & Memory
> > online/offline and hotplug operations, along with sysfs online interface
> > restructure (commit 4f3549d7).  With this new locking scheme,
> > cpu_hotplug_driver_lock() is redundant and is no longer necessary.
> > 
> > This patchset makes sure that lock_device_hotplug() covers all CPU online/
> > offline interfaces, and then removes cpu_hotplug_driver_lock().
> > 
> > v2:
> >   - Rebased to the pm tree, bleeding-edge.
> >   - Changed patch 2/4 to use lock_device_hotplug_sysfs().
> > 
> > ---
> > Toshi Kani (4):
> >hotplug, x86: Fix online state in cpu0 debug interface
> >hotplug, x86: Add hotplug lock to missing places
> >hotplug, x86: Disable ARCH_CPU_PROBE_RELEASE on x86
> >hotplug, powerpc, x86: Remove cpu_hotplug_driver_lock()
> > 
> > ---
> The patch-set looks good to me.
> 
> Acked-by: Yasuaki Ishimatsu 

Thanks!  I'll tentatively queue up this series for 3.12 (for the second
half of the merge window).

Thanks,
Rafael

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


Re: [PATCH v2 1/4] hotplug, x86: Fix online state in cpu0 debug interface

2013-08-30 Thread Rafael J. Wysocki
Hi Peter,

Any objections here?

On Thursday, August 29, 2013 06:22:06 PM Toshi Kani wrote:
> _debug_hotplug_cpu() is a debug interface that puts cpu0 offline during
> boot-up when CONFIG_DEBUG_HOTPLUG_CPU0 is set.  After cpu0 is put offline
> in this interface, however, /sys/devices/system/cpu/cpu0/online still
> shows 1 (online).
> 
> This patch fixes _debug_hotplug_cpu() to update dev->offline when CPU
> online/offline operation succeeded.
> 
> Signed-off-by: Toshi Kani 
> Acked-by: Rafael J. Wysocki 
> ---
>  arch/x86/kernel/topology.c |7 +--
>  1 file changed, 5 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/x86/kernel/topology.c b/arch/x86/kernel/topology.c
> index 6e60b5f..5823bbd 100644
> --- a/arch/x86/kernel/topology.c
> +++ b/arch/x86/kernel/topology.c
> @@ -72,16 +72,19 @@ int __ref _debug_hotplug_cpu(int cpu, int action)
>   ret = cpu_down(cpu);
>   if (!ret) {
>   pr_info("CPU %u is now offline\n", cpu);
> + dev->offline = true;
>   kobject_uevent(&dev->kobj, KOBJ_OFFLINE);
>   } else
>   pr_debug("Can't offline CPU%d.\n", cpu);
>   break;
>   case 1:
>   ret = cpu_up(cpu);
> - if (!ret)
> + if (!ret) {
> + dev->offline = false;
>   kobject_uevent(&dev->kobj, KOBJ_ONLINE);
> - else
> + } else {
>   pr_debug("Can't online CPU%d.\n", cpu);
> + }
>   break;
>   default:
>   ret = -EINVAL;
> --
> To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


Re: [PATCH v2 4/4] hotplug, powerpc, x86: Remove cpu_hotplug_driver_lock()

2013-09-25 Thread Rafael J. Wysocki
Hi,

On Thursday, August 29, 2013 06:22:09 PM Toshi Kani wrote:
> cpu_hotplug_driver_lock() serializes CPU online/offline operations
> when ARCH_CPU_PROBE_RELEASE is set.  This lock interface is no longer
> necessary with the following reason:
> 
>  - lock_device_hotplug() now protects CPU online/offline operations,
>including the probe & release interfaces enabled by
>ARCH_CPU_PROBE_RELEASE.  The use of cpu_hotplug_driver_lock() is
>redundant.
>  - cpu_hotplug_driver_lock() is only valid when ARCH_CPU_PROBE_RELEASE
>is defined, which is misleading and is only enabled on powerpc.
> 
> This patch removes the cpu_hotplug_driver_lock() interface.  As
> a result, ARCH_CPU_PROBE_RELEASE only enables / disables the cpu
> probe & release interface as intended.  There is no functional change
> in this patch.
> 
> Signed-off-by: Toshi Kani 
> Acked-by: Rafael J. Wysocki 
> Reviewed-by: Nathan Fontenot 

Can you please rebase this patch on top of 3.12-rc2?  It doesn't apply for
me any more.

Thanks,
Rafael


> ---
> Performed build test only on powerpc.
> ---
>  arch/powerpc/kernel/smp.c  |   12 --
>  arch/powerpc/platforms/pseries/dlpar.c |   40 
> 
>  arch/x86/kernel/topology.c |2 --
>  drivers/base/cpu.c |   10 +---
>  include/linux/cpu.h|   13 --
>  5 files changed, 16 insertions(+), 61 deletions(-)
> 
> diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
> index 38b0ba6..1667269 100644
> --- a/arch/powerpc/kernel/smp.c
> +++ b/arch/powerpc/kernel/smp.c
> @@ -763,18 +763,6 @@ void __cpu_die(unsigned int cpu)
>   smp_ops->cpu_die(cpu);
>  }
>  
> -static DEFINE_MUTEX(powerpc_cpu_hotplug_driver_mutex);
> -
> -void cpu_hotplug_driver_lock()
> -{
> - mutex_lock(&powerpc_cpu_hotplug_driver_mutex);
> -}
> -
> -void cpu_hotplug_driver_unlock()
> -{
> - mutex_unlock(&powerpc_cpu_hotplug_driver_mutex);
> -}
> -
>  void cpu_die(void)
>  {
>   if (ppc_md.cpu_die)
> diff --git a/arch/powerpc/platforms/pseries/dlpar.c 
> b/arch/powerpc/platforms/pseries/dlpar.c
> index a1a7b9a..e39325d 100644
> --- a/arch/powerpc/platforms/pseries/dlpar.c
> +++ b/arch/powerpc/platforms/pseries/dlpar.c
> @@ -387,18 +387,13 @@ static ssize_t dlpar_cpu_probe(const char *buf, size_t 
> count)
>   char *cpu_name;
>   int rc;
>  
> - cpu_hotplug_driver_lock();
>   rc = strict_strtoul(buf, 0, &drc_index);
> - if (rc) {
> - rc = -EINVAL;
> - goto out;
> - }
> + if (rc)
> + return -EINVAL;
>  
>   dn = dlpar_configure_connector(drc_index);
> - if (!dn) {
> - rc = -EINVAL;
> - goto out;
> - }
> + if (!dn)
> + return -EINVAL;
>  
>   /* configure-connector reports cpus as living in the base
>* directory of the device tree.  CPUs actually live in the
> @@ -407,8 +402,7 @@ static ssize_t dlpar_cpu_probe(const char *buf, size_t 
> count)
>   cpu_name = kasprintf(GFP_KERNEL, "/cpus%s", dn->full_name);
>   if (!cpu_name) {
>   dlpar_free_cc_nodes(dn);
> - rc = -ENOMEM;
> - goto out;
> + return -ENOMEM;
>   }
>  
>   kfree(dn->full_name);
> @@ -417,22 +411,21 @@ static ssize_t dlpar_cpu_probe(const char *buf, size_t 
> count)
>   rc = dlpar_acquire_drc(drc_index);
>   if (rc) {
>   dlpar_free_cc_nodes(dn);
> - rc = -EINVAL;
> - goto out;
> + return -EINVAL;
>   }
>  
>   rc = dlpar_attach_node(dn);
>   if (rc) {
>   dlpar_release_drc(drc_index);
>   dlpar_free_cc_nodes(dn);
> - goto out;
> + return rc;
>   }
>  
>   rc = dlpar_online_cpu(dn);
> -out:
> - cpu_hotplug_driver_unlock();
> + if (rc)
> + return rc;
>  
> - return rc ? rc : count;
> + return count;
>  }
>  
>  static int dlpar_offline_cpu(struct device_node *dn)
> @@ -505,30 +498,27 @@ static ssize_t dlpar_cpu_release(const char *buf, 
> size_t count)
>   return -EINVAL;
>   }
>  
> - cpu_hotplug_driver_lock();
>   rc = dlpar_offline_cpu(dn);
>   if (rc) {
>   of_node_put(dn);
> - rc = -EINVAL;
> - goto out;
> + return -EINVAL;
>   }
>  
>   rc = dlpar_release_drc(*drc_index);
>   if (rc) {
>   of_node_put(dn);
> - goto out

Re: Please revert 928bea964827d7824b548c1f8e06eccbbc4d0d7d

2013-10-03 Thread Rafael J. Wysocki
On Friday, September 27, 2013 04:44:20 PM Yinghai Lu wrote:
> [+ Rafael]
> 
> On Fri, Sep 27, 2013 at 4:19 PM, Benjamin Herrenschmidt
>  wrote:
> > On Fri, 2013-09-27 at 15:56 -0700, Yinghai Lu wrote:
> >
> >> ok, please if you are ok attached one instead. It will print some warning 
> >> about
> >> driver skipping pci_set_master, so we can catch more problem with drivers.
> >
> > Except that the message is pretty cryptic :-) Especially since the
> > driver causing the message to be printed is not the one that did
> > the mistake in the first place, it's the next one coming up that
> > trips the warning.
> >
> > In any case, the root cause is indeed the PCIe port driver:
> >
> > We don't have ACPI, so pcie_port_platform_notify() isn't implemented,
> > and pcie_ports_auto is true, so we end up with capabilities set to 0.
> 
> in
> | commit fe31e69740eddc7316071ed5165fed6703c8cd12
> | Author: Rafael J. Wysocki 
> | Date:   Sun Dec 19 15:57:16 2010 +0100
> |
> |PCI/PCIe: Clear Root PME Status bits early during system resume
> |
> |I noticed that PCI Express PMEs don't work on my Toshiba Portege R500
> |after the system has been woken up from a sleep state by a PME
> |(through Wake-on-LAN).  After some investigation it turned out that
> |the BIOS didn't clear the Root PME Status bit in the root port that
> |received the wakeup PME and since the Requester ID was also set in
> |the port's Root Status register, any subsequent PMEs didn't trigger
> |interrupts.
> |
> |This problem can be avoided by clearing the Root PME Status bits in
> |all PCI Express root ports during early resume.  For this purpose,
> |add an early resume routine to the PCIe port driver and make this
> |driver be always registered, even if pci_ports_disable is set (in
> |which case the driver's only function is to provide the early
> |resume callback).
> |
> |
> |@@ -349,15 +349,18 @@ int pcie_port_device_register(struct pci_dev *dev)
> |int status, capabilities, i, nr_service;
> |int irqs[PCIE_PORT_DEVICE_MAXSERVICES];
> |
> |-   /* Get and check PCI Express port services */
> |-   capabilities = get_port_device_capability(dev);
> |-   if (!capabilities)
> |-   return -ENODEV;
> |-
> |/* Enable PCI Express port device */
> |status = pci_enable_device(dev);
> |if (status)
> |return status;
> |+
> |+   /* Get and check PCI Express port services */
> |+   capabilities = get_port_device_capability(dev);
> |+   if (!capabilities) {
> |+   pcie_no_aspm();
> |+   return 0;
> |+   }
> |+
> |pci_set_master(dev);
> |/*
> | * Initialize service irqs. Don't use service devices that
> 
> >
> > Thus the port driver bails out before calling pci_set_master(). The fix
> > is to call pci_set_master() unconditionally. However that lead me to
> > find to a few interesting oddities in that port driver code:
> 
> can we revert that partially change ? aka we should check get_port
> at first...
> 
> like attached.

It looks like we can do something like this (just pasting your patch):

diff --git a/drivers/pci/pcie/portdrv_core.c b/drivers/pci/pcie/portdrv_core.c
index 31063ac..1ee6f16 100644
--- a/drivers/pci/pcie/portdrv_core.c
+++ b/drivers/pci/pcie/portdrv_core.c
@@ -362,16 +362,16 @@ int pcie_port_device_register(struct pci_dev *dev)
int status, capabilities, i, nr_service;
int irqs[PCIE_PORT_DEVICE_MAXSERVICES];
 
-   /* Enable PCI Express port device */
-   status = pci_enable_device(dev);
-   if (status)
-   return status;
-
/* Get and check PCI Express port services */
capabilities = get_port_device_capability(dev);
if (!capabilities)
return 0;
 
+   /* Enable PCI Express port device */
+   status = pci_enable_device(dev);
+   if (status)
+   return status;
+
pci_set_master(dev);
/*
 * Initialize service irqs. Don't use service devices that

but I don't have that box with me to test whether or not it still works
correctly after this change.  I'll be back home on the next Saturday if
all goes well.

Thanks,
Rafael

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


Re: [PATCH v2 RESEND 0/3] cpufreq/ondemand support for Xserve G5 & iMac G5 iSight

2013-10-03 Thread Rafael J. Wysocki
On Monday, September 30, 2013 11:44:30 PM Aaro Koskinen wrote:
> Hi,
> 
> This is a resend of the v2 patchset:
> 
>   http://marc.info/?t=13779701321&r=1&w=2
> 
> No changes except rebasing. Any chance to get these into v3.13?

Well, Ben's ACKs are missing as far as I'm concerned ...

> Aaro Koskinen (3):
>   cpufreq: pmac64: speed up frequency switch
>   cpufreq: pmac64: provide cpufreq transition latency for older G5
> models
>   cpufreq: pmac64: enable cpufreq on iMac G5 (iSight) model
> 
>  drivers/cpufreq/pmac64-cpufreq.c | 13 -
>  1 file changed, 8 insertions(+), 5 deletions(-)
> 
> 
-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


Re: [PATCH v2 RESEND 0/3] cpufreq/ondemand support for Xserve G5 & iMac G5 iSight

2013-10-16 Thread Rafael J. Wysocki
On Friday, October 11, 2013 03:32:15 PM Benjamin Herrenschmidt wrote:
> On Mon, 2013-09-30 at 23:44 +0300, Aaro Koskinen wrote:
> > Hi,
> > 
> > This is a resend of the v2 patchset:
> > 
> > http://marc.info/?t=13779701321&r=1&w=2
> > 
> > No changes except rebasing. Any chance to get these into v3.13?
> 
> BTW. Ack from me, Rafael feel free to merge these.

Queued up for 3.13, sorry for the delay.

Thanks!

-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


Re: [Suggestion] drivers: powercap: 'dev_attrs' has already removed from 'struct class'

2013-10-31 Thread Rafael J. Wysocki

On 10/31/2013 3:18 AM, Chen Gang wrote:

Hello Maintainers

It is removed by "bcc8edb driver core: remove dev_attrs from struct
class" in Oct 5 2013. But "75d2364 PowerCap: Add class driver" still
use it in Oct 11 2013.

The related error (for powerpc with allmodconfig):

   CC  drivers/powercap/powercap_sys.o
drivers/powercap/powercap_sys.c:484:2: error: unknown field ‘dev_attrs’ 
specified in initializer
drivers/powercap/powercap_sys.c:484:2: warning: initialization from 
incompatible pointer type [enabled by default]
drivers/powercap/powercap_sys.c:484:2: warning: (near initialization for 
‘powercap_class.suspend’) [enabled by default]


Please give a check thanks.


That should have been fixed in the current linux-next already.

Thanks,
Rafael

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


Re: [RFC PATCH v2 02/12] ACPI: Add sys_hotplug.h for system device hotplug framework

2013-01-11 Thread Rafael J. Wysocki
On Thursday, January 10, 2013 04:40:20 PM Toshi Kani wrote:
> Added include/acpi/sys_hotplug.h, which is ACPI-specific system
> device hotplug header and defines the order values of ACPI-specific
> handlers.
> 
> Signed-off-by: Toshi Kani 
> ---
>  include/acpi/sys_hotplug.h |   48 
> 
>  1 file changed, 48 insertions(+)
>  create mode 100644 include/acpi/sys_hotplug.h
> 
> diff --git a/include/acpi/sys_hotplug.h b/include/acpi/sys_hotplug.h
> new file mode 100644
> index 000..ad80f61
> --- /dev/null
> +++ b/include/acpi/sys_hotplug.h
> @@ -0,0 +1,48 @@
> +/*
> + * sys_hotplug.h - ACPI System device hot-plug framework
> + *
> + * Copyright (C) 2012 Hewlett-Packard Development Company, L.P.
> + *   Toshi Kani 
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 as
> + * published by the Free Software Foundation.
> + */
> +
> +#ifndef _ACPI_SYS_HOTPLUG_H
> +#define _ACPI_SYS_HOTPLUG_H
> +
> +#include 
> +#include 
> +#include 
> +
> +/*
> + * System device hot-plug operation proceeds in the following order.
> + *   Validate phase -> Execute phase -> Commit phase
> + *
> + * The order values below define the calling sequence of ACPI-specific
> + * handlers for each phase in ascending order.  The order value of
> + * platform-neutral handlers are defined in .
> + */
> +
> +/* Add Validate order values */
> +#define SHP_ACPI_BUS_ADD_VALIDATE_ORDER  0   /* must be 
> first */
> +
> +/* Add Execute order values */
> +#define SHP_ACPI_BUS_ADD_EXECUTE_ORDER   10
> +#define SHP_ACPI_RES_ADD_EXECUTE_ORDER   20
> +
> +/* Add Commit order values */
> +#define SHP_ACPI_BUS_ADD_COMMIT_ORDER10
> +
> +/* Delete Validate order values */
> +#define SHP_ACPI_BUS_DEL_VALIDATE_ORDER  0   /* must be 
> first */
> +#define SHP_ACPI_RES_DEL_VALIDATE_ORDER  10
> +
> +/* Delete Execute order values */
> +#define SHP_ACPI_BUS_DEL_EXECUTE_ORDER   100
> +
> +/* Delete Commit order values */
> +#define SHP_ACPI_BUS_DEL_COMMIT_ORDER100
> +
> +#endif   /* _ACPI_SYS_HOTPLUG_H */
> --

Why did you use the particular values above?

Rafael


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


Re: [RFC PATCH v2 01/12] Add sys_hotplug.h for system device hotplug framework

2013-01-11 Thread Rafael J. Wysocki
 length;
> + } mem;
> +
> + struct shp_hostbridge {
> + } hb;
> +
> + struct shp_node {
> + } node;
> +};
> +
> +struct shp_device {
> + struct list_headlist;
> + struct device   *device;
> + enum shp_class  class;
> + union shp_dev_info  info;
> +};
> +
> +/*
> + * Hot-plug request
> + */
> +struct shp_request {
> + /* common info */
> + enum shp_operation  operation;  /* operation */
> +
> + /* hot-plug event info: only valid for hot-plug operations */
> + void*handle;/* FW handle */

What's the role of handle here?


> + u32 event;  /* FW event */
> +
> + /* device resource info */
> + struct list_headdev_list;   /* shp_device list */
> +};
> +
> +/*
> + * Inline Utility Functions
> + */
> +static inline bool shp_is_hotplug_op(enum shp_operation operation)
> +{
> + return (operation & SHP_OP_MASK) == SHP_OP_HOTPLUG;
> +}
> +
> +static inline bool shp_is_online_op(enum shp_operation operation)
> +{
> + return (operation & SHP_OP_MASK) == SHP_OP_ONLINE;
> +}
> +
> +static inline bool shp_is_add_op(enum shp_operation operation)
> +{
> + return (operation & SHP_REQ_MASK) == SHP_REQ_ADD;
> +}
> +
> +static inline bool shp_is_add_phase(enum shp_phase phase)
> +{
> + return (phase & SHP_REQ_MASK) == SHP_REQ_ADD;
> +}
> +
> +/*
> + * Externs
> + */
> +typedef int (*shp_func)(struct shp_request *req, int rollback);
> +extern int shp_register_handler(enum shp_phase phase, shp_func func, u32 
> order);
> +extern int shp_unregister_handler(enum shp_phase phase, shp_func func);
> +extern int shp_submit_req(struct shp_request *req);
> +extern struct shp_request *shp_alloc_request(enum shp_operation operation);
> +extern void shp_add_dev_info(struct shp_request *shp_req,
> + struct shp_device *shp_dev);
> +
> +#endif   /* _LINUX_SYS_HOTPLUG_H */
> --
> To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


Re: [RFC PATCH v2 02/12] ACPI: Add sys_hotplug.h for system device hotplug framework

2013-01-14 Thread Rafael J. Wysocki
On Monday, January 14, 2013 08:53:53 AM Toshi Kani wrote:
> On Fri, 2013-01-11 at 22:25 +0100, Rafael J. Wysocki wrote:
> > On Thursday, January 10, 2013 04:40:20 PM Toshi Kani wrote:
> > > Added include/acpi/sys_hotplug.h, which is ACPI-specific system
> > > device hotplug header and defines the order values of ACPI-specific
> > > handlers.
> > > 
> > > Signed-off-by: Toshi Kani 
> > > ---
> > >  include/acpi/sys_hotplug.h |   48 
> > > 
> > >  1 file changed, 48 insertions(+)
> > >  create mode 100644 include/acpi/sys_hotplug.h
> > > 
> > > diff --git a/include/acpi/sys_hotplug.h b/include/acpi/sys_hotplug.h
> > > new file mode 100644
> > > index 000..ad80f61
> > > --- /dev/null
> > > +++ b/include/acpi/sys_hotplug.h
> > > @@ -0,0 +1,48 @@
> > > +/*
> > > + * sys_hotplug.h - ACPI System device hot-plug framework
> > > + *
> > > + * Copyright (C) 2012 Hewlett-Packard Development Company, L.P.
> > > + *   Toshi Kani 
> > > + *
> > > + * This program is free software; you can redistribute it and/or modify
> > > + * it under the terms of the GNU General Public License version 2 as
> > > + * published by the Free Software Foundation.
> > > + */
> > > +
> > > +#ifndef _ACPI_SYS_HOTPLUG_H
> > > +#define _ACPI_SYS_HOTPLUG_H
> > > +
> > > +#include 
> > > +#include 
> > > +#include 
> > > +
> > > +/*
> > > + * System device hot-plug operation proceeds in the following order.
> > > + *   Validate phase -> Execute phase -> Commit phase
> > > + *
> > > + * The order values below define the calling sequence of ACPI-specific
> > > + * handlers for each phase in ascending order.  The order value of
> > > + * platform-neutral handlers are defined in .
> > > + */
> > > +
> > > +/* Add Validate order values */
> > > +#define SHP_ACPI_BUS_ADD_VALIDATE_ORDER  0   /* must be 
> > > first */
> > > +
> > > +/* Add Execute order values */
> > > +#define SHP_ACPI_BUS_ADD_EXECUTE_ORDER   10
> > > +#define SHP_ACPI_RES_ADD_EXECUTE_ORDER   20
> > > +
> > > +/* Add Commit order values */
> > > +#define SHP_ACPI_BUS_ADD_COMMIT_ORDER10
> > > +
> > > +/* Delete Validate order values */
> > > +#define SHP_ACPI_BUS_DEL_VALIDATE_ORDER  0   /* must be 
> > > first */
> > > +#define SHP_ACPI_RES_DEL_VALIDATE_ORDER  10
> > > +
> > > +/* Delete Execute order values */
> > > +#define SHP_ACPI_BUS_DEL_EXECUTE_ORDER   100
> > > +
> > > +/* Delete Commit order values */
> > > +#define SHP_ACPI_BUS_DEL_COMMIT_ORDER100
> > > +
> > > +#endif   /* _ACPI_SYS_HOTPLUG_H */
> > > --
> > 
> > Why did you use the particular values above?
> 
> The ordering values above are used to define the relative order among
> handlers.  For instance, the 100 for SHP_ACPI_BUS_DEL_EXECUTE_ORDER can
> potentially be 21 since it is still larger than 20 for
> SHP_MEM_DEL_EXECUTE_ORDER defined in linux/sys_hotplug.h.  I picked 100
> so that more platform-neutral handlers can be added in between 20 and
> 100 in future.

I thought so, but I don't think it's a good idea to add gaps like this.

Thanks,
Rafael


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


Re: [RFC PATCH v2 01/12] Add sys_hotplug.h for system device hotplug framework

2013-01-14 Thread Rafael J. Wysocki
On Monday, January 14, 2013 08:33:48 AM Toshi Kani wrote:
> On Fri, 2013-01-11 at 22:23 +0100, Rafael J. Wysocki wrote:
> > On Thursday, January 10, 2013 04:40:19 PM Toshi Kani wrote:
> > > Added include/linux/sys_hotplug.h, which defines the system device
> > > hotplug framework interfaces used by the framework itself and
> > > handlers.
> > > 
> > > The order values define the calling sequence of handlers.  For add
> > > execute, the ordering is ACPI->MEM->CPU.  Memory is onlined before
> > > CPU so that threads on new CPUs can start using their local memory.
> > > The ordering of the delete execute is symmetric to the add execute.
> > > 
> > > struct shp_request defines a hot-plug request information.  The
> > > device resource information is managed with a list so that a single
> > > request may target to multiple devices.
> > > 
>  :
> > > +
> > > +struct shp_device {
> > > + struct list_headlist;
> > > + struct device   *device;
> > > + enum shp_class  class;
> > > + union shp_dev_info  info;
> > > +};
> > > +
> > > +/*
> > > + * Hot-plug request
> > > + */
> > > +struct shp_request {
> > > + /* common info */
> > > + enum shp_operation  operation;  /* operation */
> > > +
> > > + /* hot-plug event info: only valid for hot-plug operations */
> > > + void*handle;/* FW handle */
> > 
> > What's the role of handle here?
> 
> On ACPI-based platforms, the handle keeps a notified ACPI handle when a
> hot-plug request is made.  ACPI bus handlers, acpi_add_execute() /
> acpi_del_execute(), then scans / trims ACPI devices from the handle.

OK, so this is ACPI-specific and should be described as such.

Thanks,
Rafael


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


Re: [RFC PATCH v2 02/12] ACPI: Add sys_hotplug.h for system device hotplug framework

2013-01-14 Thread Rafael J. Wysocki
On Monday, January 14, 2013 11:42:09 AM Toshi Kani wrote:
> On Mon, 2013-01-14 at 19:47 +0100, Rafael J. Wysocki wrote:
> > On Monday, January 14, 2013 08:53:53 AM Toshi Kani wrote:
> > > On Fri, 2013-01-11 at 22:25 +0100, Rafael J. Wysocki wrote:
> > > > On Thursday, January 10, 2013 04:40:20 PM Toshi Kani wrote:
> > > > > Added include/acpi/sys_hotplug.h, which is ACPI-specific system
> > > > > device hotplug header and defines the order values of ACPI-specific
> > > > > handlers.
> > > > > 
> > > > > Signed-off-by: Toshi Kani 
> > > > > ---
> > > > >  include/acpi/sys_hotplug.h |   48 
> > > > > 
> > > > >  1 file changed, 48 insertions(+)
> > > > >  create mode 100644 include/acpi/sys_hotplug.h
> > > > > 
> > > > > diff --git a/include/acpi/sys_hotplug.h b/include/acpi/sys_hotplug.h
> > > > > new file mode 100644
> > > > > index 000..ad80f61
> > > > > --- /dev/null
> > > > > +++ b/include/acpi/sys_hotplug.h
> > > > > @@ -0,0 +1,48 @@
> > > > > +/*
> > > > > + * sys_hotplug.h - ACPI System device hot-plug framework
> > > > > + *
> > > > > + * Copyright (C) 2012 Hewlett-Packard Development Company, L.P.
> > > > > + *   Toshi Kani 
> > > > > + *
> > > > > + * This program is free software; you can redistribute it and/or 
> > > > > modify
> > > > > + * it under the terms of the GNU General Public License version 2 as
> > > > > + * published by the Free Software Foundation.
> > > > > + */
> > > > > +
> > > > > +#ifndef _ACPI_SYS_HOTPLUG_H
> > > > > +#define _ACPI_SYS_HOTPLUG_H
> > > > > +
> > > > > +#include 
> > > > > +#include 
> > > > > +#include 
> > > > > +
> > > > > +/*
> > > > > + * System device hot-plug operation proceeds in the following order.
> > > > > + *   Validate phase -> Execute phase -> Commit phase
> > > > > + *
> > > > > + * The order values below define the calling sequence of 
> > > > > ACPI-specific
> > > > > + * handlers for each phase in ascending order.  The order value of
> > > > > + * platform-neutral handlers are defined in .
> > > > > + */
> > > > > +
> > > > > +/* Add Validate order values */
> > > > > +#define SHP_ACPI_BUS_ADD_VALIDATE_ORDER  0   /* must 
> > > > > be first */
> > > > > +
> > > > > +/* Add Execute order values */
> > > > > +#define SHP_ACPI_BUS_ADD_EXECUTE_ORDER   10
> > > > > +#define SHP_ACPI_RES_ADD_EXECUTE_ORDER   20
> > > > > +
> > > > > +/* Add Commit order values */
> > > > > +#define SHP_ACPI_BUS_ADD_COMMIT_ORDER10
> > > > > +
> > > > > +/* Delete Validate order values */
> > > > > +#define SHP_ACPI_BUS_DEL_VALIDATE_ORDER  0   /* must 
> > > > > be first */
> > > > > +#define SHP_ACPI_RES_DEL_VALIDATE_ORDER  10
> > > > > +
> > > > > +/* Delete Execute order values */
> > > > > +#define SHP_ACPI_BUS_DEL_EXECUTE_ORDER   100
> > > > > +
> > > > > +/* Delete Commit order values */
> > > > > +#define SHP_ACPI_BUS_DEL_COMMIT_ORDER100
> > > > > +
> > > > > +#endif   /* _ACPI_SYS_HOTPLUG_H */
> > > > > --
> > > > 
> > > > Why did you use the particular values above?
> > > 
> > > The ordering values above are used to define the relative order among
> > > handlers.  For instance, the 100 for SHP_ACPI_BUS_DEL_EXECUTE_ORDER can
> > > potentially be 21 since it is still larger than 20 for
> > > SHP_MEM_DEL_EXECUTE_ORDER defined in linux/sys_hotplug.h.  I picked 100
> > > so that more platform-neutral handlers can be added in between 20 and
> > > 100 in future.
> > 
> > I thought so, but I don't think it's a good idea to add gaps like this.
> 
> OK, I will use an equal gap of 10 for all values.  So, the 100 in the
> above example will be changed to 30.  

I wonder why you want to have those gaps at all.

Anyway, this is just a small detail and it doesn't mean I don't have more
comments.  I just need some more time to get the big picture idea of how this
is supposed to work and perhaps Greg will have some remarks too.

Thanks,
Rafael



-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


Re: [RFC PATCH v2 00/12] System device hot-plug framework

2013-01-16 Thread Rafael J. Wysocki
 that the ordering of hot-plug operations needs to be
independent of the device herarchy ordering?

(5) Why do you think it's a good idea to require every callback routine to
browse the entire list of devices attached to the request object?  Wouldn't
it be more convenient if they were called only for the types of devices
they have declared to handle?  [That would reduce some code duplication,
for example.]

(6) Why is it convenient to use order values (priorities) of notifiers to
indicate both the ordering with respect to the other notifiers and the
"level" (e.g. whether or not rollback is possible) at the same time?  Those
things appear to be conceptually distinct.

(7) Why callbacks used for "add" operations still need to check if the
operation type is "add" (cpu_add_execute() does that for example)?

(8) What problems *exactly* this is supposed to address?  Can you give a few
examples, please?

I guess I'll have more questions going forward.

Thanks,
Rafael


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


Re: [RFC PATCH v2 01/12] Add sys_hotplug.h for system device hotplug framework

2013-01-31 Thread Rafael J. Wysocki
On Wednesday, January 30, 2013 07:57:45 PM Toshi Kani wrote:
> On Tue, 2013-01-29 at 23:58 -0500, Greg KH wrote:
> > On Thu, Jan 10, 2013 at 04:40:19PM -0700, Toshi Kani wrote:
> > > +/*
> > > + * Hot-plug device information
> > > + */
> > 
> > Again, stop it with the "generic" hotplug term here, and everywhere
> > else.  You are doing a very _specific_ type of hotplug devices, so spell
> > it out.  We've worked hard to hotplug _everything_ in Linux, you are
> > going to confuse a lot of people with this type of terms.
> 
> Agreed.  I will clarify in all places.
> 
> > > +union shp_dev_info {
> > > + struct shp_cpu {
> > > + u32 cpu_id;
> > > + } cpu;
> > 
> > What is this?  Why not point to the system device for the cpu?
> 
> This info is used to on-line a new CPU and create its system/cpu device.
> In other word, a system/cpu device is created as a result of CPU
> hotplug.
> 
> > > + struct shp_memory {
> > > + int node;
> > > + u64 start_addr;
> > > + u64 length;
> > > + } mem;
> > 
> > Same here, why not point to the system device?
> 
> Same as above.
> 
> > > + struct shp_hostbridge {
> > > + } hb;
> > > +
> > > + struct shp_node {
> > > + } node;
> > 
> > What happened here with these?  Empty structures?  Huh?
> 
> They are place holders for now.  PCI bridge hot-plug and node hot-plug
> are still very much work in progress, so I have not integrated them into
> this framework yet.
> 
> > > +};
> > > +
> > > +struct shp_device {
> > > + struct list_headlist;
> > > + struct device   *device;
> > 
> > No, make it a "real" device, embed the device into it.
> 
> This device pointer is used to send KOBJ_ONLINE/OFFLINE event during CPU
> online/offline operation in order to maintain the current behavior.  CPU
> online/offline operation only changes the state of CPU, so its
> system/cpu device continues to be present before and after an operation.
> (Whereas, CPU hot-add/delete operation creates or removes a system/cpu
> device.)  So, this "*device" needs to be a pointer to reference an
> existing device that is to be on-lined/off-lined.
> 
> > But, again, I'm going to ask why you aren't using the existing cpu /
> > memory / bridge / node devices that we have in the kernel.  Please use
> > them, or give me a _really_ good reason why they will not work.
> 
> We cannot use the existing system devices or ACPI devices here.  During
> hot-plug, ACPI handler sets this shp_device info, so that cpu and memory
> handlers (drivers/cpu.c and mm/memory_hotplug.c) can obtain their target
> device information in a platform-neutral way.  During hot-add, we first
> creates an ACPI device node (i.e. device under /sys/bus/acpi/devices),
> but platform-neutral modules cannot use them as they are ACPI-specific.

But suppose we're smart and have ACPI scan handlers that will create
"physical" device nodes for those devices during the ACPI namespace scan.
Then, the platform-neutral nodes will be able to bind to those "physical"
nodes.  Moreover, it should be possible to get a hierarchy of device objects
this way that will reflect all of the dependencies we need to take into
account during hot-add and hot-remove operations.  That may not be what we
have today, but I don't see any *fundamental* obstacles preventing us from
using this approach.

This is already done for PCI host bridges and platform devices and I don't
see why we can't do that for the other types of devices too.

The only missing piece I see is a way to handle the "eject" problem, i.e.
when we try do eject a device at the top of a subtree and need to tear down
the entire subtree below it, but if that's going to lead to a system crash,
for example, we want to cancel the eject.  It seems to me that we'll need some
help from the driver core here.

Thanks,
Rafael


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


Re: [RFC PATCH v2 01/12] Add sys_hotplug.h for system device hotplug framework

2013-02-01 Thread Rafael J. Wysocki
On Friday, February 01, 2013 08:23:12 AM Greg KH wrote:
> On Thu, Jan 31, 2013 at 09:54:51PM +0100, Rafael J. Wysocki wrote:
> > > > But, again, I'm going to ask why you aren't using the existing cpu /
> > > > memory / bridge / node devices that we have in the kernel.  Please use
> > > > them, or give me a _really_ good reason why they will not work.
> > > 
> > > We cannot use the existing system devices or ACPI devices here.  During
> > > hot-plug, ACPI handler sets this shp_device info, so that cpu and memory
> > > handlers (drivers/cpu.c and mm/memory_hotplug.c) can obtain their target
> > > device information in a platform-neutral way.  During hot-add, we first
> > > creates an ACPI device node (i.e. device under /sys/bus/acpi/devices),
> > > but platform-neutral modules cannot use them as they are ACPI-specific.
> > 
> > But suppose we're smart and have ACPI scan handlers that will create
> > "physical" device nodes for those devices during the ACPI namespace scan.
> > Then, the platform-neutral nodes will be able to bind to those "physical"
> > nodes.  Moreover, it should be possible to get a hierarchy of device objects
> > this way that will reflect all of the dependencies we need to take into
> > account during hot-add and hot-remove operations.  That may not be what we
> > have today, but I don't see any *fundamental* obstacles preventing us from
> > using this approach.
> 
> I would _much_ rather see that be the solution here as I think it is the
> proper one.
> 
> > This is already done for PCI host bridges and platform devices and I don't
> > see why we can't do that for the other types of devices too.
> 
> I agree.
> 
> > The only missing piece I see is a way to handle the "eject" problem, i.e.
> > when we try do eject a device at the top of a subtree and need to tear down
> > the entire subtree below it, but if that's going to lead to a system crash,
> > for example, we want to cancel the eject.  It seems to me that we'll need 
> > some
> > help from the driver core here.
> 
> I say do what we always have done here, if the user asked us to tear
> something down, let it happen as they are the ones that know best :)
> 
> Seriously, I guess this gets back to the "fail disconnect" idea that the
> ACPI developers keep harping on.  I thought we already resolved this
> properly by having them implement it in their bus code, no reason the
> same thing couldn't happen here, right?

Not really. :-)  We haven't ever resolved that particular issue I'm afraid.

> I don't think the core needs to do anything special, but if so, I'll be glad
> to review it.

OK, so this is the use case.  We have "eject" defined for something like
a container with a number of CPU cores, PCI host bridge, and a memory
controller under it.  And a few pretty much arbitrary I/O devices as a bonus.

Now, there's a button on the system case labeled as "Eject" and if that button
is pressed, we're supposed to _try_ to eject all of those things at once.  We
are allowed to fail that request, though, if that's problematic for some
reason, but we're supposed to let the BIOS know about that.

Do you seriously think that if that button is pressed, we should just proceed
with removing all that stuff no matter what?  That'd be kind of like Russian
roulette for whoever pressed that button, because s/he could only press it and
wait for the system to either crash or not.  Or maybe to crash a bit later
because of some delayed stuff that would hit one of those devices that had just
gone.  Surely not a situation any admin of a high-availability system would
like to be in. :-)

Quite frankly, I have no idea how that can be addressed in a single bus type,
let alone ACPI (which is not even a proper bus type, just something pretending
to be one).

Thanks,
Rafael


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


Re: [RFC PATCH v2 01/12] Add sys_hotplug.h for system device hotplug framework

2013-02-01 Thread Rafael J. Wysocki
Removing and adding devices and handling hotplug operations is what the
> > driver core was written for, almost 10 years ago.  To somehow think that
> > your devices are "special" just because they don't use ACPI is odd,
> > because the driver core itself has nothing to do with ACPI.  Don't get
> > the current mix of x86 system code tied into ACPI confused with an
> > driver core issues here please.
> 
> CPU online/offline operation is performed within the CPU module.  Memory
> online/offline operation is performed within the memory module.  CPU and
> memory hotplug operations are performed within ACPI.  While they deal
> with the same set of devices, they operate independently and are not
> managed under a same framework.
> 
> I agree with you that not using ACPI is perfectly fine.  My point is
> that ACPI framework won't be able to manage operations that do not use
> ACPI.
> 
> > > 5. Why can't do everything with ACPI bus scan?
> > > Software dependency among system devices may not be dictated by the ACPI
> > > hierarchy.  For instance, memory should be initialized before CPUs (i.e.
> > > a new cpu may need its local memory), but such ordering cannot be
> > > guaranteed by the ACPI hierarchy.  Also, as described in 4,
> > > online/offline operations are independent from ACPI.  
> > 
> > That's fine, the driver core is independant from ACPI.  I don't care how
> > you do the scaning of your devices, but I do care about you creating new
> > driver core pieces that duplicate the existing functionality of what we
> > have today.
> >
> > In short, I like Rafael's proposal better, and I fail to see how
> > anything you have stated here would matter in how this is implemented. :)
> 
> Doing everything within ACPI means we can only manage ACPI hotplug
> operations, not online/offline operations.  But I understand that you
> concern about adding a new framework with option C.  It is good to know
> that you are fine with option B. :)  So, I will step back, and think
> about what we can do within ACPI.

Not much, because ACPI only knows about a subset of devices that may be
involved in that, and a limited one for that matter.  For one example,
anything connected through PCI and not having a corresponding ACPI object (i.e.
pretty much every add-in card in existence) will be unknown to ACPI.  And
say one of these things is a SATA controller with a number of disks under it
and so on.  ACPI won't even know that it exists.  Moreover, PCI won't know
that those disks exist.  Etc.

Thanks,
Rafael


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


Re: [RFC PATCH v2 01/12] Add sys_hotplug.h for system device hotplug framework

2013-02-02 Thread Rafael J. Wysocki
On Saturday, February 02, 2013 03:58:01 PM Greg KH wrote:
> On Fri, Feb 01, 2013 at 11:12:59PM +0100, Rafael J. Wysocki wrote:
> > On Friday, February 01, 2013 08:23:12 AM Greg KH wrote:
> > > On Thu, Jan 31, 2013 at 09:54:51PM +0100, Rafael J. Wysocki wrote:
> > > > > > But, again, I'm going to ask why you aren't using the existing cpu /
> > > > > > memory / bridge / node devices that we have in the kernel.  Please 
> > > > > > use
> > > > > > them, or give me a _really_ good reason why they will not work.
> > > > > 
> > > > > We cannot use the existing system devices or ACPI devices here.  
> > > > > During
> > > > > hot-plug, ACPI handler sets this shp_device info, so that cpu and 
> > > > > memory
> > > > > handlers (drivers/cpu.c and mm/memory_hotplug.c) can obtain their 
> > > > > target
> > > > > device information in a platform-neutral way.  During hot-add, we 
> > > > > first
> > > > > creates an ACPI device node (i.e. device under /sys/bus/acpi/devices),
> > > > > but platform-neutral modules cannot use them as they are 
> > > > > ACPI-specific.
> > > > 
> > > > But suppose we're smart and have ACPI scan handlers that will create
> > > > "physical" device nodes for those devices during the ACPI namespace 
> > > > scan.
> > > > Then, the platform-neutral nodes will be able to bind to those 
> > > > "physical"
> > > > nodes.  Moreover, it should be possible to get a hierarchy of device 
> > > > objects
> > > > this way that will reflect all of the dependencies we need to take into
> > > > account during hot-add and hot-remove operations.  That may not be what 
> > > > we
> > > > have today, but I don't see any *fundamental* obstacles preventing us 
> > > > from
> > > > using this approach.
> > > 
> > > I would _much_ rather see that be the solution here as I think it is the
> > > proper one.
> > > 
> > > > This is already done for PCI host bridges and platform devices and I 
> > > > don't
> > > > see why we can't do that for the other types of devices too.
> > > 
> > > I agree.
> > > 
> > > > The only missing piece I see is a way to handle the "eject" problem, 
> > > > i.e.
> > > > when we try do eject a device at the top of a subtree and need to tear 
> > > > down
> > > > the entire subtree below it, but if that's going to lead to a system 
> > > > crash,
> > > > for example, we want to cancel the eject.  It seems to me that we'll 
> > > > need some
> > > > help from the driver core here.
> > > 
> > > I say do what we always have done here, if the user asked us to tear
> > > something down, let it happen as they are the ones that know best :)
> > > 
> > > Seriously, I guess this gets back to the "fail disconnect" idea that the
> > > ACPI developers keep harping on.  I thought we already resolved this
> > > properly by having them implement it in their bus code, no reason the
> > > same thing couldn't happen here, right?
> > 
> > Not really. :-)  We haven't ever resolved that particular issue I'm afraid.
> 
> Ah, I didn't realize that.
> 
> > > I don't think the core needs to do anything special, but if so, I'll be 
> > > glad
> > > to review it.
> > 
> > OK, so this is the use case.  We have "eject" defined for something like
> > a container with a number of CPU cores, PCI host bridge, and a memory
> > controller under it.  And a few pretty much arbitrary I/O devices as a 
> > bonus.
> > 
> > Now, there's a button on the system case labeled as "Eject" and if that 
> > button
> > is pressed, we're supposed to _try_ to eject all of those things at once.  
> > We
> > are allowed to fail that request, though, if that's problematic for some
> > reason, but we're supposed to let the BIOS know about that.
> > 
> > Do you seriously think that if that button is pressed, we should just 
> > proceed
> > with removing all that stuff no matter what?  That'd be kind of like Russian
> > roulette for whoever pressed that button, because s/he could only press it 
> > and
> > wait for the system to

[PATCH?] Move ACPI device nodes under /sys/firmware/acpi (was: Re: [RFC PATCH v2 01/12] Add sys_hotplug.h for system device hotplug framework)

2013-02-02 Thread Rafael J. Wysocki
On Saturday, February 02, 2013 09:15:37 PM Rafael J. Wysocki wrote:
> On Saturday, February 02, 2013 03:58:01 PM Greg KH wrote:
[...]
> 
> > I know it's more complicated with these types of devices, and I think we
> > are getting closer to the correct solution, I just don't want to ever
> > see duplicate devices in the driver model for the same physical device.
> 
> Do you mean two things based on struct device for the same hardware component?
> That's been happening already pretty much forever for every PCI device known
> to the ACPI layer, for PNP and many others.  However, those ACPI things are 
> (or
> rather should be, but we're going to clean that up) only for convenience (to 
> be
> able to see the namespace structure and related things in sysfs).  So the 
> stuff
> under /sys/devices/LNXSYSTM\:00/ is not "real".  In my view it shouldn't even
> be under /sys/devices/ (/sys/firmware/acpi/ seems to be a better place for 
> it),
> but that may be difficult to change without breaking user space (maybe we can
> just symlink it from /sys/devices/ or something).  And the ACPI bus type
> shouldn't even exist in my opinion.

Well, well.

In fact, the appended patch moves the whole ACPI device nodes tree under
/sys/firmware/acpi/ and I'm not seeing any negative consequences of that on my
test box (events work and so on).  User space is quite new on it, though, and
the patch is hackish.

Still ...


---
Prototype, no sign-off
---
 drivers/acpi/scan.c |2 ++
 1 file changed, 2 insertions(+)

Index: linux-pm/drivers/acpi/scan.c
===
--- linux-pm.orig/drivers/acpi/scan.c
+++ linux-pm/drivers/acpi/scan.c
@@ -1443,6 +1443,8 @@ void acpi_init_device_object(struct acpi
device->flags.match_driver = false;
device_initialize(&device->dev);
dev_set_uevent_suppress(&device->dev, true);
+   if (handle == ACPI_ROOT_OBJECT)
+   device->dev.kobj.parent = acpi_kobj;
 }
 
 void acpi_device_add_finalize(struct acpi_device *device)



-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


Re: [RFC PATCH v2 01/12] Add sys_hotplug.h for system device hotplug framework

2013-02-03 Thread Rafael J. Wysocki
On Saturday, February 02, 2013 09:15:37 PM Rafael J. Wysocki wrote:
> On Saturday, February 02, 2013 03:58:01 PM Greg KH wrote:
> > On Fri, Feb 01, 2013 at 11:12:59PM +0100, Rafael J. Wysocki wrote:
> > > On Friday, February 01, 2013 08:23:12 AM Greg KH wrote:
> > > > On Thu, Jan 31, 2013 at 09:54:51PM +0100, Rafael J. Wysocki wrote:
> > > > > > > But, again, I'm going to ask why you aren't using the existing 
> > > > > > > cpu /
> > > > > > > memory / bridge / node devices that we have in the kernel.  
> > > > > > > Please use
> > > > > > > them, or give me a _really_ good reason why they will not work.
> > > > > > 
> > > > > > We cannot use the existing system devices or ACPI devices here.  
> > > > > > During
> > > > > > hot-plug, ACPI handler sets this shp_device info, so that cpu and 
> > > > > > memory
> > > > > > handlers (drivers/cpu.c and mm/memory_hotplug.c) can obtain their 
> > > > > > target
> > > > > > device information in a platform-neutral way.  During hot-add, we 
> > > > > > first
> > > > > > creates an ACPI device node (i.e. device under 
> > > > > > /sys/bus/acpi/devices),
> > > > > > but platform-neutral modules cannot use them as they are 
> > > > > > ACPI-specific.
> > > > > 
> > > > > But suppose we're smart and have ACPI scan handlers that will create
> > > > > "physical" device nodes for those devices during the ACPI namespace 
> > > > > scan.
> > > > > Then, the platform-neutral nodes will be able to bind to those 
> > > > > "physical"
> > > > > nodes.  Moreover, it should be possible to get a hierarchy of device 
> > > > > objects
> > > > > this way that will reflect all of the dependencies we need to take 
> > > > > into
> > > > > account during hot-add and hot-remove operations.  That may not be 
> > > > > what we
> > > > > have today, but I don't see any *fundamental* obstacles preventing us 
> > > > > from
> > > > > using this approach.
> > > > 
> > > > I would _much_ rather see that be the solution here as I think it is the
> > > > proper one.
> > > > 
> > > > > This is already done for PCI host bridges and platform devices and I 
> > > > > don't
> > > > > see why we can't do that for the other types of devices too.
> > > > 
> > > > I agree.
> > > > 
> > > > > The only missing piece I see is a way to handle the "eject" problem, 
> > > > > i.e.
> > > > > when we try do eject a device at the top of a subtree and need to 
> > > > > tear down
> > > > > the entire subtree below it, but if that's going to lead to a system 
> > > > > crash,
> > > > > for example, we want to cancel the eject.  It seems to me that we'll 
> > > > > need some
> > > > > help from the driver core here.
> > > > 
> > > > I say do what we always have done here, if the user asked us to tear
> > > > something down, let it happen as they are the ones that know best :)
> > > > 
> > > > Seriously, I guess this gets back to the "fail disconnect" idea that the
> > > > ACPI developers keep harping on.  I thought we already resolved this
> > > > properly by having them implement it in their bus code, no reason the
> > > > same thing couldn't happen here, right?
> > > 
> > > Not really. :-)  We haven't ever resolved that particular issue I'm 
> > > afraid.
> > 
> > Ah, I didn't realize that.
> > 
> > > > I don't think the core needs to do anything special, but if so, I'll be 
> > > > glad
> > > > to review it.
> > > 
> > > OK, so this is the use case.  We have "eject" defined for something like
> > > a container with a number of CPU cores, PCI host bridge, and a memory
> > > controller under it.  And a few pretty much arbitrary I/O devices as a 
> > > bonus.
> > > 
> > > Now, there's a button on the system case labeled as "Eject" and if that 
> > > button
> > > is pressed, we're

Re: [PATCH?] Move ACPI device nodes under /sys/firmware/acpi (was: Re: [RFC PATCH v2 01/12] Add sys_hotplug.h for system device hotplug framework)

2013-02-04 Thread Rafael J. Wysocki
On Sunday, February 03, 2013 07:24:47 PM Greg KH wrote:
> On Sat, Feb 02, 2013 at 11:18:20PM +0100, Rafael J. Wysocki wrote:
> > On Saturday, February 02, 2013 09:15:37 PM Rafael J. Wysocki wrote:
> > > On Saturday, February 02, 2013 03:58:01 PM Greg KH wrote:
> > [...]
> > > 
> > > > I know it's more complicated with these types of devices, and I think we
> > > > are getting closer to the correct solution, I just don't want to ever
> > > > see duplicate devices in the driver model for the same physical device.
> > > 
> > > Do you mean two things based on struct device for the same hardware 
> > > component?
> > > That's been happening already pretty much forever for every PCI device 
> > > known
> > > to the ACPI layer, for PNP and many others.  However, those ACPI things 
> > > are (or
> > > rather should be, but we're going to clean that up) only for convenience 
> > > (to be
> > > able to see the namespace structure and related things in sysfs).  So the 
> > > stuff
> > > under /sys/devices/LNXSYSTM\:00/ is not "real".  In my view it shouldn't 
> > > even
> > > be under /sys/devices/ (/sys/firmware/acpi/ seems to be a better place 
> > > for it),
> > > but that may be difficult to change without breaking user space (maybe we 
> > > can
> > > just symlink it from /sys/devices/ or something).  And the ACPI bus type
> > > shouldn't even exist in my opinion.
> > 
> > Well, well.
> > 
> > In fact, the appended patch moves the whole ACPI device nodes tree under
> > /sys/firmware/acpi/ and I'm not seeing any negative consequences of that on 
> > my
> > test box (events work and so on).  User space is quite new on it, though, 
> > and
> > the patch is hackish.
> 
> Try booting a RHEL 5 image on this type of kernel, or some old Fedora
> releases, they were sensitive to changes in sysfs.

Well, I've found a machine where it causes problems to happen.

I'll try to add a symlink from /sys/devices to that and see what happens then.

Thanks,
Rafael


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


Re: [RFC PATCH v2 01/12] Add sys_hotplug.h for system device hotplug framework

2013-02-04 Thread Rafael J. Wysocki
On Sunday, February 03, 2013 07:23:49 PM Greg KH wrote:
> On Sat, Feb 02, 2013 at 09:15:37PM +0100, Rafael J. Wysocki wrote:
> > On Saturday, February 02, 2013 03:58:01 PM Greg KH wrote:
> > > On Fri, Feb 01, 2013 at 11:12:59PM +0100, Rafael J. Wysocki wrote:
> > > > On Friday, February 01, 2013 08:23:12 AM Greg KH wrote:
> > > > > On Thu, Jan 31, 2013 at 09:54:51PM +0100, Rafael J. Wysocki wrote:
> > > > > > > > But, again, I'm going to ask why you aren't using the existing 
> > > > > > > > cpu /
> > > > > > > > memory / bridge / node devices that we have in the kernel.  
> > > > > > > > Please use
> > > > > > > > them, or give me a _really_ good reason why they will not work.
> > > > > > > 
> > > > > > > We cannot use the existing system devices or ACPI devices here.  
> > > > > > > During
> > > > > > > hot-plug, ACPI handler sets this shp_device info, so that cpu and 
> > > > > > > memory
> > > > > > > handlers (drivers/cpu.c and mm/memory_hotplug.c) can obtain their 
> > > > > > > target
> > > > > > > device information in a platform-neutral way.  During hot-add, we 
> > > > > > > first
> > > > > > > creates an ACPI device node (i.e. device under 
> > > > > > > /sys/bus/acpi/devices),
> > > > > > > but platform-neutral modules cannot use them as they are 
> > > > > > > ACPI-specific.
> > > > > > 
> > > > > > But suppose we're smart and have ACPI scan handlers that will create
> > > > > > "physical" device nodes for those devices during the ACPI namespace 
> > > > > > scan.
> > > > > > Then, the platform-neutral nodes will be able to bind to those 
> > > > > > "physical"
> > > > > > nodes.  Moreover, it should be possible to get a hierarchy of 
> > > > > > device objects
> > > > > > this way that will reflect all of the dependencies we need to take 
> > > > > > into
> > > > > > account during hot-add and hot-remove operations.  That may not be 
> > > > > > what we
> > > > > > have today, but I don't see any *fundamental* obstacles preventing 
> > > > > > us from
> > > > > > using this approach.
> > > > > 
> > > > > I would _much_ rather see that be the solution here as I think it is 
> > > > > the
> > > > > proper one.
> > > > > 
> > > > > > This is already done for PCI host bridges and platform devices and 
> > > > > > I don't
> > > > > > see why we can't do that for the other types of devices too.
> > > > > 
> > > > > I agree.
> > > > > 
> > > > > > The only missing piece I see is a way to handle the "eject" 
> > > > > > problem, i.e.
> > > > > > when we try do eject a device at the top of a subtree and need to 
> > > > > > tear down
> > > > > > the entire subtree below it, but if that's going to lead to a 
> > > > > > system crash,
> > > > > > for example, we want to cancel the eject.  It seems to me that 
> > > > > > we'll need some
> > > > > > help from the driver core here.
> > > > > 
> > > > > I say do what we always have done here, if the user asked us to tear
> > > > > something down, let it happen as they are the ones that know best :)
> > > > > 
> > > > > Seriously, I guess this gets back to the "fail disconnect" idea that 
> > > > > the
> > > > > ACPI developers keep harping on.  I thought we already resolved this
> > > > > properly by having them implement it in their bus code, no reason the
> > > > > same thing couldn't happen here, right?
> > > > 
> > > > Not really. :-)  We haven't ever resolved that particular issue I'm 
> > > > afraid.
> > > 
> > > Ah, I didn't realize that.
> > > 
> > > > > I don't think the core needs to do anything special, but if so, I'll 
> > > > > be glad
> > > > > to review i

Re: [RFC PATCH v2 01/12] Add sys_hotplug.h for system device hotplug framework

2013-02-04 Thread Rafael J. Wysocki
On Monday, February 04, 2013 04:48:10 AM Greg KH wrote:
> On Sun, Feb 03, 2013 at 09:44:39PM +0100, Rafael J. Wysocki wrote:
> > > Yes, but those are just remove events and we can only see how destructive 
> > > they
> > > were after the removal.  The point is to be able to figure out whether or 
> > > not
> > > we *want* to do the removal in the first place.
> > > 
> > > Say you have a computing node which signals a hardware problem in a 
> > > processor
> > > package (the container with CPU cores, memory, PCI host bridge etc.).  You
> > > may want to eject that package, but you don't want to kill the system this
> > > way.  So if the eject is doable, it is very much desirable to do it, but 
> > > if it
> > > is not doable, you'd rather shut the box down and do the replacement 
> > > afterward.
> > > That may be costly, however (maybe weeks of computations), so it should be
> > > avoided if possible, but not at the expense of crashing the box if the 
> > > eject
> > > doesn't work out.
> > 
> > It seems to me that we could handle that with the help of a new flag, say
> > "no_eject", in struct device, a global mutex, and a function that will walk
> > the given subtree of the device hierarchy and check if "no_eject" is set for
> > any devices in there.  Plus a global "no_eject" switch, perhaps.
> 
> I think this will always be racy, or at worst, slow things down on
> normal device operations as you will always be having to grab this flag
> whenever you want to do something new.

I don't see why this particular scheme should be racy, at least I don't see any
obvious races in it (although I'm not that good at races detection in general,
admittedly).

Also, I don't expect that flag to be used for everything, just for things known
to seriously break if forcible eject is done.  That may be not precise enough,
so that's a matter of defining its purpose more precisely.

We can do something like that on the ACPI level (ie. introduce a no_eject flag
in struct acpi_device and provide an iterface for the layers above ACPI to
manipulate it) but then devices without ACPI namespace objects won't be
covered.  That may not be a big deal, though.

So say dev is about to be used for something incompatible with ejecting, so to
speak.  Then, one would do platform_lock_eject(dev), which would check if dev
has an ACPI handle and then take acpi_eject_lock (if so).  The return value of
platform_lock_eject(dev) would need to be checked to see if the device is not
gone.  If it returns success (0), one would do something to the device and
call platform_no_eject(dev) and then platform_unlock_eject(dev).

To clear no_eject one would just call platform_allow_to_eject(dev) that would
do all of the locking and clearing in one operation.

The ACPI eject side would be similar to the thing I described previously,
so it would (1) take acpi_eject_lock, (2) see if any struct acpi_device
involved has no_eject set and if not, then (3) do acpi_bus_trim(), (4)
carry out the eject and (5) release acpi_eject_lock.

Step (2) above might be optional, ie. if eject is forcible, we would just do
(3) etc. without (2).

The locking should prevent races from happening (and it should prevent two
ejects from happening at the same time too, which is not a bad thing by itself).

> See my comments earlier about pci hotplug and the design decisions there
> about "no eject" capabilities for why.

Well, I replied to that one too. :-)

Thanks,
Rafael


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


Re: [RFC PATCH v2 01/12] Add sys_hotplug.h for system device hotplug framework

2013-02-04 Thread Rafael J. Wysocki
On Monday, February 04, 2013 09:19:09 AM Toshi Kani wrote:
> On Mon, 2013-02-04 at 15:21 +0100, Rafael J. Wysocki wrote:
> > On Monday, February 04, 2013 04:48:10 AM Greg KH wrote:
> > > On Sun, Feb 03, 2013 at 09:44:39PM +0100, Rafael J. Wysocki wrote:
> > > > > Yes, but those are just remove events and we can only see how 
> > > > > destructive they
> > > > > were after the removal.  The point is to be able to figure out 
> > > > > whether or not
> > > > > we *want* to do the removal in the first place.
> > > > > 
> > > > > Say you have a computing node which signals a hardware problem in a 
> > > > > processor
> > > > > package (the container with CPU cores, memory, PCI host bridge etc.). 
> > > > >  You
> > > > > may want to eject that package, but you don't want to kill the system 
> > > > > this
> > > > > way.  So if the eject is doable, it is very much desirable to do it, 
> > > > > but if it
> > > > > is not doable, you'd rather shut the box down and do the replacement 
> > > > > afterward.
> > > > > That may be costly, however (maybe weeks of computations), so it 
> > > > > should be
> > > > > avoided if possible, but not at the expense of crashing the box if 
> > > > > the eject
> > > > > doesn't work out.
> > > > 
> > > > It seems to me that we could handle that with the help of a new flag, 
> > > > say
> > > > "no_eject", in struct device, a global mutex, and a function that will 
> > > > walk
> > > > the given subtree of the device hierarchy and check if "no_eject" is 
> > > > set for
> > > > any devices in there.  Plus a global "no_eject" switch, perhaps.
> > > 
> > > I think this will always be racy, or at worst, slow things down on
> > > normal device operations as you will always be having to grab this flag
> > > whenever you want to do something new.
> > 
> > I don't see why this particular scheme should be racy, at least I don't see 
> > any
> > obvious races in it (although I'm not that good at races detection in 
> > general,
> > admittedly).
> > 
> > Also, I don't expect that flag to be used for everything, just for things 
> > known
> > to seriously break if forcible eject is done.  That may be not precise 
> > enough,
> > so that's a matter of defining its purpose more precisely.
> > 
> > We can do something like that on the ACPI level (ie. introduce a no_eject 
> > flag
> > in struct acpi_device and provide an iterface for the layers above ACPI to
> > manipulate it) but then devices without ACPI namespace objects won't be
> > covered.  That may not be a big deal, though.
> 
> I am afraid that bringing the device status management into the ACPI
> level would not a good idea.  acpi_device should only reflect ACPI
> device object information, not how its actual device is being used.
> 
> I like your initiative of acpi_scan_driver and I think scanning /
> trimming of ACPI object info is what the ACPI drivers should do.

ACPI drivers, yes, but the users of ACPI already rely on information
in struct acpi_device.  Like ACPI device power states, for example.

So platform_no_eject(dev) is not much different in that respect from
platform_pci_set_power_state(pci_dev).

The whole "eject" concept is somewhat ACPI-specific, though, and the eject
notifications come from ACPI, so I don't have a problem with limiting it to
ACPI-backed devices for the time being.

If it turns out the be useful outside of ACPI, then we can move it up to the
driver core.  For now I don't see a compelling reason to do that.

Thanks,
Rafael


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


Re: [RFC PATCH v2 01/12] Add sys_hotplug.h for system device hotplug framework

2013-02-04 Thread Rafael J. Wysocki
On Monday, February 04, 2013 09:46:18 AM Toshi Kani wrote:
> On Mon, 2013-02-04 at 04:46 -0800, Greg KH wrote:
> > On Sun, Feb 03, 2013 at 05:28:09PM -0700, Toshi Kani wrote:
> > > On Sat, 2013-02-02 at 16:01 +0100, Greg KH wrote:
> > > > On Fri, Feb 01, 2013 at 01:40:10PM -0700, Toshi Kani wrote:
> > > > > On Fri, 2013-02-01 at 07:30 +, Greg KH wrote:
> > > > > > On Thu, Jan 31, 2013 at 06:32:18PM -0700, Toshi Kani wrote:
> > > > > >  > This is already done for PCI host bridges and platform devices 
> > > > > > and I don't
> > > > > > > > see why we can't do that for the other types of devices too.
> > > > > > > > 
> > > > > > > > The only missing piece I see is a way to handle the "eject" 
> > > > > > > > problem, i.e.
> > > > > > > > when we try do eject a device at the top of a subtree and need 
> > > > > > > > to tear down
> > > > > > > > the entire subtree below it, but if that's going to lead to a 
> > > > > > > > system crash,
> > > > > > > > for example, we want to cancel the eject.  It seems to me that 
> > > > > > > > we'll need some
> > > > > > > > help from the driver core here.
> > > > > > > 
> > > > > > > There are three different approaches suggested for system device
> > > > > > > hot-plug:
> > > > > > >  A. Proceed within system device bus scan.
> > > > > > >  B. Proceed within ACPI bus scan.
> > > > > > >  C. Proceed with a sequence (as a mini-boot).
> > > > > > > 
> > > > > > > Option A uses system devices as tokens, option B uses acpi 
> > > > > > > devices as
> > > > > > > tokens, and option C uses resource tables as tokens, for their 
> > > > > > > handlers.
> > > > > > > 
> > > > > > > Here is summary of key questions & answers so far.  I hope this
> > > > > > > clarifies why I am suggesting option 3.
> > > > > > > 
> > > > > > > 1. What are the system devices?
> > > > > > > System devices provide system-wide core computing resources, 
> > > > > > > which are
> > > > > > > essential to compose a computer system.  System devices are not
> > > > > > > connected to any particular standard buses.
> > > > > > 
> > > > > > Not a problem, lots of devices are not connected to any "particular
> > > > > > standard busses".  All this means is that system devices are 
> > > > > > connected
> > > > > > to the "system" bus, nothing more.
> > > > > 
> > > > > Can you give me a few examples of other devices that support hotplug 
> > > > > and
> > > > > are not connected to any particular buses?  I will investigate them to
> > > > > see how they are managed to support hotplug.
> > > > 
> > > > Any device that is attached to any bus in the driver model can be
> > > > hotunplugged from userspace by telling it to be "unbound" from the
> > > > driver controlling it.  Try it for any platform device in your system to
> > > > see how it happens.
> > > 
> > > The unbind operation, as I understand from you, is to detach a driver
> > > from a device.  Yes, unbinding can be done for any devices.  It is
> > > however different from hot-plug operation, which unplugs a device.
> > 
> > Physically, yes, but to the driver involved, and the driver core, there
> > is no difference.  That was one of the primary goals of the driver core
> > creation so many years ago.
> > 
> > > Today, the unbind operation to an ACPI cpu/memory devices causes
> > > hot-unplug (offline) operation to them, which is one of the major issues
> > > for us since unbind cannot fail.  This patchset addresses this issue by
> > > making the unbind operation of ACPI cpu/memory devices to do the
> > > unbinding only.  ACPI drivers no longer control cpu and memory as they
> > > are supposed to be controlled by their drivers, cpu and memory modules.
> > 
> > I think that's the problem right there, solve that, please.
> 
> We cannot eliminate the ACPI drivers since we have to scan ACPI.  But we
> can limit the ACPI drivers to do the scanning stuff only.   This is
> precisely the intend of this patchset.  The real stuff, removing actual
> devices, is done by the system device drivers/modules.

In case you haven't realized that yet, the $subject patchset has no future.

Let's just talk about how we can get what we need in more general terms.

Thanks,
Rafael


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


Re: [RFC PATCH v2 01/12] Add sys_hotplug.h for system device hotplug framework

2013-02-04 Thread Rafael J. Wysocki
On Monday, February 04, 2013 09:02:46 AM Toshi Kani wrote:
> On Mon, 2013-02-04 at 14:41 +0100, Rafael J. Wysocki wrote:
> > On Sunday, February 03, 2013 07:23:49 PM Greg KH wrote:
> > > On Sat, Feb 02, 2013 at 09:15:37PM +0100, Rafael J. Wysocki wrote:
> > > > On Saturday, February 02, 2013 03:58:01 PM Greg KH wrote:
>   :
> > > > Yes, but those are just remove events and we can only see how 
> > > > destructive they
> > > > were after the removal.  The point is to be able to figure out whether 
> > > > or not
> > > > we *want* to do the removal in the first place.
> > > 
> > > Yes, but, you will always race if you try to test to see if you can shut
> > > down a device and then trying to do it.  So walking the bus ahead of
> > > time isn't a good idea.
> > >
> > > And, we really don't have a viable way to recover if disconnect() fails,
> > > do we.  What do we do in that situation, restore the other devices we
> > > disconnected successfully?  How do we remember/know what they were?
> > > 
> > > PCI hotplug almost had this same problem until the designers finally
> > > realized that they just had to accept the fact that removing a PCI
> > > device could either happen by:
> > >   - a user yanking out the device, at which time the OS better
> > > clean up properly no matter what happens
> > >   - the user asked nicely to remove a device, and the OS can take
> > > as long as it wants to complete that action, including
> > > stalling for noticable amounts of time before eventually,
> > > always letting the action succeed.
> > > 
> > > I think the second thing is what you have to do here.  If a user tells
> > > the OS it wants to remove these devices, you better do it.  If you
> > > can't, because memory is being used by someone else, either move them
> > > off, or just hope that nothing bad happens, before the user gets
> > > frustrated and yanks out the CPU/memory module themselves physically :)
> > 
> > Well, that we can't help, but sometimes users really *want* the OS to tell 
> > them
> > if it is safe to unplug something at this particualr time (think about the
> > Windows' "safe remove" feature for USB sticks, for example; that came out of
> > users' demand AFAIR).
> > 
> > So in my opinion it would be good to give them an option to do "safe eject" 
> > or
> > "forcible eject", whichever they prefer.
> 
> For system device hot-plug, it always needs to be "safe eject".  This
> feature will be implemented on mission critical servers, which are
> managed by professional IT folks.  Crashing a server causes serious
> money to the business.

Well, "always" is a bit too strong a word as far as human behavior is concerned
in my opinion.

That said I would be perfectly fine with not supporting the "forcible eject" to
start with and waiting for the first request to add support for it.  I also
would be fine with taking bets on how much time it's going to take for such a
request to appear. :-)

Thanks,
Rafael


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


Re: [RFC PATCH v2 01/12] Add sys_hotplug.h for system device hotplug framework

2013-02-04 Thread Rafael J. Wysocki
On Monday, February 04, 2013 06:33:52 AM Greg KH wrote:
> On Mon, Feb 04, 2013 at 03:21:22PM +0100, Rafael J. Wysocki wrote:
> > On Monday, February 04, 2013 04:48:10 AM Greg KH wrote:
> > > On Sun, Feb 03, 2013 at 09:44:39PM +0100, Rafael J. Wysocki wrote:
> > > > > Yes, but those are just remove events and we can only see how 
> > > > > destructive they
> > > > > were after the removal.  The point is to be able to figure out 
> > > > > whether or not
> > > > > we *want* to do the removal in the first place.
> > > > > 
> > > > > Say you have a computing node which signals a hardware problem in a 
> > > > > processor
> > > > > package (the container with CPU cores, memory, PCI host bridge etc.). 
> > > > >  You
> > > > > may want to eject that package, but you don't want to kill the system 
> > > > > this
> > > > > way.  So if the eject is doable, it is very much desirable to do it, 
> > > > > but if it
> > > > > is not doable, you'd rather shut the box down and do the replacement 
> > > > > afterward.
> > > > > That may be costly, however (maybe weeks of computations), so it 
> > > > > should be
> > > > > avoided if possible, but not at the expense of crashing the box if 
> > > > > the eject
> > > > > doesn't work out.
> > > > 
> > > > It seems to me that we could handle that with the help of a new flag, 
> > > > say
> > > > "no_eject", in struct device, a global mutex, and a function that will 
> > > > walk
> > > > the given subtree of the device hierarchy and check if "no_eject" is 
> > > > set for
> > > > any devices in there.  Plus a global "no_eject" switch, perhaps.
> > > 
> > > I think this will always be racy, or at worst, slow things down on
> > > normal device operations as you will always be having to grab this flag
> > > whenever you want to do something new.
> > 
> > I don't see why this particular scheme should be racy, at least I don't see 
> > any
> > obvious races in it (although I'm not that good at races detection in 
> > general,
> > admittedly).
> > 
> > Also, I don't expect that flag to be used for everything, just for things 
> > known
> > to seriously break if forcible eject is done.  That may be not precise 
> > enough,
> > so that's a matter of defining its purpose more precisely.
> > 
> > We can do something like that on the ACPI level (ie. introduce a no_eject 
> > flag
> > in struct acpi_device and provide an iterface for the layers above ACPI to
> > manipulate it) but then devices without ACPI namespace objects won't be
> > covered.  That may not be a big deal, though.
> > 
> > So say dev is about to be used for something incompatible with ejecting, so 
> > to
> > speak.  Then, one would do platform_lock_eject(dev), which would check if 
> > dev
> > has an ACPI handle and then take acpi_eject_lock (if so).  The return value 
> > of
> > platform_lock_eject(dev) would need to be checked to see if the device is 
> > not
> > gone.  If it returns success (0), one would do something to the device and
> > call platform_no_eject(dev) and then platform_unlock_eject(dev).
> 
> How does a device "know" it is doing something that is incompatible with
> ejecting?  That's a non-trivial task from what I can tell.

I agree that this is complicated in general.  But.

There are devices known to have software "offline" and "online" operations
such that after the "offline" the given device is guaranteed to be not used
until "online".  We have that for CPU cores, for example, and user space can
do it via /sys/devices/system/cpu/cpuX/online .  So, why don't we make the
"online" set the no_eject flag (under the lock as appropriate) and the
"offline" clear it?  And why don't we define such "online" and "offline" for
all of the other "system" stuff, like memory, PCI host bridges etc. and make it
behave analogously?

Then, it is quite simple to say which devices should use the no_eject flag:
devices that have "online" and "offline" exported to user space.  And guess
who's responsible for "offlining" all of those things before trying to eject
them: user space is.  From the kernel's point of view it is all clear.  Hands
clean. :-)

Now, 

Re: [RFC PATCH v2 01/12] Add sys_hotplug.h for system device hotplug framework

2013-02-04 Thread Rafael J. Wysocki
On Monday, February 04, 2013 12:46:24 PM Toshi Kani wrote:
> On Mon, 2013-02-04 at 20:48 +0100, Rafael J. Wysocki wrote:
> > On Monday, February 04, 2013 09:02:46 AM Toshi Kani wrote:
> > > On Mon, 2013-02-04 at 14:41 +0100, Rafael J. Wysocki wrote:
> > > > On Sunday, February 03, 2013 07:23:49 PM Greg KH wrote:
> > > > > On Sat, Feb 02, 2013 at 09:15:37PM +0100, Rafael J. Wysocki wrote:
> > > > > > On Saturday, February 02, 2013 03:58:01 PM Greg KH wrote:
> > >   :
> > > > > > Yes, but those are just remove events and we can only see how 
> > > > > > destructive they
> > > > > > were after the removal.  The point is to be able to figure out 
> > > > > > whether or not
> > > > > > we *want* to do the removal in the first place.
> > > > > 
> > > > > Yes, but, you will always race if you try to test to see if you can 
> > > > > shut
> > > > > down a device and then trying to do it.  So walking the bus ahead of
> > > > > time isn't a good idea.
> > > > >
> > > > > And, we really don't have a viable way to recover if disconnect() 
> > > > > fails,
> > > > > do we.  What do we do in that situation, restore the other devices we
> > > > > disconnected successfully?  How do we remember/know what they were?
> > > > > 
> > > > > PCI hotplug almost had this same problem until the designers finally
> > > > > realized that they just had to accept the fact that removing a PCI
> > > > > device could either happen by:
> > > > >   - a user yanking out the device, at which time the OS better
> > > > > clean up properly no matter what happens
> > > > >   - the user asked nicely to remove a device, and the OS can take
> > > > > as long as it wants to complete that action, including
> > > > > stalling for noticable amounts of time before eventually,
> > > > > always letting the action succeed.
> > > > > 
> > > > > I think the second thing is what you have to do here.  If a user tells
> > > > > the OS it wants to remove these devices, you better do it.  If you
> > > > > can't, because memory is being used by someone else, either move them
> > > > > off, or just hope that nothing bad happens, before the user gets
> > > > > frustrated and yanks out the CPU/memory module themselves physically 
> > > > > :)
> > > > 
> > > > Well, that we can't help, but sometimes users really *want* the OS to 
> > > > tell them
> > > > if it is safe to unplug something at this particualr time (think about 
> > > > the
> > > > Windows' "safe remove" feature for USB sticks, for example; that came 
> > > > out of
> > > > users' demand AFAIR).
> > > > 
> > > > So in my opinion it would be good to give them an option to do "safe 
> > > > eject" or
> > > > "forcible eject", whichever they prefer.
> > > 
> > > For system device hot-plug, it always needs to be "safe eject".  This
> > > feature will be implemented on mission critical servers, which are
> > > managed by professional IT folks.  Crashing a server causes serious
> > > money to the business.
> > 
> > Well, "always" is a bit too strong a word as far as human behavior is 
> > concerned
> > in my opinion.
> > 
> > That said I would be perfectly fine with not supporting the "forcible 
> > eject" to
> > start with and waiting for the first request to add support for it.  I also
> > would be fine with taking bets on how much time it's going to take for such 
> > a
> > request to appear. :-)
> 
> Sounds good.  In my experience, though, it actually takes a LONG time to
> convince customers that "safe eject" is actually safe.  Enterprise
> customers are so afraid of doing anything risky that might cause the
> system to crash or hang due to some defect.  I would be very surprised
> to see a customer asking for a force operation when we do not guarantee
> its outcome.  I have not seen such enterprise customers yet.

But we're talking about a kernel that is supposed to run on mobile phones too,
among other things.

Thanks,
Rafael


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


Re: [RFC PATCH v2 01/12] Add sys_hotplug.h for system device hotplug framework

2013-02-04 Thread Rafael J. Wysocki
On Monday, February 04, 2013 01:34:18 PM Toshi Kani wrote:
> On Mon, 2013-02-04 at 21:12 +0100, Rafael J. Wysocki wrote:
> > On Monday, February 04, 2013 12:46:24 PM Toshi Kani wrote:
> > > On Mon, 2013-02-04 at 20:48 +0100, Rafael J. Wysocki wrote:
> > > > On Monday, February 04, 2013 09:02:46 AM Toshi Kani wrote:
> > > > > On Mon, 2013-02-04 at 14:41 +0100, Rafael J. Wysocki wrote:
> > > > > > On Sunday, February 03, 2013 07:23:49 PM Greg KH wrote:
> > > > > > > On Sat, Feb 02, 2013 at 09:15:37PM +0100, Rafael J. Wysocki wrote:
> > > > > > > > On Saturday, February 02, 2013 03:58:01 PM Greg KH wrote:
> > > > >   :
> > > > > > > > Yes, but those are just remove events and we can only see how 
> > > > > > > > destructive they
> > > > > > > > were after the removal.  The point is to be able to figure out 
> > > > > > > > whether or not
> > > > > > > > we *want* to do the removal in the first place.
> > > > > > > 
> > > > > > > Yes, but, you will always race if you try to test to see if you 
> > > > > > > can shut
> > > > > > > down a device and then trying to do it.  So walking the bus ahead 
> > > > > > > of
> > > > > > > time isn't a good idea.
> > > > > > >
> > > > > > > And, we really don't have a viable way to recover if disconnect() 
> > > > > > > fails,
> > > > > > > do we.  What do we do in that situation, restore the other 
> > > > > > > devices we
> > > > > > > disconnected successfully?  How do we remember/know what they 
> > > > > > > were?
> > > > > > > 
> > > > > > > PCI hotplug almost had this same problem until the designers 
> > > > > > > finally
> > > > > > > realized that they just had to accept the fact that removing a PCI
> > > > > > > device could either happen by:
> > > > > > >   - a user yanking out the device, at which time the OS better
> > > > > > > clean up properly no matter what happens
> > > > > > >   - the user asked nicely to remove a device, and the OS can take
> > > > > > > as long as it wants to complete that action, including
> > > > > > > stalling for noticable amounts of time before eventually,
> > > > > > > always letting the action succeed.
> > > > > > > 
> > > > > > > I think the second thing is what you have to do here.  If a user 
> > > > > > > tells
> > > > > > > the OS it wants to remove these devices, you better do it.  If you
> > > > > > > can't, because memory is being used by someone else, either move 
> > > > > > > them
> > > > > > > off, or just hope that nothing bad happens, before the user gets
> > > > > > > frustrated and yanks out the CPU/memory module themselves 
> > > > > > > physically :)
> > > > > > 
> > > > > > Well, that we can't help, but sometimes users really *want* the OS 
> > > > > > to tell them
> > > > > > if it is safe to unplug something at this particualr time (think 
> > > > > > about the
> > > > > > Windows' "safe remove" feature for USB sticks, for example; that 
> > > > > > came out of
> > > > > > users' demand AFAIR).
> > > > > > 
> > > > > > So in my opinion it would be good to give them an option to do 
> > > > > > "safe eject" or
> > > > > > "forcible eject", whichever they prefer.
> > > > > 
> > > > > For system device hot-plug, it always needs to be "safe eject".  This
> > > > > feature will be implemented on mission critical servers, which are
> > > > > managed by professional IT folks.  Crashing a server causes serious
> > > > > money to the business.
> > > > 
> > > > Well, "always" is a bit too strong a word as far as human behavior is 
> > > > concerned
> > > > in my opinion.
> > > > 
> > > > That said I would be perfectly fine with not supporti

Re: [RFC PATCH v2 01/12] Add sys_hotplug.h for system device hotplug framework

2013-02-04 Thread Rafael J. Wysocki
On Monday, February 04, 2013 01:59:27 PM Toshi Kani wrote:
> On Mon, 2013-02-04 at 20:45 +0100, Rafael J. Wysocki wrote:
> > On Monday, February 04, 2013 09:46:18 AM Toshi Kani wrote:
> > > On Mon, 2013-02-04 at 04:46 -0800, Greg KH wrote:
> > > > On Sun, Feb 03, 2013 at 05:28:09PM -0700, Toshi Kani wrote:
> > > > > On Sat, 2013-02-02 at 16:01 +0100, Greg KH wrote:
> > > > > > On Fri, Feb 01, 2013 at 01:40:10PM -0700, Toshi Kani wrote:
> > > > > > > On Fri, 2013-02-01 at 07:30 +, Greg KH wrote:
> > > > > > > > On Thu, Jan 31, 2013 at 06:32:18PM -0700, Toshi Kani wrote:
> > > > > > > >  > This is already done for PCI host bridges and platform 
> > > > > > > > devices and I don't
> > > > > > > > > > see why we can't do that for the other types of devices too.
> > > > > > > > > > 
> > > > > > > > > > The only missing piece I see is a way to handle the "eject" 
> > > > > > > > > > problem, i.e.
> > > > > > > > > > when we try do eject a device at the top of a subtree and 
> > > > > > > > > > need to tear down
> > > > > > > > > > the entire subtree below it, but if that's going to lead to 
> > > > > > > > > > a system crash,
> > > > > > > > > > for example, we want to cancel the eject.  It seems to me 
> > > > > > > > > > that we'll need some
> > > > > > > > > > help from the driver core here.
> > > > > > > > > 
> > > > > > > > > There are three different approaches suggested for system 
> > > > > > > > > device
> > > > > > > > > hot-plug:
> > > > > > > > >  A. Proceed within system device bus scan.
> > > > > > > > >  B. Proceed within ACPI bus scan.
> > > > > > > > >  C. Proceed with a sequence (as a mini-boot).
> > > > > > > > > 
> > > > > > > > > Option A uses system devices as tokens, option B uses acpi 
> > > > > > > > > devices as
> > > > > > > > > tokens, and option C uses resource tables as tokens, for 
> > > > > > > > > their handlers.
> > > > > > > > > 
> > > > > > > > > Here is summary of key questions & answers so far.  I hope 
> > > > > > > > > this
> > > > > > > > > clarifies why I am suggesting option 3.
> > > > > > > > > 
> > > > > > > > > 1. What are the system devices?
> > > > > > > > > System devices provide system-wide core computing resources, 
> > > > > > > > > which are
> > > > > > > > > essential to compose a computer system.  System devices are 
> > > > > > > > > not
> > > > > > > > > connected to any particular standard buses.
> > > > > > > > 
> > > > > > > > Not a problem, lots of devices are not connected to any 
> > > > > > > > "particular
> > > > > > > > standard busses".  All this means is that system devices are 
> > > > > > > > connected
> > > > > > > > to the "system" bus, nothing more.
> > > > > > > 
> > > > > > > Can you give me a few examples of other devices that support 
> > > > > > > hotplug and
> > > > > > > are not connected to any particular buses?  I will investigate 
> > > > > > > them to
> > > > > > > see how they are managed to support hotplug.
> > > > > > 
> > > > > > Any device that is attached to any bus in the driver model can be
> > > > > > hotunplugged from userspace by telling it to be "unbound" from the
> > > > > > driver controlling it.  Try it for any platform device in your 
> > > > > > system to
> > > > > > see how it happens.
> > > > > 
> > > > > The unbind operation, as I understand from you, is to detach a driver
> > > > > from a device.  Yes, unbinding can be done for any devi

Re: [RFC PATCH v2 01/12] Add sys_hotplug.h for system device hotplug framework

2013-02-04 Thread Rafael J. Wysocki
On Monday, February 04, 2013 03:13:29 PM Toshi Kani wrote:
> On Mon, 2013-02-04 at 21:07 +0100, Rafael J. Wysocki wrote:
> > On Monday, February 04, 2013 06:33:52 AM Greg KH wrote:
> > > On Mon, Feb 04, 2013 at 03:21:22PM +0100, Rafael J. Wysocki wrote:
> > > > On Monday, February 04, 2013 04:48:10 AM Greg KH wrote:
> > > > > On Sun, Feb 03, 2013 at 09:44:39PM +0100, Rafael J. Wysocki wrote:
> > > > > > > Yes, but those are just remove events and we can only see how 
> > > > > > > destructive they
> > > > > > > were after the removal.  The point is to be able to figure out 
> > > > > > > whether or not
> > > > > > > we *want* to do the removal in the first place.
> > > > > > > 
> > > > > > > Say you have a computing node which signals a hardware problem in 
> > > > > > > a processor
> > > > > > > package (the container with CPU cores, memory, PCI host bridge 
> > > > > > > etc.).  You
> > > > > > > may want to eject that package, but you don't want to kill the 
> > > > > > > system this
> > > > > > > way.  So if the eject is doable, it is very much desirable to do 
> > > > > > > it, but if it
> > > > > > > is not doable, you'd rather shut the box down and do the 
> > > > > > > replacement afterward.
> > > > > > > That may be costly, however (maybe weeks of computations), so it 
> > > > > > > should be
> > > > > > > avoided if possible, but not at the expense of crashing the box 
> > > > > > > if the eject
> > > > > > > doesn't work out.
> > > > > > 
> > > > > > It seems to me that we could handle that with the help of a new 
> > > > > > flag, say
> > > > > > "no_eject", in struct device, a global mutex, and a function that 
> > > > > > will walk
> > > > > > the given subtree of the device hierarchy and check if "no_eject" 
> > > > > > is set for
> > > > > > any devices in there.  Plus a global "no_eject" switch, perhaps.
> > > > > 
> > > > > I think this will always be racy, or at worst, slow things down on
> > > > > normal device operations as you will always be having to grab this 
> > > > > flag
> > > > > whenever you want to do something new.
> > > > 
> > > > I don't see why this particular scheme should be racy, at least I don't 
> > > > see any
> > > > obvious races in it (although I'm not that good at races detection in 
> > > > general,
> > > > admittedly).
> > > > 
> > > > Also, I don't expect that flag to be used for everything, just for 
> > > > things known
> > > > to seriously break if forcible eject is done.  That may be not precise 
> > > > enough,
> > > > so that's a matter of defining its purpose more precisely.
> > > > 
> > > > We can do something like that on the ACPI level (ie. introduce a 
> > > > no_eject flag
> > > > in struct acpi_device and provide an iterface for the layers above ACPI 
> > > > to
> > > > manipulate it) but then devices without ACPI namespace objects won't be
> > > > covered.  That may not be a big deal, though.
> > > > 
> > > > So say dev is about to be used for something incompatible with 
> > > > ejecting, so to
> > > > speak.  Then, one would do platform_lock_eject(dev), which would check 
> > > > if dev
> > > > has an ACPI handle and then take acpi_eject_lock (if so).  The return 
> > > > value of
> > > > platform_lock_eject(dev) would need to be checked to see if the device 
> > > > is not
> > > > gone.  If it returns success (0), one would do something to the device 
> > > > and
> > > > call platform_no_eject(dev) and then platform_unlock_eject(dev).
> > > 
> > > How does a device "know" it is doing something that is incompatible with
> > > ejecting?  That's a non-trivial task from what I can tell.
> > 
> > I agree that this is complicated in general.  But.
> > 
> > There are devices known to have software "offline" and "online" operations

Re: [RFC PATCH v2 01/12] Add sys_hotplug.h for system device hotplug framework

2013-02-04 Thread Rafael J. Wysocki
On Monday, February 04, 2013 04:04:47 PM Greg KH wrote:
> On Tue, Feb 05, 2013 at 12:52:30AM +0100, Rafael J. Wysocki wrote:
> > You'd probably never try to hot-remove a disk before unmounting filesystems
> > mounted from it or failing it as a RAID component and nobody sane wants the
> > kernel to do things like that automatically when the user presses the eject
> > button.  In my opinion we should treat memory eject, or CPU package eject, 
> > or
> > PCI host bridge eject in exactly the same way: Don't eject if it is not
> > prepared for ejecting in the first place.
> 
> Bad example, we have disks hot-removed all the time without any
> filesystems being unmounted, and have supported this since the 2.2 days
> (although we didn't get it "right" until 2.6.)

Well, that wasn't my point.

My point was that we have tools for unmounting filesystems from disks that
the user wants to hot-remove and the user is supposed to use those tools
before hot-removing the disks.  At least I wouldn't recommend anyone to
do otherwise. :-)

Now, for memory hot-removal we don't have anything like that, as far as I
can say, so my point was why don't we add memory "offline" that can be
done and tested separately from hot-removal and use that before we go and
hot-remove stuff?  And analogously for PCI host bridges etc.?

[Now, there's a question if an "eject" button on the system case, if there is
one, should *always* cause the eject to happen even though things are not
"offline".  My opinion is that not necessarily, because users may not be aware
that they are doing something wrong.

Quite analogously, does the power button always cause the system to shut down?
No.  So why the heck should an eject button always cause an eject to happen?
I see no reason.

That said, the most straightforward approach may be simply to let user space
disable eject events for specific devices when it wants and only enable them
when it knows that the given devices are ready for removal.

But I'm digressing.]

> PCI Host bridge eject is the same as PCI eject today, the user asks us
> to do it, and we can not fail it from happening.  We also can have them
> removed without us being told about it in the first place, and can
> properly clean up from it all.

Well, are you sure we'll always clean up?  I kind of have my doubts. :-)

> > And if you think about it, that makes things *massively* simpler, because 
> > now
> > the kernel doesn't heed to worry about all of those "synchronous removal"
> > scenarions that very well may involve every single device in the system and
> > the whole problem is nicely split into several separate "implement
> > offline/online" problems that are subsystem-specific and a single
> > "eject if everything relevant is offline" problem which is kind of trivial.
> > Plus the one of exposing information to user space, which is separate too.
> > 
> > Now, each of them can be worked on separately, *tested* separately and
> > debugged separately if need be and it is much easier to isolate failures
> > and so on.
> 
> So you are agreeing with me in that we can not fail hot removing any
> device, nice :)

That depends on how you define hot-removing.  If you regard the "offline"
as a separate operation that can be carried out independently and hot-remove
as the last step causing the device to actually go away, then I agree that
it can't fail.  The "offline" itself, however, is a different matter (pretty
much like unmounting a file system).

Thanks,
Rafael


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


Re: [RFC PATCH v2 01/12] Add sys_hotplug.h for system device hotplug framework

2013-02-05 Thread Rafael J. Wysocki
On Monday, February 04, 2013 04:04:47 PM Greg KH wrote:
> On Tue, Feb 05, 2013 at 12:52:30AM +0100, Rafael J. Wysocki wrote:
> > You'd probably never try to hot-remove a disk before unmounting filesystems
> > mounted from it or failing it as a RAID component and nobody sane wants the
> > kernel to do things like that automatically when the user presses the eject
> > button.  In my opinion we should treat memory eject, or CPU package eject, 
> > or
> > PCI host bridge eject in exactly the same way: Don't eject if it is not
> > prepared for ejecting in the first place.
> 
> Bad example, we have disks hot-removed all the time without any
> filesystems being unmounted, and have supported this since the 2.2 days
> (although we didn't get it "right" until 2.6.)

I actually don't think it is really bad, because it exposes the problem nicely.

Namely, there are two arguments that can be made here.  The first one is the
usability argument: Users should always be allowed to do what they want,
because it is [explicit content] annoying if software pretends to know better
what to do than the user (it is a convenience argument too, because usually
it's *easier* to allow users to do what they want).  The second one is the
data integrity argument: Operations that may lead to data loss should never
be carried out, because it is [explicit content] disappointing to lose valuable
stuff by a stupid mistake if software allows that mistake to be made (that also
may be costly in terms of real money).

You seem to believe that we should always follow the usability argument, while
Toshi seems to be thinking that (at least in the case of the "system" devices),
the data integrity argument is more important.  They are both valid arguments,
however, and they are in conflict, so this is a matter of balance.

You're saying that in the case of disks we always follow the usability argument
entirely.  I'm fine with that, although I suspect that some people may not be
considering this as the right balance.

Toshi seems to be thinking that for the hotplug of memory/CPUs/host bridges we
should always follow the data integrity argument entirely, because the users of
that feature value their data so much that they pretty much don't care about
usability.  That very well may be the case, so I'm fine with that too, although
I'm sure there are people who'll argue that this is not the right balance
either.

Now, the point is that we *can* do what Toshi is arguing for and that doesn't
seem to be overly complicated, so my question is: Why don't we do that, at
least to start with?  If it turns out eventually that the users care about
usability too, after all, we can add a switch to adjust things more to their
liking.  Still, we can very well do that later.

Thanks,
Rafael


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


Re: [RFC PATCH v2 01/12] Add sys_hotplug.h for system device hotplug framework

2013-02-05 Thread Rafael J. Wysocki
On Tuesday, February 05, 2013 10:39:48 AM Greg KH wrote:
> On Tue, Feb 05, 2013 at 12:11:17PM +0100, Rafael J. Wysocki wrote:
> > On Monday, February 04, 2013 04:04:47 PM Greg KH wrote:
> > > On Tue, Feb 05, 2013 at 12:52:30AM +0100, Rafael J. Wysocki wrote:
> > > > You'd probably never try to hot-remove a disk before unmounting 
> > > > filesystems
> > > > mounted from it or failing it as a RAID component and nobody sane wants 
> > > > the
> > > > kernel to do things like that automatically when the user presses the 
> > > > eject
> > > > button.  In my opinion we should treat memory eject, or CPU package 
> > > > eject, or
> > > > PCI host bridge eject in exactly the same way: Don't eject if it is not
> > > > prepared for ejecting in the first place.
> > > 
> > > Bad example, we have disks hot-removed all the time without any
> > > filesystems being unmounted, and have supported this since the 2.2 days
> > > (although we didn't get it "right" until 2.6.)
> > 
> > I actually don't think it is really bad, because it exposes the problem 
> > nicely.
> > 
> > Namely, there are two arguments that can be made here.  The first one is the
> > usability argument: Users should always be allowed to do what they want,
> > because it is [explicit content] annoying if software pretends to know 
> > better
> > what to do than the user (it is a convenience argument too, because usually
> > it's *easier* to allow users to do what they want).  The second one is the
> > data integrity argument: Operations that may lead to data loss should never
> > be carried out, because it is [explicit content] disappointing to lose 
> > valuable
> > stuff by a stupid mistake if software allows that mistake to be made (that 
> > also
> > may be costly in terms of real money).
> > 
> > You seem to believe that we should always follow the usability argument, 
> > while
> > Toshi seems to be thinking that (at least in the case of the "system" 
> > devices),
> > the data integrity argument is more important.  They are both valid 
> > arguments,
> > however, and they are in conflict, so this is a matter of balance.
> > 
> > You're saying that in the case of disks we always follow the usability 
> > argument
> > entirely.  I'm fine with that, although I suspect that some people may not 
> > be
> > considering this as the right balance.
> > 
> > Toshi seems to be thinking that for the hotplug of memory/CPUs/host bridges 
> > we
> > should always follow the data integrity argument entirely, because the 
> > users of
> > that feature value their data so much that they pretty much don't care about
> > usability.  That very well may be the case, so I'm fine with that too, 
> > although
> > I'm sure there are people who'll argue that this is not the right balance
> > either.
> > 
> > Now, the point is that we *can* do what Toshi is arguing for and that 
> > doesn't
> > seem to be overly complicated, so my question is: Why don't we do that, at
> > least to start with?  If it turns out eventually that the users care about
> > usability too, after all, we can add a switch to adjust things more to their
> > liking.  Still, we can very well do that later.
> 
> Ok, I'd much rather deal with reviewing actual implementations than
> talking about theory at this point in time, so let's see what you all
> can come up with next and I'll be glad to review it.

Sure, thanks a lot for your comments so far!

Rafael


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


Re: [PATCH 1/2] cpufreq: Notify all policy->cpus in cpufreq_notify_transition()

2013-03-24 Thread Rafael J. Wysocki
On Sunday, March 24, 2013 08:23:27 PM Viresh Kumar wrote:
> On 24 March 2013 20:07, Francesco Lavra  wrote:
> > On 03/24/2013 02:48 PM, Viresh Kumar wrote:
> >> policy->cpus contains all online cpus that have single shared clock line. 
> >> And
> >> their frequencies are always updated together.
> >>
> >> Many SMP system's cpufreq drivers take care of this in individual drivers 
> >> but
> >> the best place for this code is in cpufreq core.
> >>
> >> This patch modifies cpufreq_notify_transition() to notify frequency change 
> >> for
> >> all cpus in policy->cpus and hence updates all users of this API.
> 
> One thing about this work. I compiled it for ARM and Intel. Also this
> stuff is tested
> by "Fengguang Wu"  automated build system.
> 
> I am not sure if that builds all architectures or not.
> I tried to review my patch closely but their can be some minor mistakes.
> 
> I thought of adding this in the patch details but forgot at last.
> 
> Is their a simple way to compile stuff for all platforms? Sorry i am
> not aware of
> it :(
> 
> >> diff --git a/arch/blackfin/mach-common/cpufreq.c 
> >> b/arch/blackfin/mach-common/cpufreq.c
> >> + ret = cpu_set_cclk(policy->cpu, freqs.new * 1000);
> >> + if (ret != 0) {
> >> + WARN_ONCE(ret, "cpufreq set freq failed %d\n", ret);
> >> + break;
> >
> > This doesn't even compile, as the break statement isn't in the
> > for_each_online_cpu() loop anymore.
> 
> I tried to review it very carefully but this situation was a bit tricky :)
> Thanks for trying it out.
> 
> Following should fix it for you:
> 
> commit 942ca8a6bc87e3c42beabc9102755136493e5355
> Author: Viresh Kumar 
> Date:   Sun Mar 24 20:21:43 2013 +0530
> 
> fixup! cpufreq: Notify all policy->cpus in cpufreq_notify_transition()
> ---
>  arch/blackfin/mach-common/cpufreq.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/arch/blackfin/mach-common/cpufreq.c
> b/arch/blackfin/mach-common/cpufreq.c
> index 4e67368..995511e80 100644
> --- a/arch/blackfin/mach-common/cpufreq.c
> +++ b/arch/blackfin/mach-common/cpufreq.c
> @@ -164,7 +164,7 @@ static int bfin_target(struct cpufreq_policy *policy,
> ret = cpu_set_cclk(policy->cpu, freqs.new * 1000);
> if (ret != 0) {
>     WARN_ONCE(ret, "cpufreq set freq failed %d\n", ret);
> -   break;
> +   return ret;
> }
>  #endif
> on_each_cpu(bfin_adjust_core_timer, &index, 1);
> 
> 
> @Rafael: Let me add fixups for now, i will send final patch later after others
> also review their part.

OK


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


Re: [PATCH 9/9] powerpc: cpufreq: move cpufreq driver to drivers/cpufreq

2013-04-11 Thread Rafael J. Wysocki
On Thursday, April 11, 2013 12:32:50 PM Viresh Kumar wrote:
> On 11 April 2013 12:29, Olof Johansson  wrote:
> > That's not very nice this late in the staging cycle. Give the powerpc
> > guys a little more time than a single day to come back with test results,
> > please.
> 
> Yes we are waiting indefinitely for an Ack from powerpc guys.. If we can
> get a Ack soon, we will get it in 3.10 otherwise next cycle.

Well, I'll be traveling from Saturday morning onwards until the end of the
next week and my window for taking stuff into v3.10 closes in 16 hours.
Whatever is not ready (i.e. ACKed in this particular case) before that time,
won't be taken.

Thanks,
Rafael


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


Re: [PATCH v2 06/15] powerpc/85xx: add support to JOG feature using cpufreq interface

2013-04-21 Thread Rafael J. Wysocki
  &new);
> +
> + freqs.old = policy->cur;
> + freqs.new = mpc85xx_freqs[new].frequency;
> + freqs.cpu = policy->cpu;
> +
> + mutex_lock(&mpc85xx_switch_mutex);
> + cpufreq_notify_transition(&freqs, CPUFREQ_PRECHANGE);
> +
> + ret = set_pll(policy->cpu, mpc85xx_freqs[new].index);
> + if (!ret) {
> + pr_info("cpufreq: Setting core%d frequency to %d kHz and PLL 
> ratio to %d:2\n",
> +  policy->cpu, mpc85xx_freqs[new].frequency,
> +  mpc85xx_freqs[new].index);
> +
> + ppc_proc_freq = freqs.new * 1000ul;
> + }
> + cpufreq_notify_transition(&freqs, CPUFREQ_POSTCHANGE);
> + mutex_unlock(&mpc85xx_switch_mutex);
> +
> + return ret;
> +}
> +
> +static struct cpufreq_driver mpc85xx_cpufreq_driver = {
> + .verify = mpc85xx_cpufreq_verify,
> + .target = mpc85xx_cpufreq_target,
> + .init   = mpc85xx_cpufreq_cpu_init,
> + .exit   = mpc85xx_cpufreq_cpu_exit,
> + .name   = "mpc85xx-JOG",
> + .owner  = THIS_MODULE,
> + .flags  = CPUFREQ_CONST_LOOPS,
> +};
> +
> +static struct of_device_id mpc85xx_jog_ids[] = {
> + { .compatible = "fsl,mpc8536-guts", },
> + { .compatible = "fsl,p1022-guts", },
> + {}
> +};
> +
> +static int __init mpc85xx_jog_init(void)
> +{
> + struct device_node *np;
> + unsigned int svr;
> +
> + np = of_find_matching_node(NULL, mpc85xx_jog_ids);
> + if (!np)
> + return -ENODEV;
> +
> + guts = of_iomap(np, 0);
> + if (!guts) {
> + of_node_put(np);
> + return -ENODEV;
> + }
> +
> + sysfreq = fsl_get_sys_freq();
> +
> + if (of_device_is_compatible(np, "fsl,mpc8536-guts")) {
> + svr = mfspr(SPRN_SVR);
> + if ((svr & 0x7fff) == 0x10) {
> + pr_err("MPC8536 Rev 1.0 does not support 
> cpufreq(JOG).\n");
> + of_node_put(np);
> + return -ENODEV;
> + }
> + mpc85xx_freqs = mpc8536_freqs_table;
> + set_pll = mpc8536_set_pll;
> + max_pll[0] = get_pll(0);
> +
> + } else if (of_device_is_compatible(np, "fsl,p1022-guts")) {
> + mpc85xx_freqs = p1022_freqs_table;
> + set_pll = p1022_set_pll;
> + max_pll[0] = get_pll(0);
> + max_pll[1] = get_pll(1);
> + }
> +
> + pr_info("Freescale MPC85xx cpufreq(JOG) driver\n");
> +
> + of_node_put(np);
> + return cpufreq_register_driver(&mpc85xx_cpufreq_driver);
> +}
> +
> +device_initcall(mpc85xx_jog_init);
> 
-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


Re: [PATCH v2 1/2] powerpc/pm: add api to get suspend state which is STANDBY or MEM

2014-04-29 Thread Rafael J. Wysocki
On Tuesday, April 29, 2014 05:47:12 PM Scott Wood wrote:
> On Mon, 2014-04-28 at 13:53 +0800, Leo Li wrote:
> > On Sat, Apr 26, 2014 at 5:45 AM, Scott Wood  wrote:
> > > On Thu, 2014-04-24 at 14:11 +0800, Dongsheng Wang wrote:
> > >> From: Wang Dongsheng 
> > >>
> > >> Add set_pm_suspend_state & pm_suspend_state functions to set/get
> > >> suspend state. When system going to sleep or deep sleep, devices
> > >> can get the system suspend state(STANDBY/MEM) through pm_suspend_state
> > >> function and to handle different situations.
> > >>
> > >> Signed-off-by: Wang Dongsheng 
> > >> ---
> > >> *v2*
> > >> Move pm api from fsl platform to powerpc general framework.
> > >
> > > What is powerpc-specific about this?
> > 
> > Generally I agree with you.  But I had the discussion about this topic
> > a while ago with the PM maintainer.  He suggestion to go with the
> > platform way.
> > 
> > https://lkml.org/lkml/2013/8/16/505
> 
> If what he meant was whether you could do what this patch does, then you
> can answer him with, "No, because it got nacked as not being platform or
> arch specific."  Oh, and you're still using .valid as the hook to set
> the platform state, which is awful -- I think .begin is what you want to
> use.
> 
> If we did it in powerpc code, then what would we do on ARM?  Copy the
> code?  No.
> 
> Now, a more legitimate objection to putting it in generic code might be
> that "standby" and "mem" are loosely defined and the knowledge of how a
> driver should react to each is platform specific -- but your patch
> doesn't address that.  You still have the driver itself interpret what
> "standby" and "mem" mean.
> 
> As for "in analogy with ACPI suspend operations", can someone familiar
> with ACPI explain what is meant?

ACPI defines sleep states S3 and S1 which are mappend to "mem" and
"standby" currently, but that may change in the future.

Generally speaking the meaning of "mem" and "standby" is platform-specific
except that "mem" should be a deeper (lower-power) sleep state than "standby".

Thanks!

-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH v2 0/9] cpuidle: rework device state count handling

2014-01-06 Thread Rafael J. Wysocki
On Friday, December 20, 2013 07:47:22 PM Bartlomiej Zolnierkiewicz wrote:
> Hi,
> 
> Some cpuidle drivers assume that cpuidle core will handle cases where
> device->state_count is smaller than driver->state_count, unfortunately
> currently this is untrue (device->state_count is used only for handling
> cpuidle state sysfs entries and driver->state_count is used for all
> other cases) and will not be fixed in the future as device->state_count
> is planned to be removed [1].
> 
> This patchset fixes such drivers (ARM EXYNOS cpuidle driver and ACPI
> cpuidle driver), removes superflous device->state_count initialization
> from drivers for which device->state_count equals driver->state_count
> (POWERPC pseries cpuidle driver and intel_idle driver) and finally
> removes state_count field from struct cpuidle_device.
> 
> Additionaly (while at it) this patchset fixes C1E promotion disable
> quirk handling (in intel_idle driver) and converts cpuidle drivers code
> to use the common cpuidle_[un]register() routines (in POWERPC pseries
> cpuidle driver and intel_idle driver).
> 
> [1] http://permalink.gmane.org/gmane.linux.power-management.general/36908
> 
> Reference to v1:
>   http://comments.gmane.org/gmane.linux.power-management.general/37390
> 
> Changes since v1:
> - synced patch series with next-20131220
> - added ACKs from Daniel Lezcano

I've queued up the series for 3.14, thanks!

> Best regards,
> --
> Bartlomiej Zolnierkiewicz
> Samsung R&D Institute Poland
> Samsung Electronics
> 
> 
> Bartlomiej Zolnierkiewicz (9):
>   ARM: EXYNOS: cpuidle: fix AFTR mode check
>   POWERPC: pseries: cpuidle: remove superfluous dev->state_count
> initialization
>   POWERPC: pseries: cpuidle: use the common cpuidle_[un]register()
> routines
>   ACPI / cpuidle: fix max idle state handling with hotplug CPU support
>   ACPI / cpuidle: remove dev->state_count setting
>   intel_idle: do C1E promotion disable quirk for hotplugged CPUs
>   intel_idle: remove superfluous dev->state_count initialization
>   intel_idle: use the common cpuidle_[un]register() routines
>   cpuidle: remove state_count field from struct cpuidle_device
> 
>  arch/arm/mach-exynos/cpuidle.c  |   8 +-
>  arch/powerpc/platforms/pseries/processor_idle.c |  59 +-
>  drivers/acpi/processor_idle.c   |  29 +++--
>  drivers/cpuidle/cpuidle.c   |   3 -
>  drivers/cpuidle/sysfs.c |   5 +-
>  drivers/idle/intel_idle.c               | 140 
> +---
>  include/linux/cpuidle.h |   1 -
>  7 files changed, 51 insertions(+), 194 deletions(-)
> 
> 

-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


Re: [RFC] linux/pci: move pci_platform_pm_ops to linux/pci.h

2014-01-06 Thread Rafael J. Wysocki
On Friday, December 20, 2013 09:42:59 AM Bjorn Helgaas wrote:
> On Fri, Dec 20, 2013 at 3:03 AM, Dongsheng Wang
>  wrote:
> > From: Wang Dongsheng 
> >
> > make Freescale platform use pci_platform_pm_ops struct.
> 
> This changelog doesn't say anything about what the patch does.
> 
> I infer that you want to use pci_platform_pm_ops from some Freescale
> code.  This patch should be posted along with the patches that add
> that Freescale code, so we can see how you intend to use it.
> 
> The existing use is in drivers/pci/pci-acpi.c, so it's possible that
> your new use should be added in the same way, in drivers/pci, so we
> don't have to make pci_platform_pm_ops part of the public PCI
> interface in include/linux/pci.h.
> 
> That said, if Raphael thinks this makes sense, it's OK with me.

Well, I'd like to know why exactly the change is needed in the first place.

Thanks!

-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


Re: [PATCH 2/2] fsl/pci: The new pci suspend/resume implementation

2014-01-07 Thread Rafael J. Wysocki
  list_for_each_entry_safe(hose, tmp, &hose_list, list_node)
> + fsl_pci_syscore_do_suspend(hose);
>  
>   return 0;
>  }
>  
> -static const struct dev_pm_ops pci_pm_ops = {
> - .resume = fsl_pci_resume,
> -};
> +static void fsl_pci_syscore_do_resume(struct pci_controller *hose)
> +{
> + struct ccsr_pci __iomem *pci = hose->private_data;
> + u32 dr;
> + int i;
>  
> -#define PCI_PM_OPS (&pci_pm_ops)
> + /* Send Exit L2 State Message */
> + setbits32(&pci->pex_pmcr, PEX_PMCR_EXL2S);
>  
> -#else
> + /* wait exit done */
> + for (i = 0; i < 150; i++) {
> + dr = in_be32(&pci->pex_pme_mes_dr);
> + if (dr) {
> + out_be32(&pci->pex_pme_mes_dr, dr);
> + break;
> + } else {
> + udelay(1000);
> + }
> + }
> +
> + setup_pci_atmu(hose);
> +}
>  
> -#define PCI_PM_OPS NULL
> +static void fsl_pci_syscore_resume(void)
> +{
> + struct pci_controller *hose, *tmp;
> +
> + list_for_each_entry_safe(hose, tmp, &hose_list, list_node)
> + fsl_pci_syscore_do_resume(hose);
> +}
>  
> +static struct syscore_ops pci_syscore_pm_ops = {
> + .suspend = fsl_pci_syscore_suspend,
> + .resume = fsl_pci_syscore_resume,
> +};
>  #endif
>  
> +void fsl_pcibios_fixup_phb(struct pci_controller *phb)
> +{
> +#ifdef CONFIG_PM
> + fsl_pci_pme_probe(phb);
> +#endif
> +}
> +
> +static int fsl_pci_probe(struct platform_device *pdev)
> +{
> + int ret;
> + struct device_node *node;
> +
> + node = pdev->dev.of_node;
> + ret = fsl_add_bridge(pdev, fsl_pci_primary == node);
> +
> + mpc85xx_pci_err_probe(pdev);
> +
> + return 0;
> +}
> +
>  static struct platform_driver fsl_pci_driver = {
>   .driver = {
>   .name = "fsl-pci",
> - .pm = PCI_PM_OPS,
>   .of_match_table = pci_ids,
>   },
>   .probe = fsl_pci_probe,
> @@ -1141,6 +1256,9 @@ static struct platform_driver fsl_pci_driver = {
>  
>  static int __init fsl_pci_init(void)
>  {
> +#ifdef CONFIG_PM
> + register_syscore_ops(&pci_syscore_pm_ops);
> +#endif
>   return platform_driver_register(&fsl_pci_driver);
>  }
>  arch_initcall(fsl_pci_init);
> diff --git a/arch/powerpc/sysdev/fsl_pci.h b/arch/powerpc/sysdev/fsl_pci.h
> index 8d455df..c1cec77 100644
> --- a/arch/powerpc/sysdev/fsl_pci.h
> +++ b/arch/powerpc/sysdev/fsl_pci.h
> @@ -32,6 +32,13 @@ struct platform_device;
>  #define PIWAR_WRITE_SNOOP0x5000
>  #define PIWAR_SZ_MASK  0x003f
>  
> +#define PEX_PMCR_PTOMR   0x1
> +#define PEX_PMCR_EXL2S   0x2
> +
> +#define PME_DISR_EN_PTOD 0x8000
> +#define PME_DISR_EN_ENL23D   0x2000
> +#define PME_DISR_EN_EXL23D   0x1000
> +
>  /* PCI/PCI Express outbound window reg */
>  struct pci_outbound_window_regs {
>   __be32  potar;  /* 0x.0 - Outbound translation address register */
> @@ -111,6 +118,7 @@ struct ccsr_pci {
>  
>  extern int fsl_add_bridge(struct platform_device *pdev, int is_primary);
>  extern void fsl_pcibios_fixup_bus(struct pci_bus *bus);
> +extern void fsl_pcibios_fixup_phb(struct pci_controller *phb);
>  extern int mpc83xx_add_bridge(struct device_node *dev);
>  u64 fsl_pci_immrbar_base(struct pci_controller *hose);
>  
> 

-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


Re: [PATCH v2 0/9] cpuidle: rework device state count handling

2014-01-10 Thread Rafael J. Wysocki
On Friday, December 20, 2013 07:47:22 PM Bartlomiej Zolnierkiewicz wrote:
> Hi,
> 
> Some cpuidle drivers assume that cpuidle core will handle cases where
> device->state_count is smaller than driver->state_count, unfortunately
> currently this is untrue (device->state_count is used only for handling
> cpuidle state sysfs entries and driver->state_count is used for all
> other cases) and will not be fixed in the future as device->state_count
> is planned to be removed [1].
> 
> This patchset fixes such drivers (ARM EXYNOS cpuidle driver and ACPI
> cpuidle driver), removes superflous device->state_count initialization
> from drivers for which device->state_count equals driver->state_count
> (POWERPC pseries cpuidle driver and intel_idle driver) and finally
> removes state_count field from struct cpuidle_device.
> 
> Additionaly (while at it) this patchset fixes C1E promotion disable
> quirk handling (in intel_idle driver) and converts cpuidle drivers code
> to use the common cpuidle_[un]register() routines (in POWERPC pseries
> cpuidle driver and intel_idle driver).
> 
> [1] http://permalink.gmane.org/gmane.linux.power-management.general/36908
> 
> Reference to v1:
>   http://comments.gmane.org/gmane.linux.power-management.general/37390
> 
> Changes since v1:
> - synced patch series with next-20131220
> - added ACKs from Daniel Lezcano

This series breaks boot on one of my test machines with intel_idle, so I'm
not sure how well it has been tested.

I've dropped it entirely for now.  If I have the time, I will try to identify
the root cause of the failure, but that may not happen before the merge window.
Sorry about that.

Thanks!

-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


Re: [PATCH v2 0/9] cpuidle: rework device state count handling

2014-01-13 Thread Rafael J. Wysocki
On Saturday, January 11, 2014 01:37:29 AM Rafael J. Wysocki wrote:
> On Friday, December 20, 2013 07:47:22 PM Bartlomiej Zolnierkiewicz wrote:
> > Hi,
> > 
> > Some cpuidle drivers assume that cpuidle core will handle cases where
> > device->state_count is smaller than driver->state_count, unfortunately
> > currently this is untrue (device->state_count is used only for handling
> > cpuidle state sysfs entries and driver->state_count is used for all
> > other cases) and will not be fixed in the future as device->state_count
> > is planned to be removed [1].
> > 
> > This patchset fixes such drivers (ARM EXYNOS cpuidle driver and ACPI
> > cpuidle driver), removes superflous device->state_count initialization
> > from drivers for which device->state_count equals driver->state_count
> > (POWERPC pseries cpuidle driver and intel_idle driver) and finally
> > removes state_count field from struct cpuidle_device.
> > 
> > Additionaly (while at it) this patchset fixes C1E promotion disable
> > quirk handling (in intel_idle driver) and converts cpuidle drivers code
> > to use the common cpuidle_[un]register() routines (in POWERPC pseries
> > cpuidle driver and intel_idle driver).
> > 
> > [1] http://permalink.gmane.org/gmane.linux.power-management.general/36908
> > 
> > Reference to v1:
> > http://comments.gmane.org/gmane.linux.power-management.general/37390
> > 
> > Changes since v1:
> > - synced patch series with next-20131220
> > - added ACKs from Daniel Lezcano
> 
> This series breaks boot on one of my test machines with intel_idle, so I'm
> not sure how well it has been tested.
> 
> I've dropped it entirely for now.  If I have the time, I will try to identify
> the root cause of the failure, but that may not happen before the merge 
> window.
> Sorry about that.

The breakage was introduced by patch [8/9], so I've re-applied patches [1-7/9]
from this series.  Please refer to Fengguang's report [1] for the breakage
details.

Thanks!

[1] http://marc.info/?l=linux-kernel&m=138964167909907&w=2

-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


Re: [PATCH V3 3/3] time/cpuidle:Handle failed call to BROADCAST_ENTER on archs with CPUIDLE_FLAG_TIMER_STOP set

2014-02-06 Thread Rafael J. Wysocki
On Thursday, February 06, 2014 11:20:37 AM Preeti U Murthy wrote:
> Some archs set the CPUIDLE_FLAG_TIMER_STOP flag for idle states in which the
> local timers stop. The cpuidle_idle_call() currently handles such idle states
> by calling into the broadcast framework so as to wakeup CPUs at their next
> wakeup event. With the hrtimer mode of broadcast, the BROADCAST_ENTER call
> into the broadcast frameowork can fail for archs that do not have an external
> clock device to handle wakeups and the CPU in question has to thus be made
> the stand by CPU. This patch handles such cases by failing the call into
> cpuidle so that the arch can take some default action. The arch will certainly
> not enter a similar idle state because a failed cpuidle call will also 
> implicitly
> indicate that the broadcast framework has not registered this CPU to be woken 
> up.
> Hence we are safe if we fail the cpuidle call.
> 
> In the process move the functions that trace idle statistics just before and
> after the entry and exit into idle states respectively. In other
> scenarios where the call to cpuidle fails, we end up not tracing idle
> entry and exit since a decision on an idle state could not be taken. Similarly
> when the call to broadcast framework fails, we skip tracing idle statistics
> because we are in no further position to take a decision on an alternative
> idle state to enter into.
> 
> Signed-off-by: Preeti U Murthy 

The cpuidle changes in this patch look reasonable to me, but I guess you'd
like it to go in via tip along with the rest of the series, so

Acked-by: Rafael J. Wysocki 

> ---
> 
>  drivers/cpuidle/cpuidle.c |   38 +++---
>  1 file changed, 23 insertions(+), 15 deletions(-)
> 
> diff --git a/drivers/cpuidle/cpuidle.c b/drivers/cpuidle/cpuidle.c
> index a55e68f..8f42033 100644
> --- a/drivers/cpuidle/cpuidle.c
> +++ b/drivers/cpuidle/cpuidle.c
> @@ -117,15 +117,19 @@ int cpuidle_idle_call(void)
>  {
>   struct cpuidle_device *dev = __this_cpu_read(cpuidle_devices);
>   struct cpuidle_driver *drv;
> - int next_state, entered_state;
> - bool broadcast;
> + int next_state, entered_state, ret = 0;
> + bool broadcast = false;
>  
> - if (off || !initialized)
> - return -ENODEV;
> + if (off || !initialized) {
> + ret = -ENODEV;
> + goto out;
> + }
>  
>   /* check if the device is ready */
> - if (!dev || !dev->enabled)
> - return -EBUSY;
> + if (!dev || !dev->enabled) {
> + ret = -EBUSY;
> + goto out;
> + }
>  
>   drv = cpuidle_get_cpu_driver(dev);
>  
> @@ -137,15 +141,18 @@ int cpuidle_idle_call(void)
>   if (cpuidle_curr_governor->reflect)
>   cpuidle_curr_governor->reflect(dev, next_state);
>   local_irq_enable();
> - return 0;
> + goto out;
>   }
>  
> - trace_cpu_idle_rcuidle(next_state, dev->cpu);
> -
>   broadcast = !!(drv->states[next_state].flags & CPUIDLE_FLAG_TIMER_STOP);
>  
> - if (broadcast)
> - clockevents_notify(CLOCK_EVT_NOTIFY_BROADCAST_ENTER, &dev->cpu);
> + if (broadcast) {
> + ret = clockevents_notify(CLOCK_EVT_NOTIFY_BROADCAST_ENTER, 
> &dev->cpu);
> + if (ret)
> + goto out;
> + }
> +
> + trace_cpu_idle_rcuidle(next_state, dev->cpu);
>  
>   if (cpuidle_state_is_coupled(dev, drv, next_state))
>   entered_state = cpuidle_enter_state_coupled(dev, drv,
> @@ -153,16 +160,17 @@ int cpuidle_idle_call(void)
>   else
>   entered_state = cpuidle_enter_state(dev, drv, next_state);
>  
> - if (broadcast)
> - clockevents_notify(CLOCK_EVT_NOTIFY_BROADCAST_EXIT, &dev->cpu);
> -
>   trace_cpu_idle_rcuidle(PWR_EVENT_EXIT, dev->cpu);
>  
>   /* give the governor an opportunity to reflect on the outcome */
>   if (cpuidle_curr_governor->reflect)
>   cpuidle_curr_governor->reflect(dev, entered_state);
>  
> - return 0;
> +out: if (broadcast)
> +         clockevents_notify(CLOCK_EVT_NOTIFY_BROADCAST_EXIT, &dev->cpu);
> +
> +
> + return ret;
>  }
>  
>  /**
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-pm" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


Re: [PATCH V4 3/3] time/cpuidle:Handle failed call to BROADCAST_ENTER on archs with CPUIDLE_FLAG_TIMER_STOP set

2014-02-07 Thread Rafael J. Wysocki
On Friday, February 07, 2014 01:36:52 PM Preeti U Murthy wrote:
> Some archs set the CPUIDLE_FLAG_TIMER_STOP flag for idle states in which the
> local timers stop. The cpuidle_idle_call() currently handles such idle states
> by calling into the broadcast framework so as to wakeup CPUs at their next
> wakeup event. With the hrtimer mode of broadcast, the BROADCAST_ENTER call
> into the broadcast frameowork can fail for archs that do not have an external
> clock device to handle wakeups and the CPU in question has to thus be made
> the stand by CPU. This patch handles such cases by failing the call into
> cpuidle so that the arch can take some default action. The arch will certainly
> not enter a similar idle state because a failed cpuidle call will also 
> implicitly
> indicate that the broadcast framework has not registered this CPU to be woken 
> up.
> Hence we are safe if we fail the cpuidle call.
> 
> In the process move the functions that trace idle statistics just before and
> after the entry and exit into idle states respectively. In other
> scenarios where the call to cpuidle fails, we end up not tracing idle
> entry and exit since a decision on an idle state could not be taken. Similarly
> when the call to broadcast framework fails, we skip tracing idle statistics
> because we are in no further position to take a decision on an alternative
> idle state to enter into.
> 
> Signed-off-by: Preeti U Murthy 

Acked-by: Rafael J. Wysocki 

> ---
> 
>  drivers/cpuidle/cpuidle.c |   14 --
>  1 file changed, 8 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/cpuidle/cpuidle.c b/drivers/cpuidle/cpuidle.c
> index a55e68f..8beb0f02 100644
> --- a/drivers/cpuidle/cpuidle.c
> +++ b/drivers/cpuidle/cpuidle.c
> @@ -140,12 +140,14 @@ int cpuidle_idle_call(void)
>   return 0;
>   }
>  
> - trace_cpu_idle_rcuidle(next_state, dev->cpu);
> -
>   broadcast = !!(drv->states[next_state].flags & CPUIDLE_FLAG_TIMER_STOP);
>  
> - if (broadcast)
> - clockevents_notify(CLOCK_EVT_NOTIFY_BROADCAST_ENTER, &dev->cpu);
> + if (broadcast &&
> + clockevents_notify(CLOCK_EVT_NOTIFY_BROADCAST_ENTER, &dev->cpu))
> + return -EBUSY;
> +
> +
> + trace_cpu_idle_rcuidle(next_state, dev->cpu);
>  
>   if (cpuidle_state_is_coupled(dev, drv, next_state))
>   entered_state = cpuidle_enter_state_coupled(dev, drv,
> @@ -153,11 +155,11 @@ int cpuidle_idle_call(void)
>   else
>   entered_state = cpuidle_enter_state(dev, drv, next_state);
>  
> + trace_cpu_idle_rcuidle(PWR_EVENT_EXIT, dev->cpu);
> +
>   if (broadcast)
>   clockevents_notify(CLOCK_EVT_NOTIFY_BROADCAST_EXIT, &dev->cpu);
>  
> - trace_cpu_idle_rcuidle(PWR_EVENT_EXIT, dev->cpu);
> -
>   /* give the governor an opportunity to reflect on the outcome */
>   if (cpuidle_curr_governor->reflect)
>   cpuidle_curr_governor->reflect(dev, entered_state);
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-pm" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


Re: [PATCH 1/6] PCI, acpiphp: Use list_for_each_entry() for bus traversal

2014-02-13 Thread Rafael J. Wysocki
On Thursday, February 13, 2014 09:13:58 PM Yijing Wang wrote:
> Replace list_for_each() + pci_bus_b() with the simpler
> list_for_each_entry().
> 
> Signed-off-by: Yijing Wang 

Looks reasonable to me.

Does it conflict with anything currently in linux-next (the linux-next branch
of linux-pm.git in particular)?

> ---
>  drivers/pci/hotplug/acpiphp_glue.c |6 +++---
>  1 files changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/pci/hotplug/acpiphp_glue.c 
> b/drivers/pci/hotplug/acpiphp_glue.c
> index cd929ae..aee6a0a 100644
> --- a/drivers/pci/hotplug/acpiphp_glue.c
> +++ b/drivers/pci/hotplug/acpiphp_glue.c
> @@ -450,7 +450,7 @@ static void cleanup_bridge(struct acpiphp_bridge *bridge)
>   */
>  static unsigned char acpiphp_max_busnr(struct pci_bus *bus)
>  {
> - struct list_head *tmp;
> + struct pci_bus *tmp;
>   unsigned char max, n;
>  
>   /*
> @@ -463,8 +463,8 @@ static unsigned char acpiphp_max_busnr(struct pci_bus 
> *bus)
>*/
>   max = bus->busn_res.start;
>  
> - list_for_each(tmp, &bus->children) {
> - n = pci_bus_max_busnr(pci_bus_b(tmp));
> + list_for_each_entry(tmp, &bus->children, node) {
> + n = pci_bus_max_busnr(tmp);
>       if (n > max)
>   max = n;
>   }
> 

-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


  1   2   3   4   >