Re: [PATCH] cpuidle: reinitialize power_usage values when adding/removing C-states
> This is specific to the acpi and should be handled in the > processor_idle.c file instead of the cpuidle core code. > > Could be the function 'acpi_processor_cst_has_changed' the right place > to set a dummy power value for the power in the new C-state ? Thanks for your feedback. I think it wouldn't be wise to split the dummy power value logic over two places, but I could submit a patch that makes set_power_states globally accessible and calls it from acpi_processor_cst_has_changed instead. However, I do not think this should really be ACPI specific. It applies to any cpuidle driver that wants to change its idle states at runtime. Currently only the ACPI one does, but the future might bring others that would run into the same problem. I also think that set_power_states fits much better into cpuidle_enable_device conceptually anyway (right next to poll_idle_init which also does state initialization). Let me know what you think. ___ linaro-dev mailing list linaro-dev@lists.linaro.org http://lists.linaro.org/mailman/listinfo/linaro-dev
[PATCH] acpi/cpuidle: reinitialize power_usage values when adding/removing C-states
When cpuidle drivers do not supply explicit power_usage values, cpuidle/driver.c inserts dummy values instead. When a running processor dynamically gains new C-states (e.g. after ACPI events), the power_usage values of those states will stay uninitialized, and cpuidle governors will never choose to enter them. This patch ensures that the ACPI cpuidle driver sets those dummy power values itself whenever it (re-)initializes its idle states. Tested and verified on an Acer AC700 Chromebook, which supports the C3 state only when it switches from AC to battery power. Signed-off-by: Julius Werner --- drivers/acpi/processor_idle.c |3 +++ 1 files changed, 3 insertions(+), 0 deletions(-) diff --git a/drivers/acpi/processor_idle.c b/drivers/acpi/processor_idle.c index e8086c7..078e22f 100644 --- a/drivers/acpi/processor_idle.c +++ b/drivers/acpi/processor_idle.c @@ -1090,6 +1090,9 @@ static int acpi_processor_setup_cpuidle_states(struct acpi_processor *pr) state->exit_latency = cx->latency; state->target_residency = cx->latency * latency_factor; + /* reinitialize this in case new states are added after boot */ + state->power_usage = -1 - count; + state->flags = 0; switch (cx->type) { case ACPI_STATE_C1: -- 1.7.8.6 ___ linaro-dev mailing list linaro-dev@lists.linaro.org http://lists.linaro.org/mailman/listinfo/linaro-dev
Re: [PATCH] acpi/cpuidle: reinitialize power_usage values when adding/removing C-states
> Could we just say this is always true because state[i+1] consumes less > than state[i] ? > > And then just remove the 'set_power_state' function, and the field > 'driver->power_specified' ? > > That will cleanup the code and fix this problem, no ? I totally agree with your analysis. Even if a driver were to set proper usage values (and the power_specified bit), none of the existing governors would care about those actual numbers (and since the vast majority of drivers uses fake values anyway, this is not likely to change in the future). This seems to be a classic example of unnecessary over-engineering. I am mostly interested in getting that bug fixed right now, but removing unnecessary code is always a good thing. If you think it would have a good chance of getting merged, I would be happy to draft up a larger patch that refactors power_usage away completely. ___ linaro-dev mailing list linaro-dev@lists.linaro.org http://lists.linaro.org/mailman/listinfo/linaro-dev
Re: [RFC] cpuidle - remove the power_specified field in the driver
Thanks for moving this along, Daniel. I think this is the right approach... the cpuidle driver shouldn't be more complex than necessary. Note that you are starting your loop too high in cpuidle_play_dead... states[state_count] is not an actual state anymore, it should start at state_count - 1. Also, I think you can go ahead and do the same last-to-first loop transformation with immediate return in the menu governor, for an extra tiny bit of performance. On Mon, Nov 12, 2012 at 12:26 PM, Daniel Lezcano wrote: > This patch follows the discussion about reinitializing the power usage > when a C-state is added/removed. > > https://lkml.org/lkml/2012/10/16/518 > > We realized the power usage field is never filled and when it is > filled for tegra, the power_specified flag is not set making all these > values to be resetted when the driver is initialized with the set_power_state > function. > > Julius and I feel this is over-engineered and the power_specified > flag could be simply removed and continue assuming the states are > backward sorted. > > The menu governor select function is simplified as the power is ordered. > Actually the condition is always true with the current code. > > The cpuidle_play_dead function is also simplified by doing a reverse lookup > on the array. > > The set_power_states function is removed as it does no make sense anymore. > > Signed-off-by: Daniel Lezcano > --- > drivers/cpuidle/cpuidle.c| 17 - > drivers/cpuidle/driver.c | 25 - > drivers/cpuidle/governors/menu.c |8 ++-- > include/linux/cpuidle.h |2 +- > 4 files changed, 7 insertions(+), 45 deletions(-) > > diff --git a/drivers/cpuidle/cpuidle.c b/drivers/cpuidle/cpuidle.c > index 711dd83..f983262 100644 > --- a/drivers/cpuidle/cpuidle.c > +++ b/drivers/cpuidle/cpuidle.c > @@ -69,24 +69,15 @@ int cpuidle_play_dead(void) > { > struct cpuidle_device *dev = __this_cpu_read(cpuidle_devices); > struct cpuidle_driver *drv = cpuidle_get_cpu_driver(dev); > - int i, dead_state = -1; > - int power_usage = -1; > + int i; > > if (!drv) > return -ENODEV; > > /* Find lowest-power state that supports long-term idle */ > - for (i = CPUIDLE_DRIVER_STATE_START; i < drv->state_count; i++) { > - struct cpuidle_state *s = &drv->states[i]; > - > - if (s->power_usage < power_usage && s->enter_dead) { > - power_usage = s->power_usage; > - dead_state = i; > - } > - } > - > - if (dead_state != -1) > - return drv->states[dead_state].enter_dead(dev, dead_state); > + for (i = drv->state_count; i >= CPUIDLE_DRIVER_STATE_START; i--) > + if (drv->states[i].play_dead) > + return drv->states[i].enter_dead(dev, i); > > return -ENODEV; > } > diff --git a/drivers/cpuidle/driver.c b/drivers/cpuidle/driver.c > index 3af841f..bb045f1 100644 > --- a/drivers/cpuidle/driver.c > +++ b/drivers/cpuidle/driver.c > @@ -19,34 +19,9 @@ DEFINE_SPINLOCK(cpuidle_driver_lock); > static void __cpuidle_set_cpu_driver(struct cpuidle_driver *drv, int cpu); > static struct cpuidle_driver * __cpuidle_get_cpu_driver(int cpu); > > -static void set_power_states(struct cpuidle_driver *drv) > -{ > - int i; > - > - /* > -* cpuidle driver should set the drv->power_specified bit > -* before registering if the driver provides > -* power_usage numbers. > -* > -* If power_specified is not set, > -* we fill in power_usage with decreasing values as the > -* cpuidle code has an implicit assumption that state Cn > -* uses less power than C(n-1). > -* > -* With CONFIG_ARCH_HAS_CPU_RELAX, C0 is already assigned > -* an power value of -1. So we use -2, -3, etc, for other > -* c-states. > -*/ > - for (i = CPUIDLE_DRIVER_STATE_START; i < drv->state_count; i++) > - drv->states[i].power_usage = -1 - i; > -} > - > static void __cpuidle_driver_init(struct cpuidle_driver *drv) > { > drv->refcnt = 0; > - > - if (!drv->power_specified) > - set_power_states(drv); > } > > static int __cpuidle_register_driver(struct cpuidle_driver *drv, int cpu) > diff --git a/drivers/cpuidle/governors/menu.c > b/drivers/cpuidle/governors/menu.c > index 2efee27..14eb11f 100644 > --- a/drivers/cpuidle/governors/menu.c > +++ b/drivers/cpuidle/governors/menu.c > @@ -312,7 +312,6 @@ static int menu_select(struct cpuidle_driver *drv, struct > cpuidle_device *dev) > { > struct menu_device *data = &__get_cpu_var(menu_devices); > int latency_req = pm_qos_request(PM_QOS_CPU_DMA_LATENCY); > - int power_usage = -1; > int i; > int multiplier; > struct timespec t; > @@ -383,11 +382,8 @@ static int menu_select(stru
Re: [PATCH] cpuidle: Measure idle state durations with monotonic clock
> Maybe you can remove all these computations and set the flag > en_core_tk_irqen for the driver ? That will be handled by the cpuidle > framework, no ? > > Same comment for the intel_idle driver. Yeah, I thought about that, too. I was a little too afraid of touching the sched_clock_idle_wakeup_event() parameter that is tied to the measurement, but it seems to have been vestigial for some time now and other drivers also just set it 0. I will whip up another version of the patch (won't change the PPC further though, if this version works I would just leave it at that... thanks for testing, Deepthi). ___ linaro-dev mailing list linaro-dev@lists.linaro.org http://lists.linaro.org/mailman/listinfo/linaro-dev
[PATCH] cpuidle: Measure idle state durations with monotonic clock
Many cpuidle drivers measure their time spent in an idle state by reading the wallclock time before and after idling and calculating the difference. This leads to erroneous results when the wallclock time gets updated by another processor in the meantime, adding that clock adjustment to the idle state's time counter. If the clock adjustment was negative, the result is even worse due to an erroneous cast from int to unsigned long long of the last_residency variable. The negative 32 bit integer will zero-extend and result in a forward time jump of roughly four billion milliseconds or 1.3 hours on the idle state residency counter. This patch changes all affected cpuidle drivers to either use the monotonic clock for their measurements or make use of the generic time measurement wrapper in cpuidle.c, which was already working correctly. Some superfluous CLIs/STIs in the ACPI code are removed (interrupts should always already be disabled before entering the idle function, and not get reenabled until the generic wrapper has performed its second measurement). It also removes the erroneous cast, making sure that negative residency values are applied correctly even though they should not appear anymore. Signed-off-by: Julius Werner --- arch/powerpc/platforms/pseries/processor_idle.c |4 +- drivers/acpi/processor_idle.c | 57 +- drivers/cpuidle/cpuidle.c |3 +- drivers/idle/intel_idle.c | 14 +- 4 files changed, 7 insertions(+), 71 deletions(-) diff --git a/arch/powerpc/platforms/pseries/processor_idle.c b/arch/powerpc/platforms/pseries/processor_idle.c index 45d00e5..4d806b4 100644 --- a/arch/powerpc/platforms/pseries/processor_idle.c +++ b/arch/powerpc/platforms/pseries/processor_idle.c @@ -36,7 +36,7 @@ static struct cpuidle_state *cpuidle_state_table; static inline void idle_loop_prolog(unsigned long *in_purr, ktime_t *kt_before) { - *kt_before = ktime_get_real(); + *kt_before = ktime_get(); *in_purr = mfspr(SPRN_PURR); /* * Indicate to the HV that we are idle. Now would be @@ -50,7 +50,7 @@ static inline s64 idle_loop_epilog(unsigned long in_purr, ktime_t kt_before) get_lppaca()->wait_state_cycles += mfspr(SPRN_PURR) - in_purr; get_lppaca()->idle = 0; - return ktime_to_us(ktime_sub(ktime_get_real(), kt_before)); + return ktime_to_us(ktime_sub(ktime_get(), kt_before)); } static int snooze_loop(struct cpuidle_device *dev, diff --git a/drivers/acpi/processor_idle.c b/drivers/acpi/processor_idle.c index e8086c7..f1a5da4 100644 --- a/drivers/acpi/processor_idle.c +++ b/drivers/acpi/processor_idle.c @@ -735,31 +735,18 @@ static inline void acpi_idle_do_entry(struct acpi_processor_cx *cx) static int acpi_idle_enter_c1(struct cpuidle_device *dev, struct cpuidle_driver *drv, int index) { - ktime_t kt1, kt2; - s64 idle_time; struct acpi_processor *pr; struct cpuidle_state_usage *state_usage = &dev->states_usage[index]; struct acpi_processor_cx *cx = cpuidle_get_statedata(state_usage); pr = __this_cpu_read(processors); - dev->last_residency = 0; if (unlikely(!pr)) return -EINVAL; - local_irq_disable(); - - lapic_timer_state_broadcast(pr, cx, 1); - kt1 = ktime_get_real(); acpi_idle_do_entry(cx); - kt2 = ktime_get_real(); - idle_time = ktime_to_us(ktime_sub(kt2, kt1)); - - /* Update device last_residency*/ - dev->last_residency = (int)idle_time; - local_irq_enable(); lapic_timer_state_broadcast(pr, cx, 0); return index; @@ -806,19 +793,12 @@ static int acpi_idle_enter_simple(struct cpuidle_device *dev, struct acpi_processor *pr; struct cpuidle_state_usage *state_usage = &dev->states_usage[index]; struct acpi_processor_cx *cx = cpuidle_get_statedata(state_usage); - ktime_t kt1, kt2; - s64 idle_time_ns; - s64 idle_time; pr = __this_cpu_read(processors); - dev->last_residency = 0; if (unlikely(!pr)) return -EINVAL; - local_irq_disable(); - - if (cx->entry_method != ACPI_CSTATE_FFH) { current_thread_info()->status &= ~TS_POLLING; /* @@ -829,7 +809,6 @@ static int acpi_idle_enter_simple(struct cpuidle_device *dev, if (unlikely(need_resched())) { current_thread_info()->status |= TS_POLLING; - local_irq_enable(); return -EINVAL; } } @@ -843,22 +822,12 @@ static int acpi_idle_enter_simple(struct cpuidle_device *dev, if (cx->type == ACPI_STATE_C3) ACPI_FLUSH_CPU_CACHE(); - kt1 = ktime_get_real(); /* Tell the scheduler that we are going
Re: [RFC] cpuidle - remove the power_specified field in the driver
Hi, What is the current status of this? Daniel, do you think you have got enough feedback to submit a definitive patch for this? Rafael, would you approve of such a change? The bug with dynamically added C-states that is tied to this still hurts the battery life for some users across all distros every day, so I think it would be valuable to get a consistent solution into the mainline soon before everyone has to roll their own. On 11/12/2012 09:26 PM, Daniel Lezcano wrote: > This patch follows the discussion about reinitializing the power usage > when a C-state is added/removed. > > https://lkml.org/lkml/2012/10/16/518 > > We realized the power usage field is never filled and when it is > filled for tegra, the power_specified flag is not set making all these > values to be resetted when the driver is initialized with the set_power_state > function. > > Julius and I feel this is over-engineered and the power_specified > flag could be simply removed and continue assuming the states are > backward sorted. > > The menu governor select function is simplified as the power is ordered. > Actually the condition is always true with the current code. > > The cpuidle_play_dead function is also simplified by doing a reverse lookup > on the array. > > The set_power_states function is removed as it does no make sense anymore. > > Signed-off-by: Daniel Lezcano > --- > drivers/cpuidle/cpuidle.c| 17 - > drivers/cpuidle/driver.c | 25 - > drivers/cpuidle/governors/menu.c |8 ++-- > include/linux/cpuidle.h |2 +- > 4 files changed, 7 insertions(+), 45 deletions(-) > ___ linaro-dev mailing list linaro-dev@lists.linaro.org http://lists.linaro.org/mailman/listinfo/linaro-dev
Re: [PATCH] cpuidle - remove the power_specified field in the driver
Thanks again for making this happen, Daniel. I like this version, except for the small nitpick that I still think it would make sense to also turn the loop in menu.c around. How about something like this: for (i = drv->state_count - 1; i >= CPUIDLE_DRIVER_STATE_START; i++) { struct cpuidle_state *s = &drv->states[i]; if (!s->disable && s->exit_latency <= latency_req && s->target_residency <= data->predicted_us && s->exit_latency * multiplier <= data->predicted_us) { data->last_state_idx = i; data->exit_us = s->exit_latency; break; } } ___ linaro-dev mailing list linaro-dev@lists.linaro.org http://lists.linaro.org/mailman/listinfo/linaro-dev
Re: [PATCH 2/2][V2] cpuidle - optimize the select function for the 'menu' governor
*Ping!* Happy New Year everyone. Is there any update on this? I think Francesco already pointed out how to solve the last remaining issue with this, so I hope we can now resubmit these patches and finally get them merged. Daniel? ___ linaro-dev mailing list linaro-dev@lists.linaro.org http://lists.linaro.org/mailman/listinfo/linaro-dev