from:"Javi Merino"

Re: [PATCH] irqchip: gic: Don't complain in gic_get_cpumask() if UP system

2013-07-12 Thread Javi Merino

On Sat, Jul 06, 2013 at 12:39:33AM +0100, Stephen Boyd wrote:
> In a uniprocessor implementation the interrupt processor targets
> registers are read-as-zero/write-ignored (RAZ/WI). Unfortunately
> gic_get_cpumask() will print a critical message saying
> 
>  GIC CPU mask not found - kernel will fail to boot.
> 
> if these registers all read as zero, but there won't actually be
> a problem on uniprocessor systems and the kernel will boot just
> fine. Skip this check if we're running a UP kernel or if we
> detect that the hardware only supports a single processor.
> 
> Cc: Nicolas Pitre 
> Cc: Russell King 
> Signed-off-by: Stephen Boyd 
> ---
> 
> Maybe we should just drop the check entirely? It looks like it may
> just be debug code that won't ever trigger in practice, even on the
> 11MPCore that caused this code to be introduced.

I agree, we should drop the check.  It's annoying on uniprocessor
systems and unlikely to trigger in the real world unless the GIC entry
in your device tree is wrong.
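
To make the quoted description concrete, here is a rough userspace sketch of
the behaviour the patch describes.  It is not the irqchip driver code:
find_cpu_mask(), mask_missing_is_fatal() and the plain array standing in for
the interrupt processor targets registers are all made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for scanning the target registers: return the first
 * non-zero CPU mask byte, or 0 if every register reads as zero (RAZ/WI). */
static uint8_t find_cpu_mask(const uint32_t *target_regs, size_t nregs)
{
	for (size_t i = 0; i < nregs; i++) {
		uint32_t v = target_regs[i];

		/* Fold the four per-interrupt bytes into the low byte. */
		v |= v >> 16;
		v |= v >> 8;
		if (v & 0xff)
			return (uint8_t)(v & 0xff);
	}
	return 0;
}

/* An empty mask is only a problem when more than one CPU is possible. */
static bool mask_missing_is_fatal(uint8_t mask, unsigned int possible_cpus)
{
	return mask == 0 && possible_cpus > 1;
}

static void demo(void)
{
	const uint32_t up_regs[2] = { 0, 0 };		/* uniprocessor: RAZ/WI */
	const uint32_t smp_regs[2] = { 0, 0x01010101 };	/* CPU0 owns the IRQs */

	assert(find_cpu_mask(up_regs, 2) == 0);
	assert(find_cpu_mask(smp_regs, 2) == 0x01);
	assert(!mask_missing_is_fatal(0, 1));	/* UP: nothing to complain about */
	assert(mask_missing_is_fatal(0, 4));	/* SMP: mask really is missing */
}
```

With the check dropped entirely, as suggested above, only the scan would
remain.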

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH] smp: harmonize prototypes of smp functions

2013-09-18 Thread Javi Merino

On Tue, Sep 17, 2013 at 10:22:28PM +0100, Andrew Morton wrote:
> On Mon,  2 Sep 2013 15:33:13 +0100 Javi Merino  wrote:
> 
> > Avoid unnecessary casts from int to bool in smp functions.  Some
> > functions in kernel/smp.c have a wait parameter that can be set to one
> > if you want to wait for the command to complete.  It's defined as bool
> > in a few of them and int in the rest.  If a function with wait
> > declared as int calls a function whose prototype has wait defined as
> > bool, the compiler needs to test if the int is != 0 and change it to 1
> > if so.  This useless check can be avoided if we are consistent and
> > make all the functions use the same type for this parameter.
> 
> Yes, that's a problem with bool.
> 
> But the `wait' argument *is* a boolean and switching everything over to
> use "bool" (instead of "int") should provide similar code-size savings.
> Did you evaluate that approach?

I did; you get exactly the same code-size savings.  But then I read
this[0] and thought that "int" was preferred.

[0] https://lkml.org/lkml/2013/8/31/138

I can submit the "bool" patch instead if you prefer it.  Cheers,
Javi
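
For reference, a small standalone sketch of the conversion being discussed.
The function names are invented for the example and have nothing to do with
kernel/smp.c; the point is only that an int passed to a bool parameter has to
be normalised to 0 or 1, while an int passed to an int parameter is forwarded
untouched.

```c
#include <assert.h>
#include <stdbool.h>

/* Callee declared with "bool wait": returns the value it actually received. */
static int callee_bool(bool wait)
{
	return wait;
}

/* Callee declared with "int wait": the argument is passed through as is. */
static int callee_int(int wait)
{
	return wait;
}

/* Caller declared with int.  Handing the int to a bool parameter makes the
 * compiler test it against zero and materialise 0 or 1. */
static int caller_via_bool(int wait)
{
	return callee_bool(wait);
}

/* Same caller with consistent types: a plain register move is enough. */
static int caller_via_int(int wait)
{
	return callee_int(wait);
}

static void demo(void)
{
	assert(caller_via_bool(42) == 1);	/* normalised */
	assert(caller_via_int(42) == 42);	/* forwarded unchanged */
	assert(caller_via_bool(0) == 0);
}
```

Either direction of harmonisation (all int or all bool) removes the extra
test, which is why both variants give the same code-size saving.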

[PATCH RESEND] smp: harmonize prototypes of smp functions

2013-09-10 Thread Javi Merino

Avoid unnecessary casts from int to bool in smp functions.  Some
functions in kernel/smp.c have a wait parameter that can be set to one
if you want to wait for the command to complete.  It's defined as bool
in a few of them and int in the rest.  If a function with wait
declared as int calls a function whose prototype has wait defined as
bool, the compiler needs to test if the int is != 0 and change it to 1
if so.  This useless check can be avoided if we are consistent and
make all the functions use the same type for this parameter.

For example in arm, before this patch:

800464e4 :
800464e4:   b538        push    {r3, r4, r5, lr}
800464e6:   460d        mov     r5, r1
800464e8:   4613        mov     r3, r2          ; move wait to r3
800464ea:   f64f 448c   movw    r4, #64652
800464ee:   3300        adds    r3, #0          ; test if wait is 0
800464f0:   f2c8 0425   movt    r4, #32805
800464f4:   4601        mov     r1, r0
800464f6:   bf18        it      ne
800464f8:   2301        movne   r3, #1          ; if it is not, wait = 1
800464fa:   462a        mov     r2, r5
800464fc:   6820        ldr     r0, [r4, #0]
800464fe:   f7ff fea9   bl      80046254
80046502:   2000        movs    r0, #0
80046504:   bd38        pop     {r3, r4, r5, pc}
80046506:   bf00        nop

After the patch:

800464e4 :
800464e4:   b538        push    {r3, r4, r5, lr}
800464e6:   460d        mov     r5, r1
800464e8:   4613        mov     r3, r2          ; just move it to r3
800464ea:   f64f 448c   movw    r4, #64652
800464ee:   4601        mov     r1, r0
800464f0:   f2c8 0425   movt    r4, #32805
800464f4:   462a        mov     r2, r5
800464f6:   6820        ldr     r0, [r4, #0]
800464f8:   f7ff feac   bl      80046254
800464fc:   2000        movs    r0, #0
800464fe:   bd38        pop     {r3, r4, r5, pc}

Same for x86.  Before:

8109bf10 :
8109bf10:   55                      push   %rbp
8109bf11:   48 89 e5                mov    %rsp,%rbp
8109bf14:   31 c9                   xor    %ecx,%ecx        ; ecx = 0
8109bf16:   85 d2                   test   %edx,%edx        ; test if wait is 0
8109bf18:   48 89 f2                mov    %rsi,%rdx
8109bf1b:   48 89 fe                mov    %rdi,%rsi
8109bf1e:   48 8b 3d 4b d3 76 00    mov    0x76d34b(%rip),%rdi        # 81809270
8109bf25:   0f 95 c1                setne  %cl              ; if it is not, ecx = 1
8109bf28:   e8 43 fc ff ff          callq  8109bb70
8109bf2d:   31 c0                   xor    %eax,%eax
8109bf2f:   5d                      pop    %rbp
8109bf30:   c3                      retq

After:

8109bf20 :
8109bf20:   55                      push   %rbp
8109bf21:   48 89 e5                mov    %rsp,%rbp
8109bf24:   89 d1                   mov    %edx,%ecx        ; just move wait to ecx
8109bf26:   48 89 f2                mov    %rsi,%rdx
8109bf29:   48 89 fe                mov    %rdi,%rsi
8109bf2c:   48 8b 3d 3d d3 76 00    mov    0x76d33d(%rip),%rdi        # 81809270
8109bf33:   e8 48 fc ff ff          callq  8109bb80
8109bf38:   31 c0                   xor    %eax,%eax
8109bf3a:   5d                      pop    %rbp
8109bf3b:   c3                      retq
8109bf3c:   0f 1f 40 00             nopl   0x0(%rax)

Cc: Andrew Morton 
Signed-off-by: Javi Merino 
---
 include/linux/smp.h |    6 +++---
 kernel/smp.c        |    6 +++---
 2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/include/linux/smp.h b/include/linux/smp.h
index c181399..a894405 100644
--- a/include/linux/smp.h
+++ b/include/linux/smp.h
@@ -72,7 +72,7 @@ extern void smp_cpus_done(unsigned int max_cpus);
  */
 int smp_call_function(smp_call_func_t func, void *info, int wait);
 void smp_call_function_many(const struct cpumask *mask,
-   smp_call_func_t func, void *info, bool wait);
+   smp_call_func_t func, void *info, int wait);
 
 void __smp_call_function_single(int cpuid, struct call_single_data *data,
int wait);
@@ -104,7 +104,7 @@ int on_each_cpu(smp_call_func_t func, void *info, int wait);
  * the local one.
  */
 void on_each_cpu_mask(const struct cpumask *mask, smp_call_func_t func,
-   void *info, bool wait);
+   void *info, int wait);
 
 /*
  * Call a function on each processor for which the supplied function
@@ -112,7 +112,7 @@ void on_each_cpu_mask(const struct cpumask *mask, smp_call_func_t func,
  * processor.
  */
 void on_each_cpu_cond(bool (*cond_func)(int cpu, void

[PATCH] of: add missing documentation for of_platform_populate()

2012-11-23 Thread Javi Merino

15c3597d (dt/platform: allow device name to be overridden) added a
lookup parameter to of_platform_populate() but did not update the
documentation.  This patch adds the missing documentation entry.

Cc: Grant Likely 
Cc: Jiri Kosina 
Signed-off-by: Javi Merino 
---
 drivers/of/platform.c |    1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/of/platform.c b/drivers/of/platform.c
index b80891b..e0a6514 100644
--- a/drivers/of/platform.c
+++ b/drivers/of/platform.c
@@ -436,6 +436,7 @@ EXPORT_SYMBOL(of_platform_bus_probe);
  * of_platform_populate() - Populate platform_devices from device tree data
  * @root: parent of the first level to probe or NULL for the root of the tree
  * @matches: match table, NULL to use the default
+ * @lookup: auxdata table for matching id and platform_data with device nodes
  * @parent: parent to hook devices from, NULL for toplevel
  *
  * Similar to of_platform_bus_probe(), this function walks the device tree
-- 
1.7.9.5


Re: [PATCH] devfreq_cooling: pass a pointer to devfreq in the power model callbacks

2016-06-20 Thread Javi Merino

Eduardo, Rui,

On Fri, Jun 03, 2016 at 10:25:31AM +0100, Javi Merino wrote:
> When the devfreq cooling device was designed, it was an oversight not to
> pass a pointer to the struct devfreq as the first parameter of the
> callbacks.  The design patterns of the kernel suggest it for a good
> reason.
> 
> By passing a pointer to struct devfreq, the driver can register one
> function that works with multiple devices.  With the current
> implementation, a driver that can work with multiple devices has to
> create multiple copies of the same function with different parameters so
> that each devfreq_cooling_device can use the appropriate one.  By
> passing a pointer to struct devfreq, the driver can identify which
> device it's referring to.
> 
> Cc: Zhang Rui 
> Cc: Eduardo Valentin 
> Reviewed-by: Punit Agrawal 
> Signed-off-by: Javi Merino 
> ---
>  drivers/thermal/devfreq_cooling.c | 5 +++--
>  include/linux/devfreq_cooling.h   | 6 ++++--
>  2 files changed, 7 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/thermal/devfreq_cooling.c b/drivers/thermal/devfreq_cooling.c
> index 01f0015f80dc..c549d83a0c7d 100644
> --- a/drivers/thermal/devfreq_cooling.c
> +++ b/drivers/thermal/devfreq_cooling.c
> @@ -238,7 +238,7 @@ get_static_power(struct devfreq_cooling_device *dfc, unsigned long freq)
>   return 0;
>   }
>  
> - return dfc->power_ops->get_static_power(voltage);
> + return dfc->power_ops->get_static_power(df, voltage);
>  }
>  
>  /**
> @@ -262,7 +262,8 @@ get_dynamic_power(struct devfreq_cooling_device *dfc, unsigned long freq,
>   struct devfreq_cooling_power *dfc_power = dfc->power_ops;
>  
>   if (dfc_power->get_dynamic_power)
> - return dfc_power->get_dynamic_power(freq, voltage);
> + return dfc_power->get_dynamic_power(dfc->devfreq, freq,
> + voltage);
>  
>   freq_mhz = freq / 1000000;
>   power = (u64)dfc_power->dyn_power_coeff * freq_mhz * voltage * voltage;
> diff --git a/include/linux/devfreq_cooling.h b/include/linux/devfreq_cooling.h
> index 7adf6cc4b305..959714e93e5b 100644
> --- a/include/linux/devfreq_cooling.h
> +++ b/include/linux/devfreq_cooling.h
> @@ -37,8 +37,10 @@
>   *   @dyn_power_coeff * frequency * voltage^2
>   */
>  struct devfreq_cooling_power {
> - unsigned long (*get_static_power)(unsigned long voltage);
> - unsigned long (*get_dynamic_power)(unsigned long freq,
> + unsigned long (*get_static_power)(struct devfreq *devfreq,
> +   unsigned long voltage);
> + unsigned long (*get_dynamic_power)(struct devfreq *devfreq,
> +unsigned long freq,
>  unsigned long voltage);
>   unsigned long dyn_power_coeff;
>  };

If there are no objections, can you pick this for the next merge
window?

Thanks,
Javi
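
To illustrate the argument in the commit message, here is a standalone sketch
of one callback serving several devices once the device pointer is passed in.
struct fake_devfreq, struct fake_power_ops and the coefficient field are
invented for the example; only the shape of the callback follows the patch.

```c
#include <assert.h>

/* Hypothetical stand-in for struct devfreq; only what the sketch needs. */
struct fake_devfreq {
	unsigned long static_coeff;	/* made-up per-device tuning value */
};

/* Callback table shaped like the patched devfreq_cooling_power: the device
 * pointer comes first, so the callee knows which device is being asked about. */
struct fake_power_ops {
	unsigned long (*get_static_power)(struct fake_devfreq *df,
					  unsigned long voltage);
};

/* One implementation shared by every device the driver handles. */
static unsigned long shared_static_power(struct fake_devfreq *df,
					 unsigned long voltage)
{
	return df->static_coeff * voltage;
}

static const struct fake_power_ops shared_ops = {
	.get_static_power = shared_static_power,
};

static void demo(void)
{
	struct fake_devfreq gpu0 = { .static_coeff = 2 };
	struct fake_devfreq gpu1 = { .static_coeff = 5 };

	/* Same function pointer, different answers, because the device is known. */
	assert(shared_ops.get_static_power(&gpu0, 900) == 1800);
	assert(shared_ops.get_static_power(&gpu1, 900) == 4500);
}
```

Without the first parameter, the driver would need one copy of
shared_static_power() per device.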

Re: [PATCH] of: thermal: Fixed governor at each thermal zone

2016-09-27 Thread Javi Merino

On Tue, Sep 27, 2016 at 09:46:57AM +0800, Zhang Rui wrote:
> On Mon, 2016-09-19 at 10:18 +0900, Inhyuk Kang wrote:
> > A governor needs to be selectable for each thermal zone, because
> > some governors must already be running during kernel boot in order
> > to avoid overheating.
> > 
> > The default governor cannot cover the policy of every thermal zone,
> > because some thermal zones need a different one.
> > For example, the power allocator governor operates differently from
> > the step wise governor.
> > Hence, it is better to parse the governor parameter from the device tree.
> > 
> > Signed-off-by: Inhyuk Kang 
> > 
> The patch looks okay to me.
> Eduardo, what do you think of this patch?

This has been proposed in the past[0] and Eduardo said no[1] (as did
Krzysztof Kozlowski and Mark Rutland).

[0] https://marc.info/?l=linux-kernel&m=143893141227189&w=4
[1] https://marc.info/?l=linux-pm&m=144649947022547&w=4

Cheers,
Javi

> > diff --git a/drivers/thermal/of-thermal.c b/drivers/thermal/of-thermal.c
> > index b8e509c..382c440 100644
> > --- a/drivers/thermal/of-thermal.c
> > +++ b/drivers/thermal/of-thermal.c
> > @@ -970,6 +970,7 @@ int __init of_parse_thermal_zones(void)
> >     struct thermal_zone_device *zone;
> >     struct thermal_zone_params *tzp;
> >     int i, mask = 0;
> > +   const char *governor;
> >     u32 prop;
> >  
> >     tz = thermal_of_build_thermal_zone(child);
> > @@ -996,6 +997,9 @@ int __init of_parse_thermal_zones(void)
> >     if (!of_property_read_u32(child, "sustainable-power", &prop))
> >     tzp->sustainable_power = prop;
> >  
> > +   if (!of_property_read_string(child, "governor-name", &governor))
> > +   strcpy(tzp->governor_name, governor);
> > +
> >     for (i = 0; i < tz->ntrips; i++)
> >     mask |= 1 << i;
> >  

Re: [PATCH] thermal: cpu_cooling: Fix wrong comment call function name

2016-09-08 Thread Javi Merino

On Wed, Sep 07, 2016 at 09:35:39AM +0900, Inhyuk Kang wrote:
> last_load is updated not by the call to cpufreq_get_actual_power()
> but by the call to cpufreq_get_requested_power().

Yep, my bad.  Thanks for fixing it!

> Signed-off-by: Inhyuk Kang 

Acked-by: Javi Merino 

> diff --git a/drivers/thermal/cpu_cooling.c b/drivers/thermal/cpu_cooling.c
> index a32b417..9ce0e9e 100644
> --- a/drivers/thermal/cpu_cooling.c
> +++ b/drivers/thermal/cpu_cooling.c
> @@ -74,7 +74,7 @@ struct power_table {
>   *   cpufreq frequencies.
>   * @allowed_cpus: all the cpus involved for this cpufreq_cooling_device.
>   * @node: list_head to link all cpufreq_cooling_device together.
> - * @last_load: load measured by the latest call to cpufreq_get_actual_power()
> + * @last_load: load measured by the latest call to cpufreq_get_requested_power()
>   * @time_in_idle: previous reading of the absolute time that this cpu was idle
>   * @time_in_idle_timestamp: wall time of the last invocation of
>   *   get_cpu_idle_time_us()
> -- 
> 1.9.1
>

Re: [PATCH 1/2] tracing, thermal: Hide devfreq trace events when not in use

2017-10-17 Thread Javi Merino

On Fri, Oct 13, 2017 at 10:21:50AM -0400, Steven Rostedt wrote:
> From: Steven Rostedt (VMware) 
> 
> As trace events when defined create data structures and functions to
> process them, defining trace events when not using them is a waste of
> memory.
> 
> The trace events thermal_power_devfreq_get_power and
> thermal_power_devfreq_limit are only used when CONFIG_DEVFREQ_THERMAL
> is set. Make those events only defined when that is set as well.
> 
> Signed-off-by: Steven Rostedt (VMware) 

Acked-by: Javi Merino 

> ---
> Index: linux-trace.git/include/trace/events/thermal.h
> ===
> --- linux-trace.git.orig/include/trace/events/thermal.h
> +++ linux-trace.git/include/trace/events/thermal.h
> @@ -148,6 +148,7 @@ TRACE_EVENT(thermal_power_cpu_limit,
>   __entry->power)
>  );
>  
> +#ifdef CONFIG_DEVFREQ_THERMAL
>  TRACE_EVENT(thermal_power_devfreq_get_power,
>   TP_PROTO(struct thermal_cooling_device *cdev,
>struct devfreq_dev_status *status, unsigned long freq,
> @@ -203,6 +204,7 @@ TRACE_EVENT(thermal_power_devfreq_limit,
>   __get_str(type), __entry->freq, __entry->cdev_state,
>   __entry->power)
>  );
> +#endif /* CONFIG_DEVFREQ_THERMAL */
>  #endif /* _TRACE_THERMAL_H */
>  
>  /* This part must be outside protection */

Re: [PATCH 2/2] tracing, thermal: Hide cpu cooling trace events when not in use

2017-10-17 Thread Javi Merino

On Fri, Oct 13, 2017 at 10:23:09AM -0400, Steven Rostedt wrote:
> From: Steven Rostedt (VMware) 
> 
> As trace events when defined create data structures and functions to
> process them, defining trace events when not using them is a waste of
> memory.
> 
> The trace events thermal_power_cpu_get_power and
> thermal_power_cpu_limit are only used when CONFIG_CPU_THERMAL is set.
> Make those events only defined when that is set as well.
> 
> Signed-off-by: Steven Rostedt (VMware) 

Acked-by: Javi Merino 

> ---
> Index: linux-trace.git/include/trace/events/thermal.h
> ===
> --- linux-trace.git.orig/include/trace/events/thermal.h
> +++ linux-trace.git/include/trace/events/thermal.h
> @@ -90,6 +90,7 @@ TRACE_EVENT(thermal_zone_trip,
>   show_tzt_type(__entry->trip_type))
>  );
>  
> +#ifdef CONFIG_CPU_THERMAL
>  TRACE_EVENT(thermal_power_cpu_get_power,
>   TP_PROTO(const struct cpumask *cpus, unsigned long freq, u32 *load,
>   size_t load_len, u32 dynamic_power, u32 static_power),
> @@ -147,6 +148,7 @@ TRACE_EVENT(thermal_power_cpu_limit,
>   __get_bitmask(cpumask), __entry->freq, __entry->cdev_state,
>   __entry->power)
>  );
> +#endif /* CONFIG_CPU_THERMAL */
>  
>  #ifdef CONFIG_DEVFREQ_THERMAL
>  TRACE_EVENT(thermal_power_devfreq_get_power,

Re: [PATCH] thermal: cpu_cooling: pr_err() strings should end with newlines

2017-10-25 Thread Javi Merino

On Tue, Oct 24, 2017 at 01:20:39PM +0530, Arvind Yadav wrote:
> pr_err() messages should end with a new-line to avoid other messages
> being concatenated.
> 
> Signed-off-by: Arvind Yadav 

FWIW,

Acked-by: Javi Merino 

> ---
>  drivers/thermal/cpu_cooling.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/thermal/cpu_cooling.c b/drivers/thermal/cpu_cooling.c
> index 908a801..dc63aba 100644
> --- a/drivers/thermal/cpu_cooling.c
> +++ b/drivers/thermal/cpu_cooling.c
> @@ -696,7 +696,7 @@ static unsigned int find_next_max(struct cpufreq_frequency_table *table,
>   bool first;
>  
>   if (IS_ERR_OR_NULL(policy)) {
> - pr_err("%s: cpufreq policy isn't valid: %p", __func__, policy);
> + pr_err("%s: cpufreq policy isn't valid: %p\n", __func__, policy);
>   return ERR_PTR(-EINVAL);
>   }
>  
> -- 
> 1.9.1
>
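
A toy illustration of why the trailing newline matters.  This is a made-up
in-memory log buffer, not how printk works internally; log_msg() and log_buf
are invented names.  It only shows that a message without "\n" runs straight
into whatever is logged next.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

static char log_buf[128];	/* hypothetical log, far simpler than printk's */

/* Append a message to the log exactly as given. */
static void log_msg(const char *msg)
{
	size_t used = strlen(log_buf);

	snprintf(log_buf + used, sizeof(log_buf) - used, "%s", msg);
}

static void demo(void)
{
	/* Without the newline the two messages end up on one line. */
	log_buf[0] = '\0';
	log_msg("cpufreq policy isn't valid");
	log_msg("next message\n");
	assert(strcmp(log_buf, "cpufreq policy isn't validnext message\n") == 0);

	/* With the newline each message keeps its own line. */
	log_buf[0] = '\0';
	log_msg("cpufreq policy isn't valid\n");
	log_msg("next message\n");
	assert(strcmp(log_buf, "cpufreq policy isn't valid\nnext message\n") == 0);
}
```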

Re: [PATCH v4 4/5] thermal: power_allocator: don't require tzp to be present for the thermal zone

2015-08-28 Thread Javi Merino

On Fri, Aug 28, 2015 at 03:18:20AM +0100, Daniel Kurtz wrote:
> On Wed, Aug 26, 2015 at 9:26 PM, Javi Merino  wrote:
> > Thermal zones created using thermal_zone_device_create() may not have
> > tzp.  As the governor gets its parameters from there, allocate it while
> > the governor is bound to the thermal zone so that it can operate in it.
> > In this case, tzp is freed when the thermal zone switches to another
> > governor.
> >
> > Cc: Zhang Rui 
> > Cc: Eduardo Valentin 
> > Signed-off-by: Javi Merino 
> > ---
> >
> > While this would be easier to do by just ignoring the thermal zone if
> > there was no tzp, I think the approach in this patch provides a better
> > behavior.
> 
> Why?
> Just ignoring the thermal zone seems reasonable and simpler.

From the developer's point of view, I agree that it's simpler.  What I
want to avoid is the system integrator getting different behaviors
based on the presence of tzp when the thermal zone was created.  If
the integrator were to configure this from userspace, they would only
be able to do so if the thermal zone was created with tzp.  I don't
like this distinction; I prefer the consistency, from the user's point
of view, that this patch gives.

Cheers,
Javi
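
As a compact summary of the approach, here is a userspace sketch of
"allocate tzp on bind if it is missing, remember that we own it, free it on
unbind".  The struct and function names are simplified stand-ins for the
example, and calloc()/free() replace kzalloc()/kfree().

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Simplified stand-ins for the thermal structures. */
struct zone_params { unsigned int sustainable_power; };
struct governor_data { bool allocated_tzp; };
struct zone {
	struct zone_params *tzp;
	struct governor_data *governor_data;
};

static int governor_bind(struct zone *tz)
{
	struct governor_data *params = calloc(1, sizeof(*params));

	if (!params)
		return -ENOMEM;

	if (!tz->tzp) {
		/* The zone was created without tzp: provide one and own it. */
		tz->tzp = calloc(1, sizeof(*tz->tzp));
		if (!tz->tzp) {
			free(params);
			return -ENOMEM;
		}
		params->allocated_tzp = true;
	}

	tz->governor_data = params;
	return 0;
}

static void governor_unbind(struct zone *tz)
{
	struct governor_data *params = tz->governor_data;

	/* Only free tzp if bind allocated it; otherwise it belongs to the zone. */
	if (params->allocated_tzp) {
		free(tz->tzp);
		tz->tzp = NULL;
	}
	free(params);
	tz->governor_data = NULL;
}

static void demo(void)
{
	struct zone_params preset = { .sustainable_power = 2500 };
	struct zone with_tzp = { .tzp = &preset };
	struct zone without_tzp = { 0 };

	assert(governor_bind(&with_tzp) == 0);
	assert(governor_bind(&without_tzp) == 0);
	assert(without_tzp.tzp != NULL);	/* both zones now look the same */

	governor_unbind(&with_tzp);
	governor_unbind(&without_tzp);
	assert(with_tzp.tzp == &preset);	/* caller's tzp left alone */
	assert(without_tzp.tzp == NULL);	/* ours was released */
}
```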

> >  drivers/thermal/power_allocator.c | 32 +++++++++++++++++++++++++++-----
> >  1 file changed, 27 insertions(+), 5 deletions(-)
> >
> > diff --git a/drivers/thermal/power_allocator.c b/drivers/thermal/power_allocator.c
> > index 2dfb8ade4d1b..85ce0aac9a41 100644
> > --- a/drivers/thermal/power_allocator.c
> > +++ b/drivers/thermal/power_allocator.c
> > @@ -58,6 +58,8 @@ static inline s64 div_frac(s64 x, s64 y)
> >
> >  /**
> >   * struct power_allocator_params - parameters for the power allocator governor
> > + * @allocated_tzp: whether we have allocated tzp for this thermal zone and
> > + * it needs to be freed on unbind
> >   * @err_integral:  accumulated error in the PID controller.
> >   * @prev_err:  error in the previous iteration of the PID controller.
> >   * Used to calculate the derivative term.
> > @@ -70,6 +72,7 @@ static inline s64 div_frac(s64 x, s64 y)
> >   * controlling for.
> >   */
> >  struct power_allocator_params {
> > +   bool allocated_tzp;
> > s64 err_integral;
> > s32 prev_err;
> > int trip_switch_on;
> > @@ -530,8 +533,7 @@ static void allow_maximum_power(struct thermal_zone_device *tz)
> >   * Initialize the PID controller parameters and bind it to the thermal
> >   * zone.
> >   *
> > - * Return: 0 on success, -EINVAL if the thermal zone doesn't have tzp or -ENOMEM
> > - * if we ran out of memory.
> > + * Return: 0 on success, or -ENOMEM if we ran out of memory.
> >   */
> >  static int power_allocator_bind(struct thermal_zone_device *tz)
> >  {
> > @@ -539,13 +541,20 @@ static int power_allocator_bind(struct thermal_zone_device *tz)
> > struct power_allocator_params *params;
> > unsigned long control_temp;
> >
> > -   if (!tz->tzp)
> > -   return -EINVAL;
> > -
> > params = kzalloc(sizeof(*params), GFP_KERNEL);
> > if (!params)
> > return -ENOMEM;
> >
> > +   if (!tz->tzp) {
> > +   tz->tzp = kzalloc(sizeof(*tz->tzp), GFP_KERNEL);
> 
> Why bother to allocate this dummy struct?
> Can't we just leave tz->tzp as NULL, and do a NULL check where needed?
> 
> > +   if (!tz->tzp) {
> > +   ret = -ENOMEM;
> > +   goto free_params;
> > +   }
> > +
> > +   params->allocated_tzp = true;
> > +   }
> > +
> > if (!tz->tzp->sustainable_power)
> > dev_warn(&tz->device, "power_allocator: sustainable_power will be estimated\n");
> >
> > @@ -562,11 +571,24 @@ static int power_allocator_bind(struct thermal_zone_device *tz)
> > tz->governor_data = params;
> >
> > return 0;
> > +
> > +free_params:
> > +   kfree(params);
> > +
> > +   return ret;
> >  }
> >
> >  static void power_allocator_unbind(struct thermal_zone_device *tz)
> >  {
> > +   struct power_allocator_params *params = tz->governor_data;
> > +
> > dev_dbg(&tz->device, "Unbinding from thermal zone %d\n", tz->id);
> > +
> > +   if (params->allocated_tzp) {
> > +   kfree(tz->tzp);
> > +   tz->tzp = NULL;
> > +   }
> > +
> > kfree(tz->governor_data);
> > tz->governor_data = NULL;
> >  }
> > --
> > 1.9.1
> >
> 

Re: [PATCH] thermal: power_allocator: allocate with kcalloc what you free with kfree

2015-08-29 Thread Javi Merino

Hi Linus,

On Thu, Aug 27, 2015 at 04:49:37PM +0100, Javi Merino wrote:
> On Tue, Aug 25, 2015 at 07:22:35PM +0100, Javi Merino wrote:
> > Commit cf736ea6f902 ("thermal: power_allocator: do not use devm*
> > interfaces") forgot to change a devm_kcalloc() to just kcalloc(), but
> > its corresponding devm_kfree() was changed to kfree().  Allocate with
> > kcalloc() to match the kfree().
> > 
> > Fixes: cf736ea6f902 ("thermal: power_allocator: do not use devm* interfaces")
> > Cc: Dmitry Torokhov 
> > Cc: Eduardo Valentin 
> > Cc: Zhang Rui 
> > Signed-off-by: Javi Merino 
> > ---
> > 
> > Can this be merged for 4.2, please?  I'm having memory problems with
> > 4.2-rc8 because of this.
> 
> Please merge this for 4.2 or revert cf736ea6f902 ("thermal:
> power_allocator: do not use devm* interfaces")

cf736ea6f902 ("thermal: power_allocator: do not use devm*
interfaces") was merged for 4.2-rc8.  It leaves an allocation of
memory with devm_kcalloc() that is then freed with kfree().  The
patch at the top of the thread[0][1] fixes this.  Can you either
merge this patch or revert cf736ea6f902 for 4.2?

[0] https://patchwork.kernel.org/patch/7072591/
[1] Message-ID: <1440526955-9860-1-git-send-email-javi.mer...@arm.com>

Thanks,
Javi
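
For anyone wondering why the mismatch is harmful, here is a rough userspace
model.  It assumes, as a simplification, that a managed allocation keeps its
bookkeeping in front of the memory handed to the caller; struct managed_hdr,
struct owner, managed_calloc() and managed_free() are invented for the sketch
and are not the devres implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Bookkeeping that sits in front of the caller's memory (model only). */
struct managed_hdr {
	struct managed_hdr *next;
	size_t size;
};

/* Hypothetical owner, standing in for the device that tracks allocations. */
struct owner {
	struct managed_hdr *allocs;
};

/* Managed allocation: record the block on the owner's list and return the
 * memory just past the header.  Overflow of n * size is ignored here. */
static void *managed_calloc(struct owner *o, size_t n, size_t size)
{
	struct managed_hdr *h = calloc(1, sizeof(*h) + n * size);

	if (!h)
		return NULL;
	h->size = n * size;
	h->next = o->allocs;
	o->allocs = h;
	return h + 1;
}

/* Matching release: step back to the header and unlink it before freeing. */
static void managed_free(struct owner *o, void *p)
{
	struct managed_hdr *h = (struct managed_hdr *)p - 1;
	struct managed_hdr **link = &o->allocs;

	while (*link && *link != h)
		link = &(*link)->next;
	assert(*link == h);	/* must have come from managed_calloc() */
	*link = h->next;
	free(h);
}

static void demo(void)
{
	struct owner dev = { NULL };
	unsigned int *table = managed_calloc(&dev, 4, sizeof(*table));

	assert(table != NULL);
	/* The caller's pointer is not the start of the underlying block, so a
	 * plain free(table) would pass the allocator an address it never
	 * returned and leave a stale entry on the owner's list. */
	assert((void *)table != (void *)dev.allocs);

	managed_free(&dev, table);
	assert(dev.allocs == NULL);
}
```

The fix at the top of the thread avoids the problem from the other side:
allocate with the plain allocator so that the plain free matches.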

Re: [PATCH 0/2] Fixes for cpu cooling

2015-09-01 Thread Javi Merino

On Tue, Aug 25, 2015 at 07:53:48PM +0100, Javi Merino wrote:
> On Mon, Aug 17, 2015 at 07:21:41PM +0100, Javi Merino wrote:
> > Commit c36cf0717631 ("thermal: cpu_cooling: implement the power
> > cooling device API") introduced two bugs: a call to kcalloc() (that
> > might sleep) under RCU and not freeing the allocation when it's no
> > longer needed.  This series fixes both issues.
> > 
> > Javi Merino (2):
> >   thermal: cpu_cooling: don't call kcalloc() under rcu_read_lock
> >   thermal: cpu_cooling: free power table on error or when unregistering
> 
> Gentle ping

Another ping
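
For context, a userspace sketch of one common way to avoid allocating inside
a section that must not sleep, and of freeing the table afterwards.  It is
not the code from the series: read_lock()/read_unlock() merely model
rcu_read_lock()/rcu_read_unlock() with a flag, sleeping_calloc() models an
allocation that may sleep, and the frequency data is made up.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

static bool in_read_section;	/* models being inside the RCU read section */

static void read_lock(void)   { in_read_section = true; }
static void read_unlock(void) { in_read_section = false; }

/* Models an allocation that may sleep: not allowed inside the section. */
static void *sleeping_calloc(size_t n, size_t size)
{
	assert(!in_read_section);
	return calloc(n, size);
}

static const unsigned int freqs[] = { 500, 800, 1100 };	/* made-up data */
#define NFREQS (sizeof(freqs) / sizeof(freqs[0]))

/* Count under the lock, allocate outside it, then fill under the lock.
 * A real implementation would also handle the count changing in between. */
static unsigned int *build_table(size_t *len)
{
	unsigned int *table;
	size_t n;

	read_lock();
	n = NFREQS;
	read_unlock();

	table = sleeping_calloc(n, sizeof(*table));
	if (!table)
		return NULL;

	read_lock();
	for (size_t i = 0; i < n; i++)
		table[i] = freqs[i];
	read_unlock();

	*len = n;
	return table;
}

static void demo(void)
{
	size_t len = 0;
	unsigned int *table = build_table(&len);

	assert(table != NULL && len == 3);
	assert(table[2] == 1100);
	free(table);	/* release on unregister or on a later error */
}
```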

Re: [PATCH v6 4/5] devfreq_cooling: add trace information

2015-10-14 Thread Javi Merino

Hi Steve,

On Thu, Sep 10, 2015 at 06:19:28PM +0100, Steven Rostedt wrote:
> On Thu, 10 Sep 2015 18:09:31 +0100
> Javi Merino  wrote:
> 
> > Tracing is useful for debugging and performance tuning.  Add similar
> > traces to what's present in the cpu cooling device.
> > 
> > Cc: Zhang Rui 
> > Cc: Eduardo Valentin 
> > Cc: Steven Rostedt 
> > Cc: Ingo Molnar 
> > Signed-off-by: Javi Merino 
> > ---
> >  drivers/thermal/devfreq_cooling.c |  6 +
> >  include/trace/events/thermal.h| 53 
> > +++
> >  2 files changed, 59 insertions(+)
> > 
> > diff --git a/drivers/thermal/devfreq_cooling.c b/drivers/thermal/devfreq_cooling.c
> > index a032c5d5c374..a27206815066 100644
> > --- a/drivers/thermal/devfreq_cooling.c
> > +++ b/drivers/thermal/devfreq_cooling.c
> > @@ -25,6 +25,8 @@
> >  #include 
> >  #include 
> >  
> > +#include 
> > +
> >  static DEFINE_MUTEX(devfreq_lock);
> >  static DEFINE_IDR(devfreq_idr);
> >  
> > @@ -293,6 +295,9 @@ static int devfreq_cooling_get_requested_power(struct thermal_cooling_device *cd
> > /* Get static power */
> > static_power = get_static_power(dfc, freq);
> >  
> > +   trace_thermal_power_devfreq_get_power(cdev, status, freq, dyn_power,
> > + static_power);
> > +
> > *power = dyn_power + static_power;
> >  
> > return 0;
> > @@ -348,6 +353,7 @@ static int devfreq_cooling_power2state(struct thermal_cooling_device *cdev,
> > break;
> >  
> > *state = i;
> > +   trace_thermal_power_devfreq_limit(cdev, freq, *state, power);
> 
> I'm curious, does changing the above to:
> 
>   trace_thermal_power_devfreq_limit(cdev, freq, i, power);
> 
> make the compiled code better?
> 
> A tracepoint does some whacky things, and gcc may not optimize this.
> 
> The rest looks fine to me.

Can I treat that last statement as an Acked-by?

[PATCH v5 0/5] Let the power allocator thermal governor run on any thermal zone

2015-09-07 Thread Javi Merino

Relax the power allocator governor's requirements of sustainable_power
and at least two trip points so that it can be bound to any thermal
zone.  Its behavior won't be optimal, but it will be the best possible
with the data provided.

Changes since v4:
  - Fix crash when a thermal zone with no trip points has no
    get_trip_point_temp().  Reported by Daniel Kurtz.
  - s/estimate_controller_constants()/estimate_pid_constants()/g

Changes since v3:
  - Don't hardcode a value for sustainable power and re-estimate
    the PID controllers every time if no sustainable power is given
    as suggested by Eduardo Valentin.
  - power_actor_get_min_power() moved to a patch of its own.

Changes since v2:
  - Typos suggested by Daniel Kurtz

Changes since v1:
  - Let the power allocator governor operate if the thermal zone
doesn't have tzp as suggested by Chung-yih Wang

Javi Merino (5):
  thermal: Add a function to get the minimum power
  thermal: power_allocator: relax the requirement of a sustainable_power
    in tzp
  thermal: power_allocator: relax the requirement of two passive trip
    points
  thermal: power_allocator: don't require tzp to be present for the
    thermal zone
  thermal: power_allocator: exit early if there are no cooling devices

 Documentation/thermal/power_allocator.txt |   2 +-
 drivers/thermal/power_allocator.c | 243 ++
 drivers/thermal/thermal_core.c|  28 
 include/linux/thermal.h   |   6 +
 4 files changed, 214 insertions(+), 65 deletions(-)

-- 
1.9.1


[PATCH v5 4/5] thermal: power_allocator: don't require tzp to be present for the thermal zone

2015-09-07 Thread Javi Merino

Thermal zones created using thermal_zone_device_create() may not have
tzp.  As the governor gets its parameters from there, allocate it while
the governor is bound to the thermal zone so that it can operate in it.
In this case, tzp is freed when the thermal zone switches to another
governor.

Cc: Zhang Rui 
Cc: Eduardo Valentin 
Reviewed-by: Daniel Kurtz 
Signed-off-by: Javi Merino 
---

While this would be easier to do by just ignoring the thermal zone if
there was no tzp, I think the approach in this patch provides a more
consistent behavior for the system integrator as it doesn't make a
distinction between thermal zones created with tzp and those created
without it.

 drivers/thermal/power_allocator.c | 32 +++-
 1 file changed, 27 insertions(+), 5 deletions(-)

diff --git a/drivers/thermal/power_allocator.c b/drivers/thermal/power_allocator.c
index 06e954cd81cc..78d589e7e65f 100644
--- a/drivers/thermal/power_allocator.c
+++ b/drivers/thermal/power_allocator.c
@@ -58,6 +58,8 @@ static inline s64 div_frac(s64 x, s64 y)
 
 /**
  * struct power_allocator_params - parameters for the power allocator governor
+ * @allocated_tzp: whether we have allocated tzp for this thermal zone and
+ * it needs to be freed on unbind
  * @err_integral:  accumulated error in the PID controller.
  * @prev_err:  error in the previous iteration of the PID controller.
  * Used to calculate the derivative term.
@@ -70,6 +72,7 @@ static inline s64 div_frac(s64 x, s64 y)
  * controlling for.
  */
 struct power_allocator_params {
+	bool allocated_tzp;
 	s64 err_integral;
 	s32 prev_err;
 	int trip_switch_on;
@@ -527,8 +530,7 @@ static void allow_maximum_power(struct thermal_zone_device *tz)
  * Initialize the PID controller parameters and bind it to the thermal
  * zone.
  *
- * Return: 0 on success, -EINVAL if the thermal zone doesn't have tzp or -ENOMEM
- * if we ran out of memory.
+ * Return: 0 on success, or -ENOMEM if we ran out of memory.
  */
 static int power_allocator_bind(struct thermal_zone_device *tz)
 {
@@ -536,13 +538,20 @@ static int power_allocator_bind(struct thermal_zone_device *tz)
 	struct power_allocator_params *params;
 	unsigned long control_temp;
 
-	if (!tz->tzp)
-		return -EINVAL;
-
 	params = kzalloc(sizeof(*params), GFP_KERNEL);
 	if (!params)
 		return -ENOMEM;
 
+	if (!tz->tzp) {
+		tz->tzp = kzalloc(sizeof(*tz->tzp), GFP_KERNEL);
+		if (!tz->tzp) {
+			ret = -ENOMEM;
+			goto free_params;
+		}
+
+		params->allocated_tzp = true;
+	}
+
 	if (!tz->tzp->sustainable_power)
 		dev_warn(&tz->device, "power_allocator: sustainable_power will be estimated\n");
 
@@ -563,11 +572,24 @@ static int power_allocator_bind(struct thermal_zone_device *tz)
 	tz->governor_data = params;
 
 	return 0;
+
+free_params:
+	kfree(params);
+
+	return ret;
 }
 
 static void power_allocator_unbind(struct thermal_zone_device *tz)
 {
+	struct power_allocator_params *params = tz->governor_data;
+
 	dev_dbg(&tz->device, "Unbinding from thermal zone %d\n", tz->id);
+
+	if (params->allocated_tzp) {
+		kfree(tz->tzp);
+		tz->tzp = NULL;
+	}
+
 	kfree(tz->governor_data);
 	tz->governor_data = NULL;
 }
-- 
1.9.1


[PATCH v5 3/5] thermal: power_allocator: relax the requirement of two passive trip points

2015-09-07 Thread Javi Merino

The power allocator governor currently requires that the thermal zone
has at least two passive trip points.  If there aren't, the governor
refuses to bind to the thermal zone.

This commit relaxes that requirement.  Now the governor will bind to all
thermal zones regardless of how many trip points they have.

Cc: Zhang Rui 
Cc: Eduardo Valentin 
Signed-off-by: Javi Merino 
---
 Documentation/thermal/power_allocator.txt |   2 +-
 drivers/thermal/power_allocator.c | 101 +-
 2 files changed, 58 insertions(+), 45 deletions(-)

diff --git a/Documentation/thermal/power_allocator.txt b/Documentation/thermal/power_allocator.txt
index c3797b529991..a1ce2235f121 100644
--- a/Documentation/thermal/power_allocator.txt
+++ b/Documentation/thermal/power_allocator.txt
@@ -4,7 +4,7 @@ Power allocator governor tunables
 Trip points
 ---
 
-The governor requires the following two passive trip points:
+The governor works optimally with the following two passive trip points:
 
 1.  "switch on" trip point: temperature above which the governor
 control loop starts operating.  This is the first passive trip
diff --git a/drivers/thermal/power_allocator.c b/drivers/thermal/power_allocator.c
index 7fa6685f9c5b..06e954cd81cc 100644
--- a/drivers/thermal/power_allocator.c
+++ b/drivers/thermal/power_allocator.c
@@ -24,6 +24,8 @@
 
 #include "thermal_core.h"
 
+#define INVALID_TRIP -1
+
 #define FRAC_BITS 10
 #define int_to_frac(x) ((x) << FRAC_BITS)
 #define frac_to_int(x) ((x) >> FRAC_BITS)
@@ -61,6 +63,8 @@ static inline s64 div_frac(s64 x, s64 y)
  * Used to calculate the derivative term.
  * @trip_switch_on:first passive trip point of the thermal zone.  The
  * governor switches on when this trip point is crossed.
+ * If the thermal zone only has one passive trip point,
+ * @trip_switch_on should be INVALID_TRIP.
  * @trip_max_desired_temperature:  last passive trip point of the thermal
  * zone.  The temperature we are
  * controlling for.
@@ -432,43 +436,66 @@ unlock:
return ret;
 }
 
-static int get_governor_trips(struct thermal_zone_device *tz,
- struct power_allocator_params *params)
+/**
+ * get_governor_trips() - get the number of the two trip points that are key for this governor
+ * @tz:thermal zone to operate on
+ * @params:pointer to private data for this governor
+ *
+ * The power allocator governor works optimally with two trip points:
+ * a "switch on" trip point and a "maximum desired temperature".  These
+ * are defined as the first and last passive trip points.
+ *
+ * If there is only one trip point, then that's considered to be the
+ * "maximum desired temperature" trip point and the governor is always
+ * on.  If there are no passive or active trip points, then the
+ * governor won't do anything.  In fact, its throttle function
+ * won't be called at all.
+ */
+static void get_governor_trips(struct thermal_zone_device *tz,
+  struct power_allocator_params *params)
 {
-   int i, ret, last_passive;
+   int i, last_active, last_passive;
bool found_first_passive;
 
found_first_passive = false;
-   last_passive = -1;
-   ret = -EINVAL;
+   last_active = INVALID_TRIP;
+   last_passive = INVALID_TRIP;
 
for (i = 0; i < tz->trips; i++) {
enum thermal_trip_type type;
+   int ret;
 
ret = tz->ops->get_trip_type(tz, i, &type);
-   if (ret)
-   return ret;
+   if (ret) {
+   dev_warn(&tz->device,
+"Failed to get trip point %d type: %d\n", i,
+ret);
+   continue;
+   }
 
-   if (!found_first_passive) {
-   if (type == THERMAL_TRIP_PASSIVE) {
+   if (type == THERMAL_TRIP_PASSIVE) {
+   if (!found_first_passive) {
params->trip_switch_on = i;
found_first_passive = true;
+   } else  {
+   last_passive = i;
}
-   } else if (type == THERMAL_TRIP_PASSIVE) {
-   last_passive = i;
+   } else if (type == THERMAL_TRIP_ACTIVE) {
+   last_active = i;
} else {
break;
}
}
 
-   if (last_passive != -1) {
+   if (last_passive != INVALID_TRIP) {
params->trip_max_desired_temperature = last_passive;
-   ret = 0;
+   } else if (fou
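
The trip selection described in the get_governor_trips() kernel-doc
comment can be tried out in userspace with the sketch below.  It is an
illustration under assumptions, not the patch itself: pick_trips(), struct
governor_trips and enum trip_type are hypothetical names, and the fallback
to the last active trip when there is no passive trip is inferred from the
last_active bookkeeping visible in the hunk.

```c
#include <assert.h>
#include <stdbool.h>

#define INVALID_TRIP -1

/* Hypothetical mirror of the thermal trip types. */
enum trip_type { TRIP_ACTIVE, TRIP_PASSIVE, TRIP_HOT, TRIP_CRITICAL };

struct governor_trips {
	int switch_on;			/* INVALID_TRIP: governor always on */
	int max_desired_temperature;	/* INVALID_TRIP: nothing to control */
};

struct governor_trips pick_trips(const enum trip_type *types, int n)
{
	struct governor_trips t = { INVALID_TRIP, INVALID_TRIP };
	int last_active = INVALID_TRIP;
	int last_passive = INVALID_TRIP;
	bool found_first_passive = false;
	int i;

	assert(n >= 0);

	for (i = 0; i < n; i++) {
		if (types[i] == TRIP_PASSIVE) {
			if (!found_first_passive) {
				t.switch_on = i;
				found_first_passive = true;
			} else {
				last_passive = i;
			}
		} else if (types[i] == TRIP_ACTIVE) {
			last_active = i;
		} else {
			/* Stop at the first hot or critical trip. */
			break;
		}
	}

	if (last_passive != INVALID_TRIP) {
		/* Two or more passive trips: first and last are used. */
		t.max_desired_temperature = last_passive;
	} else if (found_first_passive) {
		/* One passive trip: it is the target, no switch-on trip. */
		t.max_desired_temperature = t.switch_on;
		t.switch_on = INVALID_TRIP;
	} else {
		/* Assumption: with no passive trip, use the last active one. */
		t.max_desired_temperature = last_active;
	}

	return t;
}
```

With the trips { passive, passive, critical } this returns switch_on 0 and
max_desired_temperature 1; with a single passive trip the switch-on trip
stays INVALID_TRIP.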

[PATCH v5 1/5] thermal: Add a function to get the minimum power

2015-09-07 Thread Javi Merino

The thermal core already has a function to get the maximum power of a
cooling device: power_actor_get_max_power().  Add a function to get the
minimum power of a cooling device.

Cc: Zhang Rui 
Cc: Eduardo Valentin 
Signed-off-by: Javi Merino 
---
 drivers/thermal/thermal_core.c | 28 
 include/linux/thermal.h|  6 ++
 2 files changed, 34 insertions(+)

diff --git a/drivers/thermal/thermal_core.c b/drivers/thermal/thermal_core.c
index 4ca211be4c0f..760204f0b63c 100644
--- a/drivers/thermal/thermal_core.c
+++ b/drivers/thermal/thermal_core.c
@@ -997,6 +997,34 @@ int power_actor_get_max_power(struct thermal_cooling_device *cdev,
 }
 
 /**
+ * power_actor_get_min_power() - get the minimum power that a cdev can consume
+ * @cdev:  pointer to &thermal_cooling_device
+ * @tz:a valid thermal zone device pointer
+ * @min_power: pointer in which to store the minimum power
+ *
+ * Calculate the minimum power consumption in milliwatts that the
+ * cooling device can currently consume and store it in @min_power.
+ *
+ * Return: 0 on success, -EINVAL if @cdev doesn't support the
+ * power_actor API or -E* on other error.
+ */
+int power_actor_get_min_power(struct thermal_cooling_device *cdev,
+			      struct thermal_zone_device *tz, u32 *min_power)
+{
+	unsigned long max_state;
+	int ret;
+
+	if (!cdev_is_power_actor(cdev))
+		return -EINVAL;
+
+	ret = cdev->ops->get_max_state(cdev, &max_state);
+	if (ret)
+		return ret;
+
+	return cdev->ops->state2power(cdev, tz, max_state, min_power);
+}
+
+/**
  * power_actor_set_power() - limit the maximum power that a cooling device can consume
  * @cdev:  pointer to &thermal_cooling_device
  * @instance:  thermal instance to update
diff --git a/include/linux/thermal.h b/include/linux/thermal.h
index 037e9df2f610..f99d934d373a 100644
--- a/include/linux/thermal.h
+++ b/include/linux/thermal.h
@@ -384,6 +384,8 @@ static inline bool cdev_is_power_actor(struct thermal_cooling_device *cdev)
 
 int power_actor_get_max_power(struct thermal_cooling_device *,
  struct thermal_zone_device *tz, u32 *max_power);
+int power_actor_get_min_power(struct thermal_cooling_device *,
+ struct thermal_zone_device *tz, u32 *min_power);
 int power_actor_set_power(struct thermal_cooling_device *,
  struct thermal_instance *, u32);
 struct thermal_zone_device *thermal_zone_device_register(const char *, int, int,
@@ -419,6 +421,10 @@ static inline bool cdev_is_power_actor(struct thermal_cooling_device *cdev)
 static inline int power_actor_get_max_power(struct thermal_cooling_device *cdev,
   struct thermal_zone_device *tz, u32 *max_power)
 { return 0; }
+static inline int power_actor_get_min_power(struct thermal_cooling_device *cdev,
+   struct thermal_zone_device *tz,
+   u32 *min_power)
+{ return -ENODEV; }
 static inline int power_actor_set_power(struct thermal_cooling_device *cdev,
  struct thermal_instance *tz, u32 power)
 { return 0; }
-- 
1.9.1
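
The idea behind the new helper is that the minimum power of a cooling
device is simply its power at the deepest cooling state, just as the
maximum is the power at state 0.  A small userspace sketch of that
relationship follows; struct cooling_dev, the power table and the helper
names are hypothetical, and the sketch assumes that a higher state always
means more throttling.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical cooling device: power_mw[state] is the power drawn at each
 * cooling state, where state 0 means no throttling and max_state means
 * the deepest throttling.
 */
struct cooling_dev {
	const uint32_t *power_mw;
	unsigned long max_state;
};

/* Translate a cooling state into power; -1 if the state is out of range. */
static int state2power(const struct cooling_dev *cdev, unsigned long state,
		       uint32_t *power)
{
	if (state > cdev->max_state)
		return -1;

	*power = cdev->power_mw[state];
	return 0;
}

/* Maximum power: the device is not throttled at all (state 0). */
int get_max_power(const struct cooling_dev *cdev, uint32_t *max_power)
{
	assert(cdev != NULL && max_power != NULL);
	return state2power(cdev, 0, max_power);
}

/* Minimum power: the device is throttled as far as it can go. */
int get_min_power(const struct cooling_dev *cdev, uint32_t *min_power)
{
	assert(cdev != NULL && min_power != NULL);
	return state2power(cdev, cdev->max_state, min_power);
}
```

For a device with the table { 3000, 2000, 1200, 600 } and max_state 3,
get_min_power() stores 600 and get_max_power() stores 3000.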


[PATCH v5 5/5] thermal: power_allocator: exit early if there are no cooling devices

2015-09-07 Thread Javi Merino

Don't waste cycles in the power allocator governor's throttle function
if there are no cooling devices and exit early.

This commit doesn't change any functionality, but should provide better
performance for the odd case of a thermal zone with trip points but
without cooling devices.

Cc: Zhang Rui 
Cc: Eduardo Valentin 
Signed-off-by: Javi Merino 
---
 drivers/thermal/power_allocator.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/drivers/thermal/power_allocator.c b/drivers/thermal/power_allocator.c
index 78d589e7e65f..8a0d801ed29b 100644
--- a/drivers/thermal/power_allocator.c
+++ b/drivers/thermal/power_allocator.c
@@ -346,6 +346,11 @@ static int allocate_power(struct thermal_zone_device *tz,
 		}
 	}
 
+	if (!num_actors) {
+		ret = -ENODEV;
+		goto unlock;
+	}
+
 	/*
 	 * We need to allocate five arrays of the same size:
 	 * req_power, max_power, granted_power, extra_actor_power and
-- 
1.9.1


[PATCH v5 2/5] thermal: power_allocator: relax the requirement of a sustainable_power in tzp

2015-09-07 Thread Javi Merino

The power allocator governor currently requires that a sustainable power
is passed as part of the thermal zone's thermal zone parameters.  If
that parameter is not provided, it doesn't register with the thermal
zone.

While this parameter is strongly recommended for optimal performance, it
doesn't need to be mandatory.  Relax the requirement and allow the
governor to bind to thermal zones that don't provide it by estimating it
from the cooling devices' power model.

Cc: Zhang Rui 
Cc: Eduardo Valentin 
Signed-off-by: Javi Merino 
---
 drivers/thermal/power_allocator.c | 125 ++
 1 file changed, 100 insertions(+), 25 deletions(-)

diff --git a/drivers/thermal/power_allocator.c b/drivers/thermal/power_allocator.c
index 251676902869..7fa6685f9c5b 100644
--- a/drivers/thermal/power_allocator.c
+++ b/drivers/thermal/power_allocator.c
@@ -73,6 +73,88 @@ struct power_allocator_params {
 };
 
 /**
+ * estimate_sustainable_power() - Estimate the sustainable power of a thermal zone
+ * @tz: thermal zone we are operating in
+ *
+ * For thermal zones that don't provide a sustainable_power in their
+ * thermal_zone_params, estimate one.  Calculate it using the minimum
+ * power of all the cooling devices as that gives a valid value that
+ * can give some degree of functionality.  For optimal performance of
+ * this governor, provide a sustainable_power in the thermal zone's
+ * thermal_zone_params.
+ */
+static u32 estimate_sustainable_power(struct thermal_zone_device *tz)
+{
+   u32 sustainable_power = 0;
+   struct thermal_instance *instance;
+   struct power_allocator_params *params = tz->governor_data;
+
+   list_for_each_entry(instance, &tz->thermal_instances, tz_node) {
+   struct thermal_cooling_device *cdev = instance->cdev;
+   u32 min_power;
+
+   if (instance->trip != params->trip_max_desired_temperature)
+   continue;
+
+   if (power_actor_get_min_power(cdev, tz, &min_power))
+   continue;
+
+   sustainable_power += min_power;
+   }
+
+   return sustainable_power;
+}
+
+/**
+ * estimate_pid_constants() - Estimate the constants for the PID controller
+ * @tz:thermal zone for which to estimate the constants
+ * @sustainable_power: sustainable power for the thermal zone
+ * @trip_switch_on:trip point number for the switch on temperature
+ * @control_temp:  target temperature for the power allocator governor
+ * @force: whether to force the update of the constants
+ *
+ * This function is used to update the estimation of the PID
+ * controller constants in struct thermal_zone_params.
+ * Sustainable power is provided in case it was estimated.  The
+ * estimated sustainable_power should not be stored in the
+ * thermal_zone_params so it has to be passed explicitly to this
+ * function.
+ *
+ * If @force is not set, the values in the thermal zone's parameters
+ * are preserved if they are not zero.  If @force is set, the values
+ * in thermal zone's parameters are overwritten.
+ */
+static void estimate_pid_constants(struct thermal_zone_device *tz,
+  u32 sustainable_power, int trip_switch_on,
+  unsigned long control_temp, bool force)
+{
+   int ret;
+   unsigned long switch_on_temp;
+   u32 temperature_threshold;
+
+   ret = tz->ops->get_trip_temp(tz, trip_switch_on, &switch_on_temp);
+   if (ret)
+   switch_on_temp = 0;
+
+   temperature_threshold = control_temp - switch_on_temp;
+
+   if (!tz->tzp->k_po || force)
+   tz->tzp->k_po = int_to_frac(sustainable_power) /
+   temperature_threshold;
+
+   if (!tz->tzp->k_pu || force)
+   tz->tzp->k_pu = int_to_frac(2 * sustainable_power) /
+   temperature_threshold;
+
+   if (!tz->tzp->k_i || force)
+   tz->tzp->k_i = int_to_frac(10) / 1000;
+   /*
+* The default for k_d and integral_cutoff is 0, so we can
+* leave them as they are.
+*/
+}
+
+/**
  * pid_controller() - PID controller
  * @tz:thermal zone we are operating in
  * @current_temp:  the current temperature in millicelsius
@@ -98,10 +180,20 @@ static u32 pid_controller(struct thermal_zone_device *tz,
 {
s64 p, i, d, power_range;
s32 err, max_power_frac;
+   u32 sustainable_power;
struct power_allocator_params *params = tz->governor_data;
 
max_power_frac = int_to_frac(max_allocatable_power);
 
+   if (tz->tzp->sustainable_power) {
+   sustainable_power = tz->tzp->sustainable_power;
+   } else {
+   sustainable_power = estimate_sustainable_power(tz);
+
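
To make the fixed-point arithmetic in estimate_pid_constants() easier to
follow, here is a userspace sketch that reproduces the three formulas with
the same FRAC_BITS and int_to_frac() definitions.  struct pid_constants and
estimate_constants() are hypothetical names, and the guard against a zero
or negative temperature window is an addition of this sketch, not
something the hunk above does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FRAC_BITS 10
#define int_to_frac(x) ((x) << FRAC_BITS)

/* Hypothetical container for the estimated constants (fixed point). */
struct pid_constants {
	int32_t k_po;
	int32_t k_pu;
	int32_t k_i;
};

/*
 * Temperatures are in millicelsius, power in milliwatts.  Returns 0 on
 * success or -1 if control_temp is not above switch_on_temp (this guard
 * is an assumption of the sketch to avoid dividing by zero).
 */
int estimate_constants(uint32_t sustainable_power, int32_t switch_on_temp,
		       int32_t control_temp, struct pid_constants *out)
{
	int64_t threshold = (int64_t)control_temp - switch_on_temp;

	assert(out != NULL);

	if (threshold <= 0)
		return -1;

	/* Proportional constant while over the target temperature. */
	out->k_po = (int32_t)(int_to_frac((int64_t)sustainable_power) /
			      threshold);
	/* Proportional constant while under the target temperature. */
	out->k_pu = (int32_t)(int_to_frac(2 * (int64_t)sustainable_power) /
			      threshold);
	/* Integral constant: 10 / 1000 in fixed point. */
	out->k_i = int_to_frac(10) / 1000;

	return 0;
}
```

For example, with a sustainable power of 2500 mW, a switch-on temperature
of 70000 and a control temperature of 85000, the window is 15000 and the
results are k_po = 170, k_pu = 341 and k_i = 10, all scaled by 1024.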

Re: [PATCH v5 3/5] thermal: Add devfreq cooling

2015-09-09 Thread Javi Merino

On Wed, Sep 09, 2015 at 06:10:22AM +0100, Eduardo Valentin wrote:
> Hi

Hi Eduardo,

> On Thu, Aug 27, 2015 at 11:55:49AM +0100, Javi Merino wrote:
> > From: Ørjan Eide 
> > 
> > Add a generic thermal cooling device for devfreq, that is similar to
> > cpu_cooling.
> > 
> > The device must use devfreq.  In order to use the power extension of the
> > cooling device, it must have registered its OPPs using the OPP library.
> > 
> > Cc: Zhang Rui 
> > Cc: Eduardo Valentin 
> > Signed-off-by: Javi Merino 
> > Signed-off-by: Ørjan Eide 
> 
> Thanks for taking this to upstream kernel.
> 
> Just minor comments as follows.
> 
> > ---
> > 
> > I had a look at 02373d7c69b4 ("thermal: cpu_cooling: fix lockdep
> > problems in cpu_cooling").  It doesn't affect devfreq cooling because
> > we don't have notifiers, we only use locking for idr.
> 
> Thanks once again for checking it.
> 
> >  drivers/thermal/Kconfig   |  11 +
> >  drivers/thermal/Makefile  |   3 +
> >  drivers/thermal/devfreq_cooling.c | 546 ++
> >  include/linux/devfreq_cooling.h   |  72 +
> >  4 files changed, 632 insertions(+)
> >  create mode 100644 drivers/thermal/devfreq_cooling.c
> >  create mode 100644 include/linux/devfreq_cooling.h
> > 
> > diff --git a/drivers/thermal/Kconfig b/drivers/thermal/Kconfig
> > index 118938ee8552..a2c6a6497804 100644
> > --- a/drivers/thermal/Kconfig
> > +++ b/drivers/thermal/Kconfig
> > @@ -147,6 +147,17 @@ config CLOCK_THERMAL
> >   device that is configured to use this cooling mechanism will be
> >   controlled to reduce clock frequency whenever temperature is high.
> >  
> > +config DEVFREQ_THERMAL
> > +   bool "Generic device cooling support"
> > +   depends on PM_DEVFREQ
> > +   depends on PM_OPP
> > +   help
> > + This implements the generic devfreq cooling mechanism through
> > + frequency reduction for devices using devfreq.
> > +
> > + This will throttle the device by limiting the maximum allowed DVFS
> > + frequency corresponding to the cooling level.
> > +
> >   If you want this support, you should say Y here.
> >  
> >  config THERMAL_EMULATION
> > diff --git a/drivers/thermal/Makefile b/drivers/thermal/Makefile
> > index 535dfee1496f..45f26978ff74 100644
> > --- a/drivers/thermal/Makefile
> > +++ b/drivers/thermal/Makefile
> > @@ -22,6 +22,9 @@ thermal_sys-$(CONFIG_CPU_THERMAL) += cpu_cooling.o
> >  # clock cooling
> >  thermal_sys-$(CONFIG_CLOCK_THERMAL)+= clock_cooling.o
> >  
> > +# devfreq cooling
> > +thermal_sys-$(CONFIG_DEVFREQ_THERMAL) += devfreq_cooling.o
> > +
> >  # platform thermal drivers
> >  obj-$(CONFIG_QCOM_SPMI_TEMP_ALARM) += qcom-spmi-temp-alarm.o
> >  obj-$(CONFIG_SPEAR_THERMAL)+= spear_thermal.o
> > diff --git a/drivers/thermal/devfreq_cooling.c b/drivers/thermal/devfreq_cooling.c
> > new file mode 100644
> > index 000000000000..3d4abc746099
> > --- /dev/null
> > +++ b/drivers/thermal/devfreq_cooling.c
> > @@ -0,0 +1,546 @@
> > +/*
> > + * devfreq_cooling: Thermal cooling device implementation for devices using
> > + *  devfreq
> > + *
> > + * Copyright (C) 2014-2015 ARM Limited
> > + *
> > + * This program is free software; you can redistribute it and/or modify
> > + * it under the terms of the GNU General Public License version 2 as
> > + * published by the Free Software Foundation.
> > + *
> > + * This program is distributed "as is" WITHOUT ANY WARRANTY of any
> > + * kind, whether express or implied; without even the implied warranty
> > + * of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> > + * GNU General Public License for more details.
> > + *
> > + * TODO:
> > + *- If OPPs are added or removed after devfreq cooling has
> > + *  registered, the devfreq cooling won't react to it.
> > + */
> > +
> > +#include 
> > +#include 
> > +#include 
> > +#include 
> > +#include 
> > +#include 
> > +
> > +static DEFINE_MUTEX(devfreq_lock);
> > +static DEFINE_IDR(devfreq_idr);
> > +
> > +/**
> > + * struct devfreq_cooling_device - Devfreq cooling device
> > + * @id:unique integer value corresponding to each
> > + * devfreq_cooling_device registered.
> > + * @cdev:  Pointer to associated thermal cooling device.
> &

Re: [PATCH 3/3] Thermal: do thermal zone update after a cooling device registered

2015-10-14 Thread Javi Merino

On Mon, Oct 12, 2015 at 09:23:28AM +0000, Chen, Yu C wrote:
> Hi, Javi
> Sorry for my late response,
> 
> > -Original Message-
> > From: Javi Merino [mailto:javi.mer...@arm.com]
> > Sent: Wednesday, September 30, 2015 12:02 AM
> > To: Chen, Yu C
> > Cc: linux...@vger.kernel.org; edubez...@gmail.com; Zhang, Rui; linux-
> > ker...@vger.kernel.org; sta...@vger.kernel.org
> > Subject: Re: [PATCH 3/3] Thermal: do thermal zone update after a cooling
> > device registered
> > 
> > Hi Yu,
> > 
> > On Mon, Sep 28, 2015 at 06:52:00PM +0100, Chen, Yu C wrote:
> > > Hi, Javi,
> > >
> > > > -Original Message-
> > > > From: Javi Merino [mailto:javi.mer...@arm.com]
> > > > Sent: Monday, September 28, 2015 10:29 PM
> > > > To: Chen, Yu C
> > > > Cc: linux...@vger.kernel.org; edubez...@gmail.com; Zhang, Rui;
> > > > linux- ker...@vger.kernel.org; sta...@vger.kernel.org
> > > > Subject: Re: [PATCH 3/3] Thermal: do thermal zone update after a
> > > > cooling device registered
> > > >
> > > > On Sun, Sep 27, 2015 at 06:48:44AM +0100, Chen Yu wrote:
> > > > > From: Zhang Rui 
> > > > >
> > > > >
> > > >
> > > > I think you need to hold cdev->lock here, to make sure that no
> > > > thermal zone is added or removed from cdev->thermal_instances while
> > you are looping.
> > > >
> > > Ah right, will add. If I add the cdev ->lock here, will there be a
> > > AB-BA lock with thermal_zone_unbind_cooling_device?
> > 
> > You're right, it could lead to a deadlock.  The locks can't be swapped
> > because that won't work in step_wise.
> > 
> > The best way that I can think of accessing thermal_instances atomically
> > is by making it RCU protected instead of with mutexes.
> > What do you think?
> > 
> RCU would need extra spinlocks to protect the list, and need to sync_rcu
> after we delete one instance from thermal_instance list,  I think it is
> too complicated for me to rewrite: (
> How about using thermal_list_lock instead of cdev ->lock?
> This guy should be big enough to protect the device.thermal_instance list.

thermal_list_lock protects thermal_tz_list and thermal_cdev_list, but
it doesn't protect the thermal_instances list.  For example,
thermal_zone_bind_cooling_device() adds a cooling device to the
cdev->thermal_instances list without taking thermal_list_lock.

To sum up, you have to protect accessing the cdev->thermal_instances
list but with the current locking scheme, you would create an AB-BA
deadlock.  As I see it you would have to change the locking scheme to
either RCU or add a new mutex that protects the
cdev->thermal_instances and tz->thermal_instances lists and change all
accesses to them to make sure they comply with the new locking scheme.

Is there a better way of solving this?  Cheers,
Javi
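
To illustrate the second option (one new mutex covering both instance
lists), here is a rough userspace sketch using pthreads.  Everything in it
is hypothetical: instances_lock, the fixed-size lists and the helper names
are made up for the example, and the real lists are linked lists inside
the cooling device and thermal zone structures.  The point is only that
every writer and every reader takes the same single lock, so there is no
second lock to acquire in the wrong order.

```c
#include <assert.h>
#include <pthread.h>

#define MAX_INSTANCES 8

/* Hypothetical fixed-size stand-in for a thermal_instances list. */
struct instance_list {
	int ids[MAX_INSTANCES];
	int count;
};

/* One lock protects both lists, so no AB-BA ordering can arise. */
static pthread_mutex_t instances_lock = PTHREAD_MUTEX_INITIALIZER;
static struct instance_list cdev_instances;
static struct instance_list tz_instances;

/* Add an instance to both lists atomically; -1 if the lists are full. */
int bind_instance(int id)
{
	int ret = -1;

	pthread_mutex_lock(&instances_lock);
	if (cdev_instances.count < MAX_INSTANCES &&
	    tz_instances.count < MAX_INSTANCES) {
		cdev_instances.ids[cdev_instances.count++] = id;
		tz_instances.ids[tz_instances.count++] = id;
		ret = 0;
	}
	pthread_mutex_unlock(&instances_lock);

	return ret;
}

/* Walk the cooling device's list under the same lock as the writers. */
int sum_cdev_instances(void)
{
	int i, sum = 0;

	pthread_mutex_lock(&instances_lock);
	/* Both lists are only ever changed together under the lock. */
	assert(cdev_instances.count == tz_instances.count);
	for (i = 0; i < cdev_instances.count; i++)
		sum += cdev_instances.ids[i];
	pthread_mutex_unlock(&instances_lock);

	return sum;
}
```

The cost of this scheme is that all accesses to either list serialize on
one lock, which is why RCU is the other candidate.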



Re: [PATCH 3/3] Thermal: do thermal zone update after a cooling device registered

2015-10-15 Thread Javi Merino

On Wed, Oct 14, 2015 at 07:23:55PM +0000, Chen, Yu C wrote:
> > -Original Message-
> > From: Javi Merino [mailto:javi.mer...@arm.com]
> > Sent: Thursday, October 15, 2015 1:08 AM
> > To: Chen, Yu C
> > Cc: linux...@vger.kernel.org; edubez...@gmail.com; Zhang, Rui; linux-
> > ker...@vger.kernel.org; sta...@vger.kernel.org; Pandruvada, Srinivas
> > Subject: Re: [PATCH 3/3] Thermal: do thermal zone update after a cooling
> > device registered
> > 
> > On Mon, Oct 12, 2015 at 09:23:28AM +0000, Chen, Yu C wrote:
> > > Hi, Javi
> > > Sorry for my late response,
> > >
> > > > -Original Message-
> > > > From: Javi Merino [mailto:javi.mer...@arm.com]
> > > > Sent: Wednesday, September 30, 2015 12:02 AM
> > > > To: Chen, Yu C
> > > > Cc: linux...@vger.kernel.org; edubez...@gmail.com; Zhang, Rui;
> > > > linux- ker...@vger.kernel.org; sta...@vger.kernel.org
> > > > Subject: Re: [PATCH 3/3] Thermal: do thermal zone update after a
> > > > cooling device registered
> > > >
> > > > Hi Yu,
> > > >
> > > > On Mon, Sep 28, 2015 at 06:52:00PM +0100, Chen, Yu C wrote:
> > > > > Hi, Javi,
> > > > >
> > > > > > -Original Message-
> > > > > > From: Javi Merino [mailto:javi.mer...@arm.com]
> > > > > > Sent: Monday, September 28, 2015 10:29 PM
> > > > > > To: Chen, Yu C
> > > > > > Cc: linux...@vger.kernel.org; edubez...@gmail.com; Zhang, Rui;
> > > > > > linux- ker...@vger.kernel.org; sta...@vger.kernel.org
> > > > > > Subject: Re: [PATCH 3/3] Thermal: do thermal zone update after a
> > > > > > cooling device registered
> > > > > >
> > > > > > On Sun, Sep 27, 2015 at 06:48:44AM +0100, Chen Yu wrote:
> > > > > > > From: Zhang Rui 
> > > > > > >
> > > > > > >
> > > > > >
> > > > > > I think you need to hold cdev->lock here, to make sure that no
> > > > > > thermal zone is added or removed from cdev->thermal_instances
> > > > > > while
> > > > you are looping.
> > > > > >
> > > > > Ah right, will add. If I add the cdev ->lock here, will there be a
> > > > > AB-BA lock with thermal_zone_unbind_cooling_device?
> > > >
> > > > You're right, it could lead to a deadlock.  The locks can't be
> > > > swapped because that won't work in step_wise.
> > > >
> > > > The best way that I can think of accessing thermal_instances
> > > > atomically is by making it RCU protected instead of with mutexes.
> > > > What do you think?
> > > >
> > > RCU would need extra spinlocks to protect the list, and need to
> > > sync_rcu after we delete one instance from thermal_instance list,  I
> > > think it is too complicated for me to rewrite: ( How about using
> > thermal_list_lock instead of cdev ->lock?
> > > This guy should be big enough to protect the device.thermal_instance list.
> > 
> > thermal_list_lock protects thermal_tz_list and thermal_cdev_list, but it
> > doesn't protect the thermal_instances list.  For example,
> > thermal_zone_bind_cooling_device() adds a cooling device to the
> > cdev->thermal_instances list without taking thermal_tz_list.
> > 
> Before thermal_zone_bind_cooling_device is invoked,
> the thermal_list_lock will be firstly gripped:
> 
> static void bind_cdev(struct thermal_cooling_device *cdev)
> {
> mutex_lock(&thermal_list_lock);
> either tz->ops->bind:   thermal_zone_bind_cooling_device
> or __bind()  :   thermal_zone_bind_cooling_device
> mutex_unlock(&thermal_list_lock);
> }
> 
> And it is the same as in  passive_store.
> So when code is trying to add/delete thermal_instance of cdev,
> he has already hold thermal_list_lock IMO. Or do I miss anything?

thermal_zone_bind_cooling_device() is exported, so you can't really
rely on the static thermal_list_lock being acquired in every single
call.

thermal_list_lock protects the lists thermal_tz_list and
thermal_cdev_list.  Making it implicitly protect the cooling device's
and thermal zone device's instances list because no sensible code
would call thermal_zone_bind_cooling_device() outside of a bind
function is just asking for trouble.

Locking is hard to understand and easy to get wrong so let's keep it
simple.

Cheers,
Javi

Re: [PATCH v4 0/5] Let the power allocator thermal governor run on any thermal zone

2015-09-02 Thread Javi Merino

On Wed, Aug 26, 2015 at 02:26:39PM +0100, Javi Merino wrote:
> Relax the thermal governor requirements of sustainable_power and at
> least two trip points so that it can be bound to any thermal zone.
> Its behavior won't be optimal, it would be the best it can with the
> data provided.
> 
> Changes since v3:
>- Don't hardcode a value for sustainable power and re-estimate
>  the PID controllers every time if no sustainable power is given
>  as suggested by Eduardo Valentin.
>- power_actor_get_min_power() moved to a patch of its own.
> 
> Changes since v2:
>   - Typos suggested by Daniel Kurtz
> 
> Changes since v1:
>   - Let the power allocator governor operate if the thermal zone
>     doesn't have tzp as suggested by Chung-yih Wang
> 
> Javi Merino (5):
>   thermal: Add a function to get the minimum power
>   thermal: power_allocator: relax the requirement of a sustainable_power
>     in tzp
>   thermal: power_allocator: relax the requirement of two passive trip
>     points
>   thermal: power_allocator: don't require tzp to be present for the
>     thermal zone
>   thermal: power_allocator: exit early if there are no cooling devices
> 
>  Documentation/thermal/power_allocator.txt |   2 +-
>  drivers/thermal/power_allocator.c         | 241 ++
>  drivers/thermal/thermal_core.c            |  28
>  include/linux/thermal.h                   |   6 +
>  4 files changed, 212 insertions(+), 65 deletions(-)

Gentle ping.

Re: [PATCH 3/3] Thermal: do thermal zone update after a cooling device registered

2015-10-20 Thread Javi Merino

Hi Yu,

On Tue, Oct 20, 2015 at 01:44:20AM +0000, Chen, Yu C wrote:
> > -Original Message-
> > From: Javi Merino [mailto:javi.mer...@arm.com]
> > Sent: Thursday, October 15, 2015 10:05 PM
> > To: Chen, Yu C
> > Cc: linux...@vger.kernel.org; edubez...@gmail.com; Zhang, Rui; linux- 
> > ker...@vger.kernel.org; sta...@vger.kernel.org; Pandruvada, Srinivas
> > Subject: Re: [PATCH 3/3] Thermal: do thermal zone update after a 
> > cooling device registered
> > 
> > On Wed, Oct 14, 2015 at 07:23:55PM +0000, Chen, Yu C wrote:
> > > > -Original Message-
> > > > From: Javi Merino [mailto:javi.mer...@arm.com]
> > > > Sent: Thursday, October 15, 2015 1:08 AM
> > > > To: Chen, Yu C
> > > > Cc: linux...@vger.kernel.org; edubez...@gmail.com; Zhang, Rui;
> > > > linux- ker...@vger.kernel.org; sta...@vger.kernel.org; Pandruvada, 
> > > > Srinivas
> > > > Subject: Re: [PATCH 3/3] Thermal: do thermal zone update after a 
> > > > cooling device registered
> > > >
> > > > On Mon, Oct 12, 2015 at 09:23:28AM +0000, Chen, Yu C wrote:
> > > > > Hi, Javi
> > > > > Sorry for my late response,
> > > > >
> > > > > > -Original Message-
> > > > > > From: Javi Merino [mailto:javi.mer...@arm.com]
> > > > > > Sent: Wednesday, September 30, 2015 12:02 AM
> > > > > > To: Chen, Yu C
> > > > > > Cc: linux...@vger.kernel.org; edubez...@gmail.com; Zhang, Rui;
> > > > > > linux- ker...@vger.kernel.org; sta...@vger.kernel.org
> > > > > > Subject: Re: [PATCH 3/3] Thermal: do thermal zone update after 
> > > > > > a cooling device registered
> > > > > >
> > > > > > Hi Yu,
> > > > > >
> > > > > > On Mon, Sep 28, 2015 at 06:52:00PM +0100, Chen, Yu C wrote:
> > > > > > > Hi, Javi,
> > > > > > >
> > > > > > > > -Original Message-
> > > > > > > > From: Javi Merino [mailto:javi.mer...@arm.com]
> > > > > > > > Sent: Monday, September 28, 2015 10:29 PM
> > > > > > > > To: Chen, Yu C
> > > > > > > > Cc: linux...@vger.kernel.org; edubez...@gmail.com; Zhang, 
> > > > > > > > Rui;
> > > > > > > > linux- ker...@vger.kernel.org; sta...@vger.kernel.org
> > > > > > > > Subject: Re: [PATCH 3/3] Thermal: do thermal zone update 
> > > > > > > > after a cooling device registered
> > > > > > > >
> > > > > > > > On Sun, Sep 27, 2015 at 06:48:44AM +0100, Chen Yu wrote:
> > > > > > > > > From: Zhang Rui 
> > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > > > > I think you need to hold cdev->lock here, to make sure
> > > > > > > > > that no thermal zone is added or removed from
> > > > > > > > > cdev->thermal_instances while you are looping.
> > > > > > > >
> > > > > > > > Ah right, will add. If I add the cdev->lock here, will
> > > > > > > > there be an AB-BA deadlock with thermal_zone_unbind_cooling_device?
> > > > > >
> > > > > > You're right, it could lead to a deadlock.  The locks can't be 
> > > > > > swapped because that won't work in step_wise.
> > > > > >
> > > > > > The best way that I can think of accessing thermal_instances 
> > > > > > atomically is by making it RCU protected instead of with mutexes.
> > > > > > What do you think?
> > > > > >
> > > > > > RCU would need extra spinlocks to protect the list, and need to
> > > > > > sync_rcu after we delete one instance from thermal_instance
> > > > > > list, I think it is too complicated for me to rewrite :( How
> > > > > > about using thermal_list_lock instead of cdev->lock?
> > > > > > This guy should be big enough to protect the
> > > > > > device.thermal_instance list.
> > > >
> > > > thermal_list_lock protects thermal_tz_list and thermal_cdev_list, 
> > > > but it doesn't protect the thermal_instances list.  For example,
> > >

Re: [PATCH] thermal: avoid division by zero in power allocator

2015-10-01 Thread Javi Merino

On Tue, Sep 29, 2015 at 09:33:30PM +0100, Andrew Morton wrote:
> On Mon, 28 Sep 2015 23:28:34 +0200 Andrea Arcangeli wrote:
> 
> > During boot I get a div by zero Oops regression starting in v4.3-rc3.
> > 
> > ...
> >
> > --- a/drivers/thermal/power_allocator.c
> > +++ b/drivers/thermal/power_allocator.c
> > @@ -144,6 +144,16 @@ static void estimate_pid_constants(struct thermal_zone_device *tz,
> > switch_on_temp = 0;
> >  
> > temperature_threshold = control_temp - switch_on_temp;
> > +   /*
> > +* estimate_pid_constants() tries to find appropriate default
> > +* values for thermal zones that don't provide them. If a
> > +* system integrator has configured a thermal zone with two
> > +* passive trip points at the same temperature, that person
> > +* hasn't put any effort to set up the thermal zone properly
> > +* so just give up.
> > +*/
> > +   if (!temperature_threshold)
> > +   return;
> >  
> > if (!tz->tzp->k_po || force)
> > tz->tzp->k_po = int_to_frac(sustainable_power) /
> 
> a) Are we sure this won't leave tz->tzp fields uninitialized?

They will be all zeros.  That's good enough.

> b) I'm not understanding that code at all.  The "proportional" term
>in a PID controller is supposed to be proportional to the (desired -
>actual) difference (aka "the error").
> 
>But estimate_pid_constants() appears to be setting the
>"proportional" term to be proportional to 1/error!

estimate_pid_constants() calculates the constants that you use in the
PID algorithm.  Say:

k_p * error + k_i * integral_of_error + k_d * diff_of_error

This code is calculating a reasonable k_p, k_i and k_d when they are
not provided by the platform.

>Maybe a description of local `temperature_threshold' would help
>clue me in.

The `error' in the above definition is:

target_temperature - current_temperature

whereas `temperature_threshold' is:

`target_temperature' - `switch_on_temperature'

`switch_on_temperature' is the temperature above which the thermal
governor starts operating and throttling cpus (or whatever cooling
device is configured).

The `switch_on_temperature' and `target_temperature' are defined using
trip points.  A platform that sets two trip points to the same
temperature is not properly configured.  With Andrea's patch we
provide degraded behavior instead of crashing.  I agree with that
approach (hence my Reviewed-by, maybe it should be an Acked-by?).
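To make the relationship between `temperature_threshold' and the
proportional constants concrete, here is a minimal user-space sketch.
It is not the kernel code: `struct pid_params' and
estimate_pid_defaults() are hypothetical names, and the fixed-point
helper only mimics the kernel's int_to_frac() under the assumption of
10 fractional bits.  It shows why a zero threshold has to be rejected
before dividing, and that the constants stay at zero in that case.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 10 fractional bits for the fixed-point representation. */
#define FRAC_BITS 10
#define int_to_frac(x) ((x) * (1 << FRAC_BITS))

/* Hypothetical stand-in for the proportional constants of a thermal zone. */
struct pid_params {
	int32_t k_po;	/* proportional constant while overshooting */
	int32_t k_pu;	/* proportional constant while undershooting */
};

/*
 * Estimate default proportional constants from the distance between
 * the two passive trip points.  Returns 0 on success, -1 if both trip
 * points are at the same temperature, in which case *p is untouched.
 */
int estimate_pid_defaults(struct pid_params *p, int32_t sustainable_power,
			  int switch_on_temp, int control_temp)
{
	int temperature_threshold = control_temp - switch_on_temp;

	assert(p);

	/* Same guard as in the patch: never divide by zero. */
	if (!temperature_threshold)
		return -1;

	p->k_po = int_to_frac(sustainable_power) / temperature_threshold;
	p->k_pu = int_to_frac(2 * sustainable_power) / temperature_threshold;

	return 0;
}
```

With both trip points at the same temperature the function bails out
and the zero-initialised constants are left alone, which is the
degraded behavior described above.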

Cheers,
Javi

Re: [RFC PATCH 0/7] Introduce thermal pressure

2018-10-09 Thread Javi Merino

On Tue, Oct 09, 2018 at 12:24:55PM -0400, Thara Gopinath wrote:
> Thermal governors can respond to an overheat event for a cpu by
> capping the cpu's maximum possible frequency. This in turn
> means that the maximum available compute capacity of the
> cpu is restricted. But today in linux kernel, in event of maximum
> frequency capping of a cpu, the maximum available compute
> capacity of the cpu is not adjusted at all. In other words, scheduler
> is unaware of maximum cpu capacity restrictions placed due to thermal
> activity.

Interesting, I would have sworn that I tested this years ago by
lowering the maximum frequency of a cpufreq domain, and the scheduler
reacted accordingly to the new maximum capacities of the cpus.

>   This patch series attempts to address this issue.
> The benefits identified are better task placement among available
> cpus in event of overheating which in turn leads to better
> performance numbers.
> 
> The delta between the maximum possible capacity of a cpu and
> maximum available capacity of a cpu due to thermal event can
> be considered as thermal pressure. Instantaneous thermal pressure
> is hard to record and can sometime be erroneous as there can be mismatch
> between the actual capping of capacity and scheduler recording it.
> Thus solution is to have a weighted average per cpu value for thermal
> pressure over time. The weight reflects the amount of time the cpu has
> spent at a capped maximum frequency. To accumulate, average and
> appropriately decay thermal pressure, this patch series uses pelt
> signals and reuses the available framework that does a similar
> bookkeeping of rt/dl task utilization.
> 
> Regarding testing, basic build, boot and sanity testing have been
> performed on hikey960 mainline kernel with debian file system.
> Further aobench (An occlusion renderer for benchmarking realworld
> floating point performance) showed the following results on hikey960
> with debian.
> 
>                                          Result       Standard  Standard
>                                          (Time secs)  Error     Deviation
> Hikey 960 - no thermal pressure applied  138.67       6.52      11.52%
> Hikey 960 -  thermal pressure applied    122.37       5.78      11.57%
> 
> Thara Gopinath (7):
>   sched/pelt: Add option to make load and util calculations frequency
> invariant
>   sched/pelt.c: Add support to track thermal pressure
>   sched: Add infrastructure to store and update instantaneous thermal
> pressure
>   sched: Initialize per cpu thermal pressure structure
>   sched/fair: Enable CFS periodic tick to update thermal pressure
>   sched/fair: update cpu_capcity to reflect thermal pressure
>   thermal/cpu-cooling: Update thermal pressure in case of a maximum
> frequency capping
> 
>  drivers/base/arch_topology.c  |  1 +
>  drivers/thermal/cpu_cooling.c | 20 -

thermal?  There are other ways in which the maximum frequency of a cpu
can be limited, for example from userspace via scaling_max_freq.

When something (anything) changes the maximum frequency of a cpufreq
policy, the scheduler should be notified.  I think this change should
be done in cpufreq instead to make it generic and not particular to
a given maximum frequency "capper".
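As a rough illustration of why the source of the cap doesn't matter,
here is a small user-space sketch.  The function names are made up for
this example and it assumes that capacity scales linearly with the
maximum allowed frequency.  Whether the limit comes from a thermal
governor or from scaling_max_freq, the arithmetic is the same.

```c
#include <assert.h>

/*
 * Capacity left once the maximum frequency of a cpu is capped.
 * Assumption: capacity is proportional to frequency.
 */
unsigned long capped_capacity(unsigned long max_capacity,
			      unsigned long capped_freq_khz,
			      unsigned long max_freq_khz)
{
	assert(max_freq_khz > 0 && capped_freq_khz <= max_freq_khz);

	return max_capacity * capped_freq_khz / max_freq_khz;
}

/* Capacity lost to the cap, regardless of who imposed it. */
unsigned long capacity_pressure(unsigned long max_capacity,
				unsigned long capped_freq_khz,
				unsigned long max_freq_khz)
{
	return max_capacity - capped_capacity(max_capacity, capped_freq_khz,
					      max_freq_khz);
}
```

For example, a cpu with capacity 1024 capped to half of its maximum
frequency is left with a capacity of 512 in this model.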

Cheers,
Javi

Re: [RFC PATCH 6/7] sched/fair: update cpu_capcity to reflect thermal pressure

2018-10-09 Thread Javi Merino

On Tue, Oct 09, 2018 at 12:25:01PM -0400, Thara Gopinath wrote:
> cpu_capacity reflects the maximum available capacity of a cpu. Thermal
> pressure on a cpu means this maximum available capacity is reduced. This
> patch reduces the average thermal pressure for a cpu from its maximum
> available capacity so that cpu_capacity reflects the actual
> available capacity.
> 
> Signed-off-by: Thara Gopinath 
> ---
>  kernel/sched/fair.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 7deb1d0..8651e55 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -7497,6 +7497,7 @@ static unsigned long scale_rt_capacity(int cpu)
>  
>   used = READ_ONCE(rq->avg_rt.util_avg);
>   used += READ_ONCE(rq->avg_dl.util_avg);
> + used += READ_ONCE(rq->avg_thermal.load_avg);

IIUC, you are treating thermal pressure as an artificial load on the
cpu.  If so, this sounds like a hard to maintain hack.  Thermal
pressure has different characteristics from utilization.  What happens
if thermal sets the cpu cooling state back to 0 because there is
thermal headroom again?  Do we keep adding this artificial load to the
cpu just because there was thermal pressure in the past and let it
decay as if it was cpu load?
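The concern can be sketched with a toy model.  This is not the
kernel's PELT implementation: decay_step() is a hypothetical helper
that halves the history on every step, so one step stands for roughly
one half-life of the average.  remaining_capacity() only mirrors the
idea of subtracting `used' from the maximum capacity.

```c
#include <assert.h>

/*
 * One step of a simplified decaying average: the old value and the
 * new sample are weighted equally, so history halves on each step.
 */
unsigned long decay_step(unsigned long avg, unsigned long sample)
{
	return (avg + sample) / 2;
}

/* Capacity left after subtracting the averaged "used" signal. */
unsigned long remaining_capacity(unsigned long max_capacity,
				 unsigned long used)
{
	assert(max_capacity > 0);

	return used >= max_capacity ? 1 : max_capacity - used;
}
```

Feeding a pressure of 512 for three steps gives an average of 448.
Once the cooling state goes back to 0 the average is still 224 after
one step and 56 after three, so the cpu keeps looking smaller than it
is for a while after the cap has been lifted.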

Cheers,
Javi

Re: [RFC PATCH v6 6/9] thermal: cpu_cooling: implement the power cooling device API

2014-12-09 Thread Javi Merino

Hi Viresh,

On Tue, Dec 09, 2014 at 01:59:39AM +, Viresh Kumar wrote:
> On 8 December 2014 at 19:52, Javi Merino  wrote:
> > Ok, changed it into:
> >
> > cpu = cpumask_any(&cpufreq_device->allowed_cpus);
> > dev = get_cpu_device(cpu);
> > if (!dev) {
> > dev_warn(&cpufreq_device->cool_dev->device,
> > "No cpu device for cpu %d\n", cpu);
> > ret = -EINVAL;
> > goto unlock;
> > }
> >
> > num_opps = dev_pm_opp_get_opp_count(dev);
> > if (num_opps <= 0) {
> > ret = (num_opps < 0)? num_opps : -EINVAL;
> > goto unlock;
> > }
> 
> And this might not work. This is what I said in the first reply.
> 
> So, a bit lengthy reply now :)
> 
> Every cpu has a device struct associated with it. When cpufreq
> core initializes drivers, they ask for mapping (initializing) the opps.
> At that point we pass policy->cpu to opp core. OPP core doesn't
> know which cores share clock line (I am trying to solve that [1]) and
> so it just initializes the OPPs for policy->cpu. Let us say it cpuX.
> 
> Now there will be few more CPUs which are going to share clock
> line with it and hence will use the same OPPs. In thermal core,
> you got clip_cpus which is exactly the masks of all these CPUs
> sharing clock line.
> 
> If the OPP layer is good enough, then above code can work. But
> because right now the OPPs are mapped to just cpuX, passing
> any other cpu from clip_cpus will fail as it doesn't have any associated
> OPPs.
> 
> Now what I asked you is to use the CPU for which
> __cpufreq_cooling_register() is called. Normally we are calling
> __cpufreq_cooling_register() for the CPU for which OPPs are
> registered (but people might call it up for other CPUs as well)..

Sorry but I don't follow.  __cpufreq_cooling_register() is passed a
clip_cpus mask, not a single cpu.  How do I get "the cpu for which
__cpufreq_cooling_register() is called" if not by looping through all
the cpus in the mask?
 
> So, using that cpu *might* have worked here.
> 
> Now the earlier loop you used was good to get this information,
> but it wasn't consistent and so I objected.
> 
> What you should do:
> 
> - Create another routine to find the cpu for which OPPs are bound
> to
> -  And save the cpu_dev for it in the global struct for cpu_cooling

This I have done, it wasn't part of the snip that I sent.

> - reuse it wherever required.

Same as above.
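For reference, the lookup being discussed boils down to something
like the following user-space sketch.  The cpumask is modelled as a
plain bitmask and `opp_count' is a hypothetical table standing in for
what dev_pm_opp_get_opp_count() would report for each cpu device; none
of these names come from the patch.

```c
#include <assert.h>

#define NR_CPUS 8

/*
 * Return the first cpu in clip_cpus that has OPPs registered, or -1
 * if none of them does.  Only that cpu's device can be used to query
 * the OPP library.
 */
int find_cpu_with_opps(unsigned int clip_cpus, const int opp_count[NR_CPUS])
{
	int cpu;

	assert(opp_count);

	for (cpu = 0; cpu < NR_CPUS; cpu++) {
		if (!(clip_cpus & (1u << cpu)))
			continue;	/* cpu is not part of the mask */

		if (opp_count[cpu] > 0)
			return cpu;
	}

	return -1;
}
```

The result would be looked up once at registration time and the
corresponding cpu device saved for later use.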


Re: [RFC PATCH v6 6/9] thermal: cpu_cooling: implement the power cooling device API

2014-12-09 Thread Javi Merino

On Tue, Dec 09, 2014 at 10:36:46AM +, Viresh Kumar wrote:
> On 9 December 2014 at 16:02, Javi Merino  wrote:
> > Sorry but I don't follow.  __cpufreq_cooling_register() is passed a
> > clip_cpus mask, not a single cpu.  How do I get "the cpu for which
> > __cpufreq_cooling_register() is called" if not by looping through all
> > the cpus in the mask?
> 
> Yeah, its np that is passed instead of cpu number. So, that wouldn't
> be usable. Also because of the limitations I explained earlier, it makes
> sense to iterate over all clip_cpus and finding which one owns OPPs.

Ok, how about this then?  I've pasted the whole commit so as to avoid
confusion.

diff --git a/Documentation/thermal/cpu-cooling-api.txt b/Documentation/thermal/cpu-cooling-api.txt
index fca24c931ec8..d438a900e374 100644
--- a/Documentation/thermal/cpu-cooling-api.txt
+++ b/Documentation/thermal/cpu-cooling-api.txt
@@ -25,8 +25,150 @@ the user. The registration APIs returns the cooling device pointer.
 
clip_cpus: cpumask of cpus where the frequency constraints will happen.
 
-1.1.2 void cpufreq_cooling_unregister(struct thermal_cooling_device *cdev)
+1.1.2 struct thermal_cooling_device *cpufreq_power_cooling_register(
+const struct cpumask *clip_cpus, u32 capacitance,
+get_static_t plat_static_func)
+
+Similar to cpufreq_cooling_register, this function registers a cpufreq
+cooling device.  Using this function, the cooling device will
+implement the power extensions by using a simple cpu power model.  The
+cpus must have registered their OPPs using the OPP library.
+
+The additional parameters are needed for the power model (See 2. Power
+models).  "capacitance" is the dynamic power coefficient (See 2.1
+Dynamic power).  "plat_static_func" is a function to calculate the
+static power consumed by these cpus (See 2.2 Static power).
+
+1.1.3 struct thermal_cooling_device *of_cpufreq_power_cooling_register(
+struct device_node *np, const struct cpumask *clip_cpus, u32 capacitance,
+get_static_t plat_static_func)
+
+Similar to cpufreq_power_cooling_register, this function registers a
+cpufreq cooling device with power extensions using the device tree
+information supplied by the np parameter.
+
+1.1.4 void cpufreq_cooling_unregister(struct thermal_cooling_device *cdev)
 
 This interface function unregisters the "thermal-cpufreq-%x" cooling device.
 
 cdev: Cooling device pointer which has to be unregistered.
+
+2. Power models
+
+The power API registration functions provide a simple power model for
+CPUs.  The current power is calculated as dynamic + (optionally)
+static power.  This power model requires that the operating-points of
+the CPUs are registered using the kernel's opp library and the
+`cpufreq_frequency_table` is assigned to the `struct device` of the
+cpu.  If you are using the `cpufreq-cpu0.c` driver then the
+`cpufreq_frequency_table` should already be assigned to the cpu
+device.
+
+The `plat_static_func` parameter of `cpufreq_power_cooling_register()`
+and `of_cpufreq_power_cooling_register()` is optional.  If you don't
+provide it, only dynamic power will be considered.
+
+2.1 Dynamic power
+
+The dynamic power consumption of a processor depends on many factors.
+For a given processor implementation the primary factors are:
+
+- The time the processor spends running, consuming dynamic power, as
+  compared to the time in idle states where dynamic consumption is
+  negligible.  Herein we refer to this as 'utilisation'.
+- The voltage and frequency levels as a result of DVFS.  The DVFS
+  level is a dominant factor governing power consumption.
+- In running time the 'execution' behaviour (instruction types, memory
+  access patterns and so forth) causes, in most cases, a second order
+  variation.  In pathological cases this variation can be significant,
+  but typically it is of a much lesser impact than the factors above.
+
+A high level dynamic power consumption model may then be represented as:
+
+Pdyn = f(run) * Voltage^2 * Frequency * Utilisation
+
+f(run) here represents the described execution behaviour and its
+result has units of Watts/Hz/Volt^2 (this is often expressed in
+mW/MHz/uVolt^2)
+
+The detailed behaviour for f(run) could be modelled on-line.  However,
+in practice, such an on-line model has dependencies on a number of
+implementation specific processor support and characterisation
+factors.  Therefore, in initial implementation that contribution is
+represented as a constant coefficient.  This is a simplification
+consistent with the relative contribution to overall power variation.
+
+In this simplified representation our model becomes:
+
+Pdyn = Kd * Voltage^2 * Frequency * Utilisation
+
+Where Kd (capacitance) represents an indicative running time dynamic
+power coefficient in fundamental units of mW/MHz/uVolt^2
+
+2.2 Static power
+
+Static leakage p
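The simplified dynamic power model from section 2.1 of the patch
above (Pdyn = Kd * Voltage^2 * Frequency * Utilisation) can be tried
out with a few lines of C.  This is only a sketch: the function name
is made up, and the unit scaling (frequency in MHz, voltage in mV, a
division by 10^9 to get milliwatts, utilisation as a percentage) is an
assumption chosen to keep the arithmetic in integers.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Dynamic power in mW for a given coefficient, frequency and voltage.
 * Assumed scaling: kd * MHz * mV^2 / 10^9, then weighted by the
 * utilisation in percent.
 */
uint32_t dynamic_power_mw(uint32_t kd, uint32_t freq_mhz,
			  uint32_t voltage_mv, uint32_t util_pct)
{
	uint64_t power;

	assert(util_pct <= 100);

	/* 64-bit intermediate so that kd * f * V^2 cannot overflow here. */
	power = (uint64_t)kd * freq_mhz * voltage_mv * voltage_mv;
	power /= 1000000000ULL;

	return (uint32_t)(power * util_pct / 100);
}
```

With kd = 100, a fully utilised cpu at 1000 MHz and 1000 mV comes out
at 100 mW, while the same cpu at 500 MHz and 800 mV comes out at
32 mW.  The quadratic voltage term is what makes lower OPPs so much
cheaper.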

Re: [RFC PATCH v6 6/9] thermal: cpu_cooling: implement the power cooling device API

2014-12-09 Thread Javi Merino

On Tue, Dec 09, 2014 at 11:06:46AM +, Viresh Kumar wrote:
> On 9 December 2014 at 16:30, Javi Merino  wrote:
> > Ok, how about this then?  I've pasted the whole commit so as to avoid
> > confusion.
> 
> Yeah, the cpu_dev part looks fine now.

Great, thanks!


Re: [RFC PATCH v6 1/9] tracing: Add array printing helpers

2014-12-10 Thread Javi Merino

On Mon, Dec 08, 2014 at 04:04:52PM +, Dave P Martin wrote:
> On Mon, Dec 08, 2014 at 03:42:10PM +, Steven Rostedt wrote:
> > On Fri,  5 Dec 2014 19:04:12 +0000
> > "Javi Merino"  wrote:
> 
> [...]
> 
> > > +
> > > +DEFINE_PRINT_ARRAY(u8, unsigned int, "0x%x");
> > > +DEFINE_PRINT_ARRAY(u16, unsigned int, "0x%x");
> > > +DEFINE_PRINT_ARRAY(u32, unsigned int, "0x%x");
> > > +DEFINE_PRINT_ARRAY(u64, unsigned long long, "0x%llx");
> > > +
> > 
> > I would really like to avoid adding a bunch of macros for each type.
> > Can't we have something like this:
> > ftrace_print_array(struct trace_seq *p, void *buf, int buf_len, int size)
> > {
> > char *prefix = "";
> > void *ptr = buf;
> > 
> > while (ptr < buf + buf_len) {
> > switch(size) {
> > case 8:
> > trace_seq_printf("%s0x%x", prefix,
> > *(unsigned char *)ptr);
> 
> I think this should be *(u8 *) etc.

Done, see below.

> Otherwise, I don't have a problem with this approach.  It's less
> ugly than my original.

It makes the lib traceevent patches uglier though ;)

> > break;
> > case 16:
> > trace_seq_printf("%s0x%x", prefix,
> > *(unsigned short *)ptr);
> > break;
> > case 32:
> > trace_seq_printf("%s0x%x", prefix,
> > *(unsigned int *)ptr);
> > break;
> > case 64:
> > trace_seq_printf("%s0x%llx", prefix,
> > *(unsigned long long *)ptr);
> > break;
> > default:
> > BUG();
> > }
> > prefix = ",";
> > ptr += size;
> > }
> > 
> > }
> > 
> > We probably could even make the "BUG()" into a build bug, with a little
> > work.
> 
> That sounds possible.

The only way I can think of doing that is by moving the check to the
__print_array macro:

#define __print_array(array, count, el_size)				\
	({								\
		BUILD_BUG_ON(el_size != 8 && el_size != 16 &&		\
			     el_size != 32 && el_size != 64);		\
		ftrace_print_array_seq(p, array, count, el_size);	\
	})

Is this what you have in mind?
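To show the single-function approach outside the kernel, here is a
user-space sketch.  print_array() is a hypothetical analogue that
writes into a plain char buffer instead of a trace_seq, and it takes
the element size in bytes (1, 2, 4 or 8) rather than in bits, which is
an assumption of this example and not what the snippets above use.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Print count elements of el_size bytes each as comma separated hex
 * values.  Returns the number of characters written, or -1 if the
 * element size is unsupported or the output buffer is too small.
 */
int print_array(char *out, size_t out_len, const void *buf, size_t count,
		size_t el_size)
{
	const unsigned char *ptr = buf;
	const char *prefix = "";
	size_t used = 0;
	size_t i;

	assert(out != NULL && buf != NULL);

	if (out_len == 0)
		return -1;
	out[0] = '\0';

	for (i = 0; i < count; i++, ptr += el_size) {
		unsigned long long val;
		uint8_t v8;
		uint16_t v16;
		uint32_t v32;
		uint64_t v64;
		int n;

		/* memcpy avoids unaligned accesses through casted pointers. */
		switch (el_size) {
		case 1:
			memcpy(&v8, ptr, sizeof(v8));
			val = v8;
			break;
		case 2:
			memcpy(&v16, ptr, sizeof(v16));
			val = v16;
			break;
		case 4:
			memcpy(&v32, ptr, sizeof(v32));
			val = v32;
			break;
		case 8:
			memcpy(&v64, ptr, sizeof(v64));
			val = v64;
			break;
		default:
			return -1;
		}

		n = snprintf(out + used, out_len - used, "%s0x%llx",
			     prefix, val);
		if (n < 0 || (size_t)n >= out_len - used)
			return -1;

		used += (size_t)n;
		prefix = ",";
	}

	return (int)used;
}
```

Here the unsupported size is only caught at run time.  The
BUILD_BUG_ON() in the macro is what would turn it into a compile time
error when the size is a constant.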

> Javi?

What about this?

diff --git a/include/linux/ftrace_event.h b/include/linux/ftrace_event.h
index 28672e87e910..d5bddb230ecd 100644
--- a/include/linux/ftrace_event.h
+++ b/include/linux/ftrace_event.h
@@ -44,6 +44,9 @@ const char *ftrace_print_bitmask_seq(struct trace_seq *p, void *bitmask_ptr,
 const char *ftrace_print_hex_seq(struct trace_seq *p,
 const unsigned char *buf, int len);
 
+const char *ftrace_print_array_seq(struct trace_seq *p,
+   const void *buf, int buf_len, size_t el_size);
+
 struct trace_iterator;
 struct trace_event;
 
diff --git a/include/trace/ftrace.h b/include/trace/ftrace.h
index 26b4f2e13275..38c5f91f63da 100644
--- a/include/trace/ftrace.h
+++ b/include/trace/ftrace.h
@@ -263,6 +263,10 @@
 #undef __print_hex
 #define __print_hex(buf, buf_len) ftrace_print_hex_seq(p, buf, buf_len)
 
+#undef __print_array
+#define __print_array(array, count, el_size)   \
+   ftrace_print_array_seq(p, array, count, el_size)
+
 #undef DECLARE_EVENT_CLASS
 #define DECLARE_EVENT_CLASS(call, proto, args, tstruct, assign, print) \
 static notrace enum print_line_t   \
@@ -676,6 +680,7 @@ static inline void ftrace_test_probe_##call(void) \
 #undef __get_dynamic_array_len
 #undef __get_str
 #undef __get_bitmask
+#undef __print_array
 
 #undef TP_printk
 #define TP_printk(fmt, args...) "\"" fmt "\", "  __stringify(args)
diff --git a/kernel/trace/trace_output.c b/kernel/trace/trace_output.c
index c6977d5a9b12..b582261086e8 100644
--- a/kernel/trace/trace_output.c
+++ b/kernel/trace/trace_output.c
@@ -186,6 +186,48 @@ ftrace_print_hex_seq(struct trace_seq *p, const unsigned char *buf, int buf_len)
 }
 EXPORT_SYMBOL(ftrace_print_hex_seq);
 
+const char *
+ftrace_print_array_seq(struct trace_seq *p, const void *buf, int buf_len,
+  size_t el_size)
+{
+   const char *ret = trace_seq_buffer_ptr(p);
+   const char *prefix = "";
+   void *ptr = (void *)buf;
+
+

Re: [RFC PATCH v6 5/9] thermal: extend the cooling device API to include power information

2015-01-05 Thread Javi Merino

On Tue, Dec 23, 2014 at 03:14:11PM +, Eduardo Valentin wrote:
> Hi Javi
> 
> On Fri, Dec 05, 2014 at 07:04:16PM +0000, Javi Merino wrote:
> > Add three optional callbacks to the cooling device interface to allow
> > them to express power.  In addition to the callbacks, add helpers to
> > identify cooling devices that implement the power cooling device API.
> > 
> > Cc: Zhang Rui 
> > Cc: Eduardo Valentin 
> > Signed-off-by: Javi Merino 
> > ---
> >  Documentation/thermal/power_allocator.txt | 27 ++
> >  drivers/thermal/thermal_core.c            | 38 +++
> >  include/linux/thermal.h                   | 12 ++
> >  3 files changed, 77 insertions(+)
> >  create mode 100644 Documentation/thermal/power_allocator.txt
> > 
> > diff --git a/Documentation/thermal/power_allocator.txt b/Documentation/thermal/power_allocator.txt
> > new file mode 100644
> > index ..d3bb79050c27
> > --- /dev/null
> > +++ b/Documentation/thermal/power_allocator.txt
> > @@ -0,0 +1,27 @@
> > +Cooling device power API
> > +
> 
> Readers of this file need extra context here, IMO.

Patch 7 adds text before and after this section that provides that
context.

> > +
> > +Cooling devices controlled by this governor must supply the additional
> 
> What governor? the files says power allocator, and the title says,
> cooling device power API...

Correct, because that's added in the patch that introduces the power
allocator governor.  Therefore, it's not a problem for the readers of
this file but for the readers of the patches.  I can move this hunk to
patch 7 and introduce all the documentation at once if you think
that's clearer.

> > +"power" API in their `cooling_device_ops`.  It consists of three ops:
> > +
> 
> 
> 
> > +1. u32 get_actual_power(struct thermal_cooling_device *cdev);
> > +@cdev: The `struct thermal_cooling_device` pointer
> > +
> > +`get_actual_power()` returns the power currently consumed by the
> > +device in milliwatts.
> > +
> > +2. u32 state2power(struct thermal_cooling_device *cdev, unsigned long
> > +state);
> > +@cdev: The `struct thermal_cooling_device` pointer
> > +@state: A cooling device state
> > +
> > +Convert cooling device state @state into power consumption in
> > +milliwatts.
> > +
> > +3. unsigned long power2state(struct thermal_cooling_device *cdev,
> > +u32 power);
> > +@cdev: The `struct thermal_cooling_device` pointer
> > +@power: power in milliwatts
> > +
> > +Calculate a cooling device state that would make the device consume at
> > +most @power mW.
> 
> I believe it would be more helpful if you could provide extra context in
> which the above functions are called, and for what.

Ok, will do.
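In the meantime, a simplified sketch of how the three ops relate to
each other may help.  It is not the patch's implementation: the cdev
argument is dropped, the power table and its values are invented for
illustration, and state 0 is assumed to mean "not throttled".

```c
#include <assert.h>
#include <stdint.h>

#define NUM_STATES 4

/* Hypothetical power table in mW, indexed by cooling state. */
static const uint32_t power_table_mw[NUM_STATES] = { 2000, 1400, 900, 500 };

/* Power the device may consume when set to @state. */
uint32_t state2power(unsigned long state)
{
	assert(state < NUM_STATES);

	return power_table_mw[state];
}

/*
 * Shallowest cooling state whose power fits in @power_mw.  If the
 * budget is below every entry, fall back to the deepest state.
 */
unsigned long power2state(uint32_t power_mw)
{
	unsigned long state;

	for (state = 0; state < NUM_STATES - 1; state++)
		if (power_table_mw[state] <= power_mw)
			break;

	return state;
}

/* Maximum power is what the device consumes when it is not throttled. */
uint32_t get_max_power(void)
{
	return state2power(0);
}
```

A governor would then grant a power budget to each device, use
power2state() to turn the budget into a cooling state and apply that
state.  For instance, a budget of 1000 mW maps to state 2 in this
table, which consumes at most 900 mW.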

> > diff --git a/drivers/thermal/thermal_core.c b/drivers/thermal/thermal_core.c
> > index 9021cb72a13a..c490f262ea7f 100644
> > --- a/drivers/thermal/thermal_core.c
> > +++ b/drivers/thermal/thermal_core.c
> > @@ -866,6 +866,44 @@ emul_temp_store(struct device *dev, struct device_attribute *attr,
> >  static DEVICE_ATTR(emul_temp, S_IWUSR, NULL, emul_temp_store);
> >  #endif/*CONFIG_THERMAL_EMULATION*/
> >  
> > +/**
> > + * power_actor_get_max_power() - get the maximum power that a cdev can consume
> > + * @cdev:  pointer to &thermal_cooling_device
> > + *
> > + * Calculate the maximum power consumption in milliwatts that the
> > + * cooling device can currently consume.  If @cdev doesn't support the
> > + * power_actor API, this function returns 0.
> > + */
> > +u32 power_actor_get_max_power(struct thermal_cooling_device *cdev)
> > +{
> > +   if (!cdev_is_power_actor(cdev))
> > +   return 0;
> > +
> > +   return cdev->ops->state2power(cdev, 0);
> > +}
> > +
> > +/**
> > + * power_actor_set_power() - limit the maximum power that a cooling device can consume
> > + * @cdev:  pointer to &thermal_cooling_device
> > + * @power: the power in milliwatts
> > + *
> > + * Set the cooling device to consume at most @power milliwatts.
> > + *
> > + * Returns: 0 on success, -EINVAL if the cooling device does not
> > + * implement the power actor API or -E* for other failures.
> > + */
> > +int power_actor_set_power(struct thermal_cooling_device *cdev, u32 power)
> > +{
> > +   unsigned long state;
> > +
> > +   if (!cdev_is_power_actor(cdev))
> > +

Re: [RFC PATCH v6 6/9] thermal: cpu_cooling: implement the power cooling device API

2015-01-05 Thread Javi Merino

On Fri, Jan 02, 2015 at 02:37:23PM +, Eduardo Valentin wrote:
> Content-Type: text/plain; charset=us-ascii
> Content-Disposition: inline
> Content-Transfer-Encoding: quoted-printable
> 
> Hello Javi,
> 
> Looks like the charset seems to be scrambled. Anyways, I will attempt to
> send some feedback here..

Yes, some SMTP servers here are known to do that and I was using the
wrong one.  Sorry for that, it should not happen again.

> On Tue, Dec 09, 2014 at 11:00:43AM +, Javi Merino wrote:
> > On Tue, Dec 09, 2014 at 10:36:46AM +, Viresh Kumar wrote:
> > > On 9 December 2014 at 16:02, Javi Merino  wrote:
> > > > Sorry but I don't follow.  __cpufreq_cooling_register() is passed a
> > > > clip_cpus mask, not a single cpu.  How do I get "the cpu for which
> > > > __cpufreq_cooling_register() is called" if not by looping through all
> > > > the cpus in the mask?
> > >
> > > Yeah, its np that is passed instead of cpu number. So, that wouldn't
> > > be usable. Also because of the limitations I explained earlier, it makes
> > > sense to iterate over all clip_cpus and finding which one owns OPPs.
> >
> > Ok, how about this then?  I've pasted the whole commit so as to avoid
> > confusion.
> 
> I should consider this one as V7 of this patch, probably..
> 
> >
> > diff --git a/Documentation/thermal/cpu-cooling-api.txt b/Documentation/thermal/cpu-cooling-api.txt
> > index fca24c931ec8..d438a900e374 100644
> > --- a/Documentation/thermal/cpu-cooling-api.txt
> > +++ b/Documentation/thermal/cpu-cooling-api.txt
> > @@ -25,8 +25,150 @@ the user. The registration APIs returns the cooling device pointer.
> >
> > clip_cpus: cpumask of cpus where the frequency constraints will happen.
> >
> > -1.1.2 void cpufreq_cooling_unregister(struct thermal_cooling_device *cdev)
> > +1.1.2 struct thermal_cooling_device *cpufreq_power_cooling_register(
> > +const struct cpumask *clip_cpus, u32 capacitance,
> > +get_static_t plat_static_func)
> > +
> > +Similar to cpufreq_cooling_register, this function registers a cpufreq
> > +cooling device.  Using this function, the cooling device will
> > +implement the power extensions by using a simple cpu power model.  The
> > +cpus must have registered their OPPs using the OPP library.
> > +
> > +The additional parameters are needed for the power model (See 2. Power
> > +models).  "capacitance" is the dynamic power coefficient (See 2.1
> > +Dynamic power).  "plat_static_func" is a function to calculate the
> > +static power consumed by these cpus (See 2.2 Static power).
> > +
> > +1.1.3 struct thermal_cooling_device *of_cpufreq_power_cooling_register(
> > +struct device_node *np, const struct cpumask *clip_cpus, u32 capacitance,
> > +get_static_t plat_static_func)
> > +
> > +Similar to cpufreq_power_cooling_register, this function registers a
> > +cpufreq cooling device with power extensions using the device tree
> > +information supplied by the np parameter.
> > +
> > +1.1.4 void cpufreq_cooling_unregister(struct thermal_cooling_device *cdev)
> >
> >  This interface function unregisters the "thermal-cpufreq-%x" cooling device.
> >
> >  cdev: Cooling device pointer which has to be unregistered.
> > +
> > +2. Power models
> > +
> > +The power API registration functions provide a simple power model for
> > +CPUs.  The current power is calculated as dynamic + (optionally)
> > +static power.  This power model requires that the operating-points of
> > +the CPUs are registered using the kernel's opp library and the
> > +`cpufreq_frequency_table` is assigned to the `struct device` of the
> > +cpu.  If you are using the `cpufreq-cpu0.c` driver then the
> 
> cpufreq-cpu0.c is the old version of cpufreq-dt.c, right? I would
> suggest using CONFIG_* names instead of file names though.

Ok.

> > +`cpufreq_frequency_table` should already be assigned to the cpu
> > +device.
> > +
> > +The `plat_static_func` parameter of `cpufreq_power_cooling_register()`
> > +and `of_cpufreq_power_cooling_register()` is optional.  If you don't
> > +provide it, only dynamic power will be considered.
> > +
> > +2.1 Dynamic power
> > +
> > +The dynamic power consumption of a processor depends on many factors.
> > +For a given processor implementation the primary factors are:
> > +
> > +- The time the processor spends running, consuming dynamic p

Re: [RFC PATCH v6 9/9] of: thermal: Introduce sustainable power for a thermal zone

2015-01-06 Thread Javi Merino

On Fri, Jan 02, 2015 at 03:53:00PM +, Eduardo Valentin wrote:
> On Fri, Dec 05, 2014 at 07:04:20PM +0000, Javi Merino wrote:
> > From: Punit Agrawal 
> > 
> > Introduce an optional property called, sustainable-power, which
> > represents the power (in mW) which the thermal zone can safely
> > dissipate.
> > 
> > If provided the property is parsed and associated with the thermal
> > zone via the thermal zone parameters.
> > 
> > Cc: Zhang Rui 
> > Cc: Eduardo Valentin 
> > Signed-off-by: Punit Agrawal 
> > ---
> >  Documentation/devicetree/bindings/thermal/thermal.txt | 4 
> >  drivers/thermal/of-thermal.c  | 4 
> >  2 files changed, 8 insertions(+)
> > 
> > diff --git a/Documentation/devicetree/bindings/thermal/thermal.txt b/Documentation/devicetree/bindings/thermal/thermal.txt
> > index f5db6b72a36f..c6eb9a8d2aed 100644
> > --- a/Documentation/devicetree/bindings/thermal/thermal.txt
> > +++ b/Documentation/devicetree/bindings/thermal/thermal.txt
> > @@ -167,6 +167,10 @@ Optional property:
> > by means of sensor ID. Additional coefficients are
> > interpreted as constant offset.
> >  
> > +- sustainable-power:   An estimate of the sustainable power (in mW) that the
> > +  Type: unsigned       thermal zone can dissipate.
> > +  Size: one cell
> > +
> 
> Please, include examples of this property, as you mentioned in the
> governor documentation.

I'd rather put a pointer to the documentation instead of repeating the
same thing here.  What do you think?

Re: [RFC PATCH v6 5/9] thermal: extend the cooling device API to include power information

2015-01-06 Thread Javi Merino

Hi Eduardo,

On Mon, Jan 05, 2015 at 09:04:09PM +0000, Eduardo Valentin wrote:
> On Mon, Jan 05, 2015 at 03:37:10PM +0000, Javi Merino wrote:
> > On Tue, Dec 23, 2014 at 03:14:11PM +0000, Eduardo Valentin wrote:
> > > Hi Javi
> > > 
> > > On Fri, Dec 05, 2014 at 07:04:16PM +0000, Javi Merino wrote:
> > > > Add three optional callbacks to the cooling device interface to allow
> > > > them to express power.  In addition to the callbacks, add helpers to
> > > > identify cooling devices that implement the power cooling device API.
> > > > 
> > > > Cc: Zhang Rui 
> > > > Cc: Eduardo Valentin 
> > > > Signed-off-by: Javi Merino 
> > > > ---
> > > >  Documentation/thermal/power_allocator.txt | 27 ++
> > > >  drivers/thermal/thermal_core.c| 38 
> > > > +++
> > > >  include/linux/thermal.h   | 12 ++
> > > >  3 files changed, 77 insertions(+)
> > > >  create mode 100644 Documentation/thermal/power_allocator.txt
> > > > 
> > > > diff --git a/Documentation/thermal/power_allocator.txt 
> > > > b/Documentation/thermal/power_allocator.txt
> > > > new file mode 100644
> > > > index ..d3bb79050c27
> > > > --- /dev/null
> > > > +++ b/Documentation/thermal/power_allocator.txt
> > > > @@ -0,0 +1,27 @@
> > > > +Cooling device power API
> > > > +
> > > 
> > > Readers of this file need extra context here, IMO.
> > 
> > Patch 7 adds text before and after this section that provides that
> > context.
> > 
> > > > +
> > > > +Cooling devices controlled by this governor must supply the additional
> > > 
> > > What governor? the files says power allocator, and the title says,
> > > cooling device power API...
> > 
> > Correct, because that's added in the patch that introduces the power
> > allocator governor.  Therefore, it's not a problem for the readers of
> > this file but for the readers of the patches.  I can move this hunk to
> > patch 7 and introduce all the documentation at once if you think
> > that's clearer.
> > 
> 
> Thinking of the atomicity of each patch/commit, I would prefer you to
> move all documentation to a single patch then.

Ok, I'll move it to the patch that introduces the power allocator.

[...]
> > > > diff --git a/drivers/thermal/thermal_core.c 
> > > > b/drivers/thermal/thermal_core.c
> > > > index 9021cb72a13a..c490f262ea7f 100644
> > > > --- a/drivers/thermal/thermal_core.c
> > > > +++ b/drivers/thermal/thermal_core.c
> > > > @@ -866,6 +866,44 @@ emul_temp_store(struct device *dev, struct 
> > > > device_attribute *attr,
> > > >  static DEVICE_ATTR(emul_temp, S_IWUSR, NULL, emul_temp_store);
> > > >  #endif/*CONFIG_THERMAL_EMULATION*/
> > > >  
> > > > +/**
> > > > + * power_actor_get_max_power() - get the maximum power that a cdev can 
> > > > consume
> > > > + * @cdev:  pointer to &thermal_cooling_device
> > > > + *
> > > > + * Calculate the maximum power consumption in milliwatts that the
> > > > + * cooling device can currently consume.  If @cdev doesn't support the
> > > > + * power_actor API, this function returns 0.
> > > > + */
> > > > +u32 power_actor_get_max_power(struct thermal_cooling_device *cdev)
> > > > +{
> > > > +   if (!cdev_is_power_actor(cdev))
> > > > +   return 0;
> > > > +
> > > > +   return cdev->ops->state2power(cdev, 0);
> > > > +}
> > > > +
> > > > +/**
> > > > + * power_actor_set_power() - limit the maximum power that a cooling 
> > > > device can consume
> > > > + * @cdev:  pointer to &thermal_cooling_device
> > > > + * @power: the power in milliwatts
> > > > + *
> > > > + * Set the cooling device to consume at most @power milliwatts.
> > > > + *
> > > > + * Returns: 0 on success, -EINVAL if the cooling device does not
> > > > + * implement the power actor API or -E* for other failures.
> > > > + */
> > > > +int power_actor_set_power(struct thermal_cooling_device *cdev, u32 
> > > > power)
> > > > +{
> > > > +   unsigned long state;
&

Re: [RFC PATCH v6 6/9] thermal: cpu_cooling: implement the power cooling device API

2015-01-06 Thread Javi Merino

Hi Eduardo,

On Mon, Jan 05, 2015 at 08:44:53PM +0000, Eduardo Valentin wrote:
> On Mon, Jan 05, 2015 at 04:53:40PM +0000, Javi Merino wrote:
> > On Fri, Jan 02, 2015 at 02:37:23PM +0000, Eduardo Valentin wrote:
> > > On Tue, Dec 09, 2014 at 11:00:43AM +0000, Javi Merino wrote:
> > > > On Tue, Dec 09, 2014 at 10:36:46AM +0000, Viresh Kumar wrote:
> > > > > On 9 December 2014 at 16:02, Javi Merino  wrote:
> > > > diff --git a/Documentation/thermal/cpu-cooling-api.txt 
> > > > b/Documentation/thermal/cpu-cooling-api.txt
> > > > index fca24c931ec8..d438a900e374 100644
> > > > --- a/Documentation/thermal/cpu-cooling-api.txt
> > > > +++ b/Documentation/thermal/cpu-cooling-api.txt
[...]
> > > > +
> > > > +In this simplified representation our model becomes:
> > > > +
> > > > +Pdyn = Kd * Voltage^2 * Frequency * Utilisation
> > > > +
> > > > +Where Kd (capacitance) represents an indicative running time dynamic
> > > > +power coefficient in fundamental units of mW/MHz/uVolt^2
> > > > +
> > > 
> > > Do we have Kd (capacitance) reference values for ARM processors? Is it
> > > worth adding a few of them as an example table here?
> > 
> > The reference numbers correspond not only to a particular processor
> > (e.g. Cortex-A15) but to specific SoCs, as the implementation
> > technology plays a key role in this.  I'll see if we can share some
> > reference values for specific SoCs.
> 
> It does not need to be an extensive / exhaustive list. A small set of
> examples should do it.
> 
> > > Where does one find Kd values?
> > > 
> > > Just looking for pointers for platform driver writers (potential users
> > > of these APIs).
> > 
> > I understand your concern.  I'm afraid the best I can say here is "ask
> > the SoC vendor".
> 
> OK. Adding the above hint + a small set of examples should do it.

I'll do that then.
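
To make the simplified dynamic power model quoted above concrete, here is a minimal userspace sketch.  It is not the cpu_cooling code: the function name, the use of floating point, and the units (Kd in mW/MHz/V^2 rather than the mW/MHz/uVolt^2 mentioned in the documentation) are assumptions chosen for readability, and the coefficient used in the usage note is made up, not a reference value for any SoC.

```c
#include <assert.h>

/*
 * Simplified dynamic power model from the quoted documentation:
 *
 *   Pdyn = Kd * Voltage^2 * Frequency * Utilisation
 *
 * Units are an assumption of this sketch: kd in mW/MHz/V^2, voltage in
 * volts, frequency in MHz and utilisation as a fraction in [0, 1].
 */
static double dynamic_power_mw(double kd, double volts, double freq_mhz,
			       double utilisation)
{
	/* Utilisation is the fraction of time spent running. */
	assert(utilisation >= 0.0 && utilisation <= 1.0);

	return kd * volts * volts * freq_mhz * utilisation;
}
```

With a hypothetical Kd of 0.5, a CPU at 1.0 V and 1000 MHz that is busy half of the time would be estimated at 250 mW, and doubling the voltage quadruples the estimate because of the Voltage^2 term.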

> > > > +2.2 Static power
> > > > +
> > > > +Static leakage power consumption depends on a number of factors.  For a
> > > > +given circuit implementation the primary factors are:
> > > > +
> > > > +- Time the circuit spends in each 'power state'
> > > > +- Temperature
> > > > +- Operating voltage
> > > > +- Process grade
> > > > +
> > > > +The time the circuit spends in each 'power state' for a given
> > > > +evaluation period at first order means OFF or ON.  However,
> > > > +'retention' states can also be supported that reduce power during
> > > > +inactive periods without loss of context.
> > > > +
> > > > +Note: The visibility of state entries to the OS can vary, according to
> > > > +platform specifics, and this can then impact the accuracy of a model
> > > > +based on OS state information alone.  It might be possible in some
> > > > +cases to extract more accurate information from system resources.
> > > > +
> > > > +The temperature, operating voltage and process 'grade' (slow to fast)
> > > > +of the circuit are all significant factors in static leakage power
> > > > +consumption.  All of these have complex relationships to static power.
> > > > +
> > > > +Circuit implementation specific factors include the chosen silicon
> > > > +process as well as the type, number and size of transistors in both
> > > > +the logic gates and any RAM elements included.
> > > > +
> > > > +The static power consumption modelling must take into account the
> > > > +power managed regions that are implemented.  Taking the example of an
> > > > +ARM processor cluster, the modelling would take into account whether
> > > > +each CPU can be powered OFF separately or if only a single power
> > > > +region is implemented for the complete cluster.
> > > > +
> > > > +In one view, there are others, a static power consumption model can
> > > > +then start from a set of reference values for each power managed
> > > > +region (e.g. CPU, Cluster/L2) in each state (e.g. ON, OFF) at an
> > > > +arbitrary process grade, voltage and temperature point.  These values
> > > > +are then scaled for all of the following: the time in each state, the
> > > > +process grade, the current temperature and the operating voltage.
> > > > +However, since both

Re: [RFC PATCH v6 7/9] thermal: introduce the Power Allocator governor

2015-01-06 Thread Javi Merino

Hi Eduardo,

On Fri, Jan 02, 2015 at 03:46:24PM +0000, Eduardo Valentin wrote:
> On Fri, Dec 05, 2014 at 07:04:18PM +0000, Javi Merino wrote:
> > The power allocator governor is a thermal governor that controls system
> > and device power allocation to control temperature.  Conceptually, the
> > implementation divides the sustainable power of a thermal zone among
> > all the heat sources in that zone.
> > 
> > This governor relies on "power actors", entities that represent heat
> > sources.  They can report current and maximum power consumption and
> > can set a given maximum power consumption, usually via a cooling
> > device.
> > 
> > The governor uses a Proportional Integral Derivative (PID) controller
> > driven by the temperature of the thermal zone.  The output of the
> > controller is a power budget that is then allocated to each power
> > actor that can have bearing on the temperature we are trying to
> > control.  It decides how much power to give each cooling device based
> > on the performance they are requesting.  The PID controller ensures
> > that the total power budget does not exceed the control temperature.
> > 
> > Cc: Zhang Rui 
> > Cc: Eduardo Valentin 
> > Signed-off-by: Punit Agrawal 
> > Signed-off-by: Javi Merino 
> > ---
> >  Documentation/thermal/power_allocator.txt | 196 
> >  drivers/thermal/Kconfig   |  15 +
> >  drivers/thermal/Makefile  |   1 +
> >  drivers/thermal/power_allocator.c | 511 
> > ++
> >  drivers/thermal/thermal_core.c|   7 +-
> >  drivers/thermal/thermal_core.h|   8 +
> >  include/linux/thermal.h   |  40 ++-
> >  7 files changed, 774 insertions(+), 4 deletions(-)
> >  create mode 100644 drivers/thermal/power_allocator.c
> > 
> > diff --git a/Documentation/thermal/power_allocator.txt 
> > b/Documentation/thermal/power_allocator.txt
> > index d3bb79050c27..23b684afdc75 100644
> > --- a/Documentation/thermal/power_allocator.txt
> > +++ b/Documentation/thermal/power_allocator.txt
> > @@ -1,3 +1,172 @@
> > +Power allocator governor tunables
> > +=================================
> > +
> > +Trip points
> > +-----------
> > +
> > +The governor requires the following two passive trip points:
> > +
> > +1.  "switch on" trip point: temperature above which the governor
> > +control loop starts operating.
> > +2.  "desired temperature" trip point: it should be higher than the
> > +"switch on" trip point. It is the target temperature the governor
> > +is controlling for.
> > +
> > +PID Controller
> > +--------------
> > +
> > +The power allocator governor implements a
> > +Proportional-Integral-Derivative controller (PID controller) with
> > +temperature as the control input and power as the controlled output:
> > +
> > +P_max = k_p * e + k_i * err_integral + k_d * diff_err + sustainable_power
> > +
> > +where
> > +e = desired_temperature - current_temperature
> > +err_integral is the sum of previous errors
> > +diff_err = e - previous_error
> > +
> > +It is similar to the one depicted below:
> > +
> > +[ASCII block diagram of the PID controller; its layout was lost and the
> > +quote is truncated in this archive.  Labels that survive: current_temp,
> > +diff_err, sum e, k_d, k_i, tdp, actor, get_actual_power(), power allocation.]

Re: [RFC PATCH v6 9/9] of: thermal: Introduce sustainable power for a thermal zone

2015-01-06 Thread Javi Merino

On Tue, Jan 06, 2015 at 01:13:03PM +0000, Eduardo Valentin wrote:
> On Tue, Jan 06, 2015 at 09:42:15AM +0000, Javi Merino wrote:
> > On Fri, Jan 02, 2015 at 03:53:00PM +0000, Eduardo Valentin wrote:
> > > On Fri, Dec 05, 2014 at 07:04:20PM +0000, Javi Merino wrote:
> > > > From: Punit Agrawal 
> > > > 
> > > > Introduce an optional property called sustainable-power, which
> > > > represents the power (in mW) which the thermal zone can safely
> > > > dissipate.
> > > > 
> > > > If provided the property is parsed and associated with the thermal
> > > > zone via the thermal zone parameters.
> > > > 
> > > > Cc: Zhang Rui 
> > > > Cc: Eduardo Valentin 
> > > > Signed-off-by: Punit Agrawal 
> > > > ---
> > > >  Documentation/devicetree/bindings/thermal/thermal.txt | 4 
> > > >  drivers/thermal/of-thermal.c  | 4 
> > > >  2 files changed, 8 insertions(+)
> > > > 
> > > > diff --git a/Documentation/devicetree/bindings/thermal/thermal.txt 
> > > > b/Documentation/devicetree/bindings/thermal/thermal.txt
> > > > index f5db6b72a36f..c6eb9a8d2aed 100644
> > > > --- a/Documentation/devicetree/bindings/thermal/thermal.txt
> > > > +++ b/Documentation/devicetree/bindings/thermal/thermal.txt
> > > > @@ -167,6 +167,10 @@ Optional property:
> > > > by means of sensor ID. Additional coefficients 
> > > > are
> > > > interpreted as constant offset.
> > > >  
> > > > +- sustainable-power:   An estimate of the sustainable power (in mW) that the
> > > > +  Type: unsigned   thermal zone can dissipate.
> > > > +  Size: one cell
> > > > +
> > > 
> > > Please, include examples of this property, as you mentioned in the
> > > governor documentation.
> > 
> > I'd rather put a pointer to the documentation instead of repeating the
> > same thing here.  What do you think?
> 
> The point is that device tree and Linux are supposed to be independent
> entities. I would prefer if you could extend the explanation of what is
> 'sustainable power' in the above entry. On top of that, pick one of the
> existing examples and extend it to include the 'sustainable-power' property,
> with a comment explaining it, for instance.

Ok, I'll do that.

Cheers,
Javi

Re: [RFC PATCH v6 7/9] thermal: introduce the Power Allocator governor

2015-01-06 Thread Javi Merino

Hi Eduardo,

On Tue, Jan 06, 2015 at 02:18:36PM +0000, Eduardo Valentin wrote:
> On Tue, Jan 06, 2015 at 01:23:42PM +0000, Javi Merino wrote:
> > On Fri, Jan 02, 2015 at 03:46:24PM +0000, Eduardo Valentin wrote:
> > > On Fri, Dec 05, 2014 at 07:04:18PM +0000, Javi Merino wrote:
> > > > diff --git a/drivers/thermal/power_allocator.c 
> > > > b/drivers/thermal/power_allocator.c
> > > > new file mode 100644
> > > > index ..09e98991efbb
> > > > --- /dev/null
> > > > +++ b/drivers/thermal/power_allocator.c
> > > > @@ -0,0 +1,511 @@
> > > > +/*
> > > > + * A power allocator to manage temperature
> > > > + *
> > > > + * Copyright (C) 2014 ARM Ltd.
> > > > + *
> > > > + * This program is free software; you can redistribute it and/or modify
> > > > + * it under the terms of the GNU General Public License version 2 as
> > > > + * published by the Free Software Foundation.
> > > > + *
> > > > + * This program is distributed "as is" WITHOUT ANY WARRANTY of any
> > > > + * kind, whether express or implied; without even the implied warranty
> > > > + * of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
> > > > + * GNU General Public License for more details.
> > > > + */
> > > > +
> > > > +#define pr_fmt(fmt) "Power allocator: " fmt
> > > > +
> > > > +#include 
> > > > +#include 
> > > > +#include 
> > > > +
> > > > +#include "thermal_core.h"
> > > > +
> > > > +#define FRAC_BITS 10
> > > > +#define int_to_frac(x) ((x) << FRAC_BITS)
> > > > +#define frac_to_int(x) ((x) >> FRAC_BITS)
> > > > +
> > > > +/**
> > > > + * mul_frac() - multiply two fixed-point numbers
> > > > + * @x: first multiplicand
> > > > + * @y: second multiplicand
> > > > + *
> > > 
> > > If it is a kernel doc, needs a description.
> > 
> > Other parts of the kernel are more liberal in this regard,
> > especially for trivial functions like this.  Also, kernel-doc generates
> > the documentation just fine:
> > 
> > $ scripts/kernel-doc -function mul_frac drivers/thermal/power_allocator.c | 
> > nroff -man
> > mul_frac(9) Kernel Hacker's Manual 
> > mul_frac(9)
> > 
> > 
> > 
> > NAME
> >mul_frac - multiply two fixed-point numbers
> > 
> > SYNOPSIS
> >s64 mul_frac (s64 x, s64 y);
> > 
> > ARGUMENTS
> >x   first multiplicand
> > 
> >y   second multiplicand
> > 
> > RETURN
> >the  result of multiplying two fixed-point numbers.  The result is 
> > also
> >a fixed-point number.
> > 
> > 
> > 
> > January 2015   mul_frac
> > mul_frac(9)
> > 
> > 
> > I'll add the long description if you want to, but this is not a
> > warning.
> > 
> 
> 
> As long as there are no kerneldoc warnings/errors, I am fine taking it. I
> must confess I haven't run the kerneldoc script on your patch as I got it
> with encoding scrambled, so I was just pointing out the missing entries.

Ok, I'll make sure kernel-doc doesn't scream.
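
For readers following the mul_frac() discussion, the fixed-point helpers quoted in this thread can be tried out in userspace with a sketch like the following.  The macros and the function body are copied from the quoted patch; using int64_t from <stdint.h> in place of the kernel's s64 is an assumption made so that it builds outside the kernel.

```c
#include <assert.h>
#include <stdint.h>

/* Fixed-point format with 10 fractional bits, as in the quoted patch. */
#define FRAC_BITS 10
#define int_to_frac(x) ((x) << FRAC_BITS)
#define frac_to_int(x) ((x) >> FRAC_BITS)

/*
 * Multiply two fixed-point numbers.  The raw product carries
 * 2 * FRAC_BITS fractional bits, so shift right once to get back to
 * FRAC_BITS.  int64_t stands in for the kernel's s64 here.
 */
static inline int64_t mul_frac(int64_t x, int64_t y)
{
	return (x * y) >> FRAC_BITS;
}
```

For instance, int_to_frac(3) is 3072 and int_to_frac(4) is 4096; their mul_frac() product is 12288, which frac_to_int() turns back into 12.  The raw value 1536 represents 1.5, and multiplying it by int_to_frac(2) gives int_to_frac(3).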

> > > > + * Return: the result of multiplying two fixed-point numbers.  The
> > > > + * result is also a fixed-point number.
> > > > + */
> > > > +static inline s64 mul_frac(s64 x, s64 y)
> > > > +{
> > > > +   return (x * y) >> FRAC_BITS;
> > > > +}
> > > > +
> > > > +enum power_allocator_trip_levels {
> > > > +   TRIP_SWITCH_ON = 0, /* Switch on PID controller */
> > > > +   TRIP_MAX_DESIRED_TEMPERATURE, /* Temperature we are controlling 
> > > > for */
> > > > +
> > > > +   THERMAL_TRIP_NUM,
> > > > +};
> > > > +
> > > > +/**
> > > > + * struct power_allocator_params - parameters for the power allocator 
> > > > governor
> > > > + * @k_po:  Proportional parameter of the PID controller when 
> > > > overshooting
> > > > + * (i.e., when temperature is below the target)
> > > > + * @k_pu:  Proportional parameter of the PID controller when 
> > > > undershooting
> > > &g

[RFC PATCH v6 0/9] The power allocator thermal governor

2014-12-05 Thread Javi Merino

Hi linux-pm,

The power allocator governor allocates device power to control
temperature.  This requires transforming performance requests into
requested power, which we do with an extended cooling device API
introduced in patch 5 (thermal: extend the cooling device API to
include power information).  Patch 6 (thermal: cpu_cooling: implement
the power cooling device API) extends the cpu cooling device using a
simple power model.  The division of power between the cooling devices
ensures that power is allocated where it is needed the most, based on
the current workload.
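
As a rough illustration of that last point, the sketch below splits a power budget among cooling devices in proportion to the power each one is requesting.  This is a simplified userspace model under stated assumptions, not the code from patch 7: the function name, the plain proportional split and the truncating integer division are all assumptions of the sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Split budget_mw among n devices in proportion to their requested
 * power.  A device asking for more power (i.e. running a heavier
 * workload) receives a larger share of the budget.
 */
static void divide_power(const uint32_t *req_mw, size_t n, uint32_t budget_mw,
			 uint32_t *granted_mw)
{
	uint64_t total_req = 0;
	size_t i;

	assert(req_mw && granted_mw);

	for (i = 0; i < n; i++)
		total_req += req_mw[i];

	for (i = 0; i < n; i++) {
		if (!total_req) {
			/* Nobody is requesting power: grant nothing. */
			granted_mw[i] = 0;
			continue;
		}
		/* 64-bit intermediate so the product cannot overflow. */
		granted_mw[i] = (uint32_t)((uint64_t)req_mw[i] * budget_mw /
					   total_req);
	}
}
```

With requests of 3000 mW and 1000 mW and a budget of 2000 mW, the two devices would be granted 1500 mW and 500 mW respectively.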

Patches 1-3 add array printing helpers to ftrace, which we then use
in patch 8.

Changes since v5:
  - Addressed Stephen's review of the trace patches.
  - Removed power actors and extended the cooling device interface
instead.
  - Let platforms override the power allocator governor parameters in
their thermal zone parameters

Changes since v4:
  - Add more tracing
  - Document some of the limitations of the power allocator governor
  - Export the power_actor API and move power_actor.h to include/linux

Changes since v3:
  - Use tz->passive to poll faster when the first trip point is hit.
  - Don't make a special directory for power_actors
  - Add a DT property for sustainable-power
  - Simplify the static power interface and pass the current thermal
zone in every power_actor_ops to remove the controversial
enum power_actor_types
  - Use locks with the actor_list list
  - Use cpufreq_get() to get the frequency of the cpu instead of
using the notifiers.
  - Remove the prompt for THERMAL_POWER_ACTOR_CPU when configuring
the kernel

Changes since v2:
  - Changed the PI controller into a PID controller
  - Added static power to the cpu power model
  - tz parameter max_dissipatable_power renamed to sustainable_power
  - Register the cpufreq cooling device as part of the
power_cpu_actor registration.

Changes since v1:
  - Fixed finding cpufreq cooling devices in cpufreq_frequency_change()
  - Replaced the cooling device interface with a separate power actor
API
  - Addressed most of Eduardo's comments
  - Incorporated ftrace support for bitmask to trace cpumasks

Todo:
  - Expose thermal zone parameters in sysfs
  - Expose new governor parameters in device tree
  - Expose cooling device weights in device tree

Cheers,
Javi & Punit

Dave Martin (1):
  tracing: Add array printing helpers

Javi Merino (7):
  tools lib traceevent: Generalize numeric argument
  tools lib traceevent: Add support for __print_u{8,16,32,64}_array()
  thermal: let governors have private data for each thermal zone
  thermal: extend the cooling device API to include power information
  thermal: cpu_cooling: implement the power cooling device API
  thermal: introduce the Power Allocator governor
  thermal: add trace events to the power allocator governor

Punit Agrawal (1):
  of: thermal: Introduce sustainable power for a thermal zone

 .../devicetree/bindings/thermal/thermal.txt|   4 +
 Documentation/thermal/cpu-cooling-api.txt  | 144 +-
 Documentation/thermal/power_allocator.txt  | 223 +
 drivers/thermal/Kconfig|  15 +
 drivers/thermal/Makefile   |   1 +
 drivers/thermal/cpu_cooling.c  | 455 +-
 drivers/thermal/of-thermal.c   |   4 +
 drivers/thermal/power_allocator.c  | 528 +
 drivers/thermal/thermal_core.c | 128 -
 drivers/thermal/thermal_core.h |   8 +
 include/linux/cpu_cooling.h|  49 +-
 include/linux/ftrace_event.h   |   9 +
 include/linux/thermal.h|  61 ++-
 include/trace/events/thermal_power_allocator.h | 138 ++
 include/trace/ftrace.h |  17 +
 kernel/trace/trace_output.c|  51 ++
 tools/lib/traceevent/event-parse.c |  88 +++-
 tools/lib/traceevent/event-parse.h |   8 +-
 18 files changed, 1886 insertions(+), 45 deletions(-)
 create mode 100644 Documentation/thermal/power_allocator.txt
 create mode 100644 drivers/thermal/power_allocator.c
 create mode 100644 include/trace/events/thermal_power_allocator.h

-- 
1.9.1



[RFC PATCH v6 1/9] tracing: Add array printing helpers

2014-12-05 Thread Javi Merino

From: Dave Martin 

If a trace event contains an array, there is currently no standard
way to format this for text output.  Drivers are currently hacking
around this by a) local hacks that use the trace_seq functionality
directly, or b) just not printing that information.  For fixed size
arrays, formatting of the elements can be open-coded, but this gets
cumbersome for arrays of non-trivial size.

These approaches result in non-standard content of the event format
description delivered to userspace, so userland tools need to be
taught to understand and parse each array printing method
individually.

This patch implements common __print_<type>_array() helpers that
tracepoint implementations can use instead of reinventing them.  A
simple C-style syntax is used to delimit the array and its elements
{like,this}.

So that the helpers can be used with large static arrays as well as
dynamic arrays, they take a pointer and element count: they can be
used with __get_dynamic_array() for use with dynamic arrays.
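
To show the {like,this} output format outside the kernel, here is a small userspace sketch that formats a u32 array from a pointer and an element count.  It is only a model of the idea: it uses snprintf() instead of the trace_seq functions, the function name is hypothetical, and printing elements in decimal is an assumption of the sketch, not necessarily what the helpers in this patch emit.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Format count elements of buf into out as "{a,b,c}".
 * Returns the length of the resulting string, or -1 if out is too small.
 */
static int format_u32_array(char *out, size_t out_len,
			    const uint32_t *buf, size_t count)
{
	const char *prefix = "";	/* no separator before the first element */
	size_t used;
	size_t i;
	int n;

	assert(out && out_len > 0);

	n = snprintf(out, out_len, "{");
	if (n < 0 || (size_t)n >= out_len)
		return -1;
	used = (size_t)n;

	for (i = 0; i < count; i++) {
		n = snprintf(out + used, out_len - used, "%s%u", prefix,
			     (unsigned int)buf[i]);
		if (n < 0 || (size_t)n >= out_len - used)
			return -1;
		used += (size_t)n;
		prefix = ",";	/* every later element is comma-separated */
	}

	n = snprintf(out + used, out_len - used, "}");
	if (n < 0 || (size_t)n >= out_len - used)
		return -1;

	return (int)strlen(out);
}
```

Switching the prefix from "" to "," after the first element mirrors the loop in ftrace_print_array_seq() in the patch, and avoids special-casing the last element.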

Cc: Steven Rostedt 
Cc: Ingo Molnar 
Signed-off-by: Dave Martin 
---
 include/linux/ftrace_event.h |  9 
 include/trace/ftrace.h   | 17 +++
 kernel/trace/trace_output.c  | 51 
 3 files changed, 77 insertions(+)

diff --git a/include/linux/ftrace_event.h b/include/linux/ftrace_event.h
index 28672e87e910..415afc53fa51 100644
--- a/include/linux/ftrace_event.h
+++ b/include/linux/ftrace_event.h
@@ -44,6 +44,15 @@ const char *ftrace_print_bitmask_seq(struct trace_seq *p, 
void *bitmask_ptr,
 const char *ftrace_print_hex_seq(struct trace_seq *p,
 const unsigned char *buf, int len);
 
+const char *ftrace_print_u8_array_seq(struct trace_seq *p,
+ const u8 *buf, int count);
+const char *ftrace_print_u16_array_seq(struct trace_seq *p,
+  const u16 *buf, int count);
+const char *ftrace_print_u32_array_seq(struct trace_seq *p,
+  const u32 *buf, int count);
+const char *ftrace_print_u64_array_seq(struct trace_seq *p,
+  const u64 *buf, int count);
+
 struct trace_iterator;
 struct trace_event;
 
diff --git a/include/trace/ftrace.h b/include/trace/ftrace.h
index 26b4f2e13275..15bc5d417aea 100644
--- a/include/trace/ftrace.h
+++ b/include/trace/ftrace.h
@@ -263,6 +263,19 @@
 #undef __print_hex
 #define __print_hex(buf, buf_len) ftrace_print_hex_seq(p, buf, buf_len)
 
+#undef __print_u8_array
+#define __print_u8_array(array, count) \
+   ftrace_print_u8_array_seq(p, array, count)
+#undef __print_u16_array
+#define __print_u16_array(array, count)\
+   ftrace_print_u16_array_seq(p, array, count)
+#undef __print_u32_array
+#define __print_u32_array(array, count)\
+   ftrace_print_u32_array_seq(p, array, count)
+#undef __print_u64_array
+#define __print_u64_array(array, count)\
+   ftrace_print_u64_array_seq(p, array, count)
+
 #undef DECLARE_EVENT_CLASS
 #define DECLARE_EVENT_CLASS(call, proto, args, tstruct, assign, print) \
 static notrace enum print_line_t   \
@@ -676,6 +689,10 @@ static inline void ftrace_test_probe_##call(void)  
\
 #undef __get_dynamic_array_len
 #undef __get_str
 #undef __get_bitmask
+#undef __print_u8_array
+#undef __print_u16_array
+#undef __print_u32_array
+#undef __print_u64_array
 
 #undef TP_printk
 #define TP_printk(fmt, args...) "\"" fmt "\", "  __stringify(args)
diff --git a/kernel/trace/trace_output.c b/kernel/trace/trace_output.c
index c6977d5a9b12..4a6ee61f30b3 100644
--- a/kernel/trace/trace_output.c
+++ b/kernel/trace/trace_output.c
@@ -186,6 +186,57 @@ ftrace_print_hex_seq(struct trace_seq *p, const unsigned 
char *buf, int buf_len)
 }
 EXPORT_SYMBOL(ftrace_print_hex_seq);
 
+static const char *
+ftrace_print_array_seq(struct trace_seq *p, const void *buf, int buf_len,
+  bool (*iterator)(struct trace_seq *p, const char *prefix,
+   const void **buf, int *buf_len))
+{
+   const char *ret = trace_seq_buffer_ptr(p);
+   const char *prefix = "";
+
+   trace_seq_putc(p, '{');
+
+   while (iterator(p, prefix, &buf, &buf_len))
+   prefix = ",";
+
+   trace_seq_putc(p, '}');
+   trace_seq_putc(p, 0);
+
+   return ret;
+}
+
+#define DEFINE_PRINT_ARRAY(type, printk_type, format)  \
+static bool\
+ftrace_print_array_iterator_##type(struct trace_seq *p, const char *prefix, \
+  const void **buf, int *buf_len)  \
+{  \
+   const type *__src = *buf;   \
+

[RFC PATCH v6 4/9] thermal: let governors have private data for each thermal zone

2014-12-05 Thread Javi Merino

A governor may need to store its current state between calls to
throttle().  That state depends on the thermal zone, so store it as
private data in struct thermal_zone_device.

The governors may have two new ops: bind_to_tz() and unbind_from_tz().
When provided, these functions let governors do some initialization
and teardown when they are bound/unbound to a tz and possibly store that
information in the governor_data field of the struct
thermal_zone_device.
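
The intended life cycle can be modelled in userspace as below.  This is a sketch under stated assumptions, not the thermal core code: the struct and function names are cut-down stand-ins, and the two demo governors are hypothetical.  It shows a governor keeping per-zone state in governor_data, and the previous governor being bound again when the new one fails to bind.

```c
#include <assert.h>
#include <stddef.h>

struct zone;

/* Cut-down stand-in for struct thermal_governor. */
struct governor {
	const char *name;
	int (*bind_to_tz)(struct zone *tz);	/* optional */
	void (*unbind_from_tz)(struct zone *tz);	/* optional */
};

/* Cut-down stand-in for struct thermal_zone_device. */
struct zone {
	struct governor *governor;
	void *governor_data;	/* private to the bound governor */
};

static int set_governor(struct zone *tz, struct governor *new_gov)
{
	int ret;

	assert(tz);

	/* Let the outgoing governor tear down its private data. */
	if (tz->governor && tz->governor->unbind_from_tz)
		tz->governor->unbind_from_tz(tz);

	if (new_gov && new_gov->bind_to_tz) {
		ret = new_gov->bind_to_tz(tz);
		if (ret) {
			/* Fall back to the previous governor, if it binds. */
			if (tz->governor && tz->governor->bind_to_tz &&
			    tz->governor->bind_to_tz(tz))
				tz->governor = NULL;
			return ret;
		}
	}

	tz->governor = new_gov;
	return 0;
}

/* Hypothetical governor that stores its state in governor_data. */
static int demo_state;

static int demo_bind(struct zone *tz)
{
	tz->governor_data = &demo_state;
	return 0;
}

static void demo_unbind(struct zone *tz)
{
	tz->governor_data = NULL;
}

/* Hypothetical governor whose bind always fails. */
static int failing_bind(struct zone *tz)
{
	(void)tz;
	return -1;
}
```

Switching from the demo governor to the failing one returns the error and leaves the zone on the demo governor with its private data set up again; switching to no governor at all unbinds it and clears the data.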

Cc: Zhang Rui 
Cc: Eduardo Valentin 
Signed-off-by: Javi Merino 
---
 drivers/thermal/thermal_core.c | 83 ++
 include/linux/thermal.h|  9 +
 2 files changed, 84 insertions(+), 8 deletions(-)

diff --git a/drivers/thermal/thermal_core.c b/drivers/thermal/thermal_core.c
index 43b90709585f..9021cb72a13a 100644
--- a/drivers/thermal/thermal_core.c
+++ b/drivers/thermal/thermal_core.c
@@ -75,6 +75,58 @@ static struct thermal_governor *__find_governor(const char 
*name)
return NULL;
 }
 
+/**
+ * bind_previous_governor() - bind the previous governor of the thermal zone
+ * @tz:a valid pointer to a struct thermal_zone_device
+ * @failed_gov_name:   the name of the governor that failed to register
+ *
+ * Register the previous governor of the thermal zone after a new
+ * governor has failed to be bound.
+ */
+static void bind_previous_governor(struct thermal_zone_device *tz,
+   const char *failed_gov_name)
+{
+   if (tz->governor && tz->governor->bind_to_tz) {
+   if (tz->governor->bind_to_tz(tz)) {
+   dev_err(&tz->device,
+   "governor %s failed to bind and the previous 
one (%s) failed to bind again, thermal zone %s has no governor\n",
+   failed_gov_name, tz->governor->name, tz->type);
+   tz->governor = NULL;
+   }
+   }
+}
+
+/**
+ * thermal_set_governor() - Switch to another governor
+ * @tz:a valid pointer to a struct thermal_zone_device
+ * @new_gov:   pointer to the new governor
+ *
+ * Change the governor of thermal zone @tz.
+ *
+ * Return: 0 on success, an error if the new governor's bind_to_tz() failed.
+ */
+static int thermal_set_governor(struct thermal_zone_device *tz,
+   struct thermal_governor *new_gov)
+{
+   int ret = 0;
+
+   if (tz->governor && tz->governor->unbind_from_tz)
+   tz->governor->unbind_from_tz(tz);
+
+   if (new_gov && new_gov->bind_to_tz) {
+   ret = new_gov->bind_to_tz(tz);
+   if (ret) {
+   bind_previous_governor(tz, new_gov->name);
+
+   return ret;
+   }
+   }
+
+   tz->governor = new_gov;
+
+   return ret;
+}
+
 int thermal_register_governor(struct thermal_governor *governor)
 {
int err;
@@ -107,8 +159,15 @@ int thermal_register_governor(struct thermal_governor 
*governor)
 
name = pos->tzp->governor_name;
 
-   if (!strncasecmp(name, governor->name, THERMAL_NAME_LENGTH))
-   pos->governor = governor;
+   if (!strncasecmp(name, governor->name, THERMAL_NAME_LENGTH)) {
+   int ret;
+
+   ret = thermal_set_governor(pos, governor);
+   if (ret)
+   dev_err(&pos->device,
+   "Failed to set governor %s for thermal 
zone %s: %d\n",
+   governor->name, pos->type, ret);
+   }
}
 
mutex_unlock(&thermal_list_lock);
@@ -134,7 +193,7 @@ void thermal_unregister_governor(struct thermal_governor 
*governor)
list_for_each_entry(pos, &thermal_tz_list, node) {
if (!strncasecmp(pos->governor->name, governor->name,
THERMAL_NAME_LENGTH))
-   pos->governor = NULL;
+   thermal_set_governor(pos, NULL);
}
 
mutex_unlock(&thermal_list_lock);
@@ -762,8 +821,9 @@ policy_store(struct device *dev, struct device_attribute 
*attr,
if (!gov)
goto exit;
 
-   tz->governor = gov;
-   ret = count;
+   ret = thermal_set_governor(tz, gov);
+   if (!ret)
+   ret = count;
 
 exit:
mutex_unlock(&thermal_governor_lock);
@@ -1459,6 +1519,7 @@ struct thermal_zone_device 
*thermal_zone_device_register(const char *type,
int result;
int count;
int passive = 0;
+   struct thermal_governor *governor;
 
if (type && strlen(type) >= THERMAL_NAME_LENGTH)
return ERR_PTR(-EINVAL);
@@ -1549,9 +1610,15 @@ struct thermal_zo

[RFC PATCH v6 5/9] thermal: extend the cooling device API to include power information

2014-12-05 Thread Javi Merino

Add three optional callbacks to the cooling device interface to allow
them to express power.  In addition to the callbacks, add helpers to
identify cooling devices that implement the power cooling device API.
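
As a rough picture of how the state/power conversions fit together, here is a userspace sketch of a cooling device backed by a per-state power table.  Everything in it is hypothetical (the demo_ names, the table values, the struct); only the shape of the state2power() and power2state() signatures follows the documentation added by this patch, and state 0 is taken to be the least throttled state.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_STATES 4

/* Hypothetical power per cooling state, in mW; state 0 is unthrottled. */
static const uint32_t state_power_mw[NUM_STATES] = { 2000, 1500, 1000, 500 };

struct demo_cdev {
	unsigned long cur_state;
};

/* Convert a cooling device state into power consumption in mW. */
static uint32_t demo_state2power(struct demo_cdev *cdev, unsigned long state)
{
	(void)cdev;
	assert(state < NUM_STATES);

	return state_power_mw[state];
}

/*
 * Find the shallowest state that consumes at most power mW.  If no
 * state is low enough, fall back to the deepest state.
 */
static unsigned long demo_power2state(struct demo_cdev *cdev, uint32_t power)
{
	unsigned long state;

	(void)cdev;
	for (state = 0; state < NUM_STATES - 1; state++)
		if (state_power_mw[state] <= power)
			break;

	return state;
}

/* Maximum power is the power of the unthrottled state. */
static uint32_t demo_get_max_power(struct demo_cdev *cdev)
{
	return demo_state2power(cdev, 0);
}

/* Limit the device to at most power mW by picking a matching state. */
static void demo_set_power(struct demo_cdev *cdev, uint32_t power)
{
	cdev->cur_state = demo_power2state(cdev, power);
}
```

Asking this device for at most 1200 mW selects state 2 (1000 mW), while a limit below the table's minimum simply selects the deepest state, since the device cannot go any lower.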

Cc: Zhang Rui 
Cc: Eduardo Valentin 
Signed-off-by: Javi Merino 
---
 Documentation/thermal/power_allocator.txt | 27 ++
 drivers/thermal/thermal_core.c| 38 +++
 include/linux/thermal.h   | 12 ++
 3 files changed, 77 insertions(+)
 create mode 100644 Documentation/thermal/power_allocator.txt

diff --git a/Documentation/thermal/power_allocator.txt 
b/Documentation/thermal/power_allocator.txt
new file mode 100644
index ..d3bb79050c27
--- /dev/null
+++ b/Documentation/thermal/power_allocator.txt
@@ -0,0 +1,27 @@
+Cooling device power API
+
+
+Cooling devices controlled by this governor must supply the additional
+"power" API in their `cooling_device_ops`.  It consists of three ops:
+
+1. u32 get_actual_power(struct thermal_cooling_device *cdev);
+@cdev: The `struct thermal_cooling_device` pointer
+
+`get_actual_power()` returns the power currently consumed by the
+device in milliwatts.
+
+2. u32 state2power(struct thermal_cooling_device *cdev, unsigned long
+state);
+@cdev: The `struct thermal_cooling_device` pointer
+@state: A cooling device state
+
+Convert cooling device state @state into power consumption in
+milliwatts.
+
+3. unsigned long power2state(struct thermal_cooling_device *cdev,
+u32 power);
+@cdev: The `struct thermal_cooling_device` pointer
+@power: power in milliwatts
+
+Calculate a cooling device state that would make the device consume at
+most @power mW.
diff --git a/drivers/thermal/thermal_core.c b/drivers/thermal/thermal_core.c
index 9021cb72a13a..c490f262ea7f 100644
--- a/drivers/thermal/thermal_core.c
+++ b/drivers/thermal/thermal_core.c
@@ -866,6 +866,44 @@ emul_temp_store(struct device *dev, struct device_attribute *attr,
 static DEVICE_ATTR(emul_temp, S_IWUSR, NULL, emul_temp_store);
 #endif/*CONFIG_THERMAL_EMULATION*/
 
+/**
+ * power_actor_get_max_power() - get the maximum power that a cdev can consume
+ * @cdev:  pointer to &thermal_cooling_device
+ *
+ * Calculate the maximum power consumption in milliwatts that the
+ * cooling device can currently consume.  If @cdev doesn't support the
+ * power_actor API, this function returns 0.
+ */
+u32 power_actor_get_max_power(struct thermal_cooling_device *cdev)
+{
+   if (!cdev_is_power_actor(cdev))
+   return 0;
+
+   return cdev->ops->state2power(cdev, 0);
+}
+
+/**
+ * power_actor_set_power() - limit the maximum power that a cooling device can consume
+ * @cdev:  pointer to &thermal_cooling_device
+ * @power: the power in milliwatts
+ *
+ * Set the cooling device to consume at most @power milliwatts.
+ *
+ * Returns: 0 on success, -EINVAL if the cooling device does not
+ * implement the power actor API or -E* for other failures.
+ */
+int power_actor_set_power(struct thermal_cooling_device *cdev, u32 power)
+{
+   unsigned long state;
+
+   if (!cdev_is_power_actor(cdev))
+   return -EINVAL;
+
+   state = cdev->ops->power2state(cdev, power);
+
+   return cdev->ops->set_cur_state(cdev, state);
+}
+
 static DEVICE_ATTR(type, 0444, type_show, NULL);
 static DEVICE_ATTR(temp, 0444, temp_show, NULL);
 static DEVICE_ATTR(mode, 0644, mode_show, mode_store);
diff --git a/include/linux/thermal.h b/include/linux/thermal.h
index 2c14ab1f5c0d..1155457caf52 100644
--- a/include/linux/thermal.h
+++ b/include/linux/thermal.h
@@ -142,6 +142,9 @@ struct thermal_cooling_device_ops {
int (*get_max_state) (struct thermal_cooling_device *, unsigned long *);
int (*get_cur_state) (struct thermal_cooling_device *, unsigned long *);
int (*set_cur_state) (struct thermal_cooling_device *, unsigned long);
+   u32 (*get_actual_power) (struct thermal_cooling_device *);
+   u32 (*state2power) (struct thermal_cooling_device *, unsigned long);
+   unsigned long (*power2state) (struct thermal_cooling_device *, u32);
 };
 
 struct thermal_cooling_device {
@@ -322,6 +325,15 @@ void thermal_zone_of_sensor_unregister(struct device *dev,
 }
 
 #endif
+
+static inline bool cdev_is_power_actor(struct thermal_cooling_device *cdev)
+{
+   return cdev->ops->get_actual_power && cdev->ops->state2power &&
+   cdev->ops->power2state;
+}
+
+u32 power_actor_get_max_power(struct thermal_cooling_device *);
+int power_actor_set_power(struct thermal_cooling_device *, u32);
 struct thermal_zone_device *thermal_zone_device_register(const char *, int, int,
void *, struct thermal_zone_device_ops *,
const struct thermal_zone_params *, int, int);
-- 
1.9.1



[RFC PATCH v6 8/9] thermal: add trace events to the power allocator governor

2014-12-05 Thread Javi Merino

Add trace events for the power allocator governor and the power actor
interface of the cpu cooling device.

Cc: Zhang Rui 
Cc: Eduardo Valentin 
Cc: Steven Rostedt 
Cc: Frederic Weisbecker 
Cc: Ingo Molnar 
Signed-off-by: Javi Merino 
---
 drivers/thermal/cpu_cooling.c  |  26 -
 drivers/thermal/power_allocator.c  |  21 +++-
 include/trace/events/thermal_power_allocator.h | 138 +
 3 files changed, 182 insertions(+), 3 deletions(-)
 create mode 100644 include/trace/events/thermal_power_allocator.h

diff --git a/drivers/thermal/cpu_cooling.c b/drivers/thermal/cpu_cooling.c
index 335d95dd7e5a..f4d453429742 100644
--- a/drivers/thermal/cpu_cooling.c
+++ b/drivers/thermal/cpu_cooling.c
@@ -29,6 +29,8 @@
 #include 
 #include 
 
+#include <trace/events/thermal_power_allocator.h>
+
 /**
  * struct power_table - frequency to power conversion
  * @frequency: frequency in KHz
@@ -644,12 +646,20 @@ static int cpufreq_set_cur_state(struct thermal_cooling_device *cdev,
 static u32 cpufreq_get_actual_power(struct thermal_cooling_device *cdev)
 {
unsigned long freq;
-   int cpu;
+   int i = 0, cpu;
u32 static_power, dynamic_power, total_load = 0;
struct cpufreq_cooling_device *cpufreq_device = cdev->devdata;
+   u32 *load_cpu = NULL;
 
freq = cpufreq_quick_get(cpumask_any(&cpufreq_device->allowed_cpus));
 
+   if (trace_thermal_power_cpu_get_power_enabled()) {
+   u32 ncpus = cpumask_weight(&cpufreq_device->allowed_cpus);
+
+   load_cpu = devm_kcalloc(&cdev->device, ncpus, sizeof(*load_cpu),
+   GFP_KERNEL);
+   }
+
for_each_cpu(cpu, &cpufreq_device->allowed_cpus) {
u32 load;
 
@@ -659,6 +669,10 @@ static u32 cpufreq_get_actual_power(struct thermal_cooling_device *cdev)
load = 0;
 
total_load += load;
+   if (trace_thermal_power_cpu_limit_enabled() && load_cpu)
+   load_cpu[i] = load;
+
+   i++;
}
 
cpufreq_device->last_load = total_load;
@@ -666,6 +680,14 @@ static u32 cpufreq_get_actual_power(struct thermal_cooling_device *cdev)
static_power = get_static_power(cpufreq_device, freq);
dynamic_power = get_dynamic_power(cpufreq_device, freq);
 
+   if (trace_thermal_power_cpu_limit_enabled() && load_cpu) {
+   trace_thermal_power_cpu_get_power(
+   &cpufreq_device->allowed_cpus,
+   freq, load_cpu, i, dynamic_power, static_power);
+
+   devm_kfree(&cdev->device, load_cpu);
+   }
+
return static_power + dynamic_power;
 }
 
@@ -730,6 +752,8 @@ static unsigned long cpufreq_power2state(struct thermal_cooling_device *cdev,
return 0;
}
 
+   trace_thermal_power_cpu_limit(&cpufreq_device->allowed_cpus,
+   target_freq, cdev_state, power);
return cdev_state;
 }
 
diff --git a/drivers/thermal/power_allocator.c b/drivers/thermal/power_allocator.c
index 09e98991efbb..fa725a36872e 100644
--- a/drivers/thermal/power_allocator.c
+++ b/drivers/thermal/power_allocator.c
@@ -19,6 +19,9 @@
 #include 
 #include 
 
+#define CREATE_TRACE_POINTS
+#include <trace/events/thermal_power_allocator.h>
+
 #include "thermal_core.h"
 
 #define FRAC_BITS 10
@@ -157,7 +160,14 @@ static u32 pid_controller(struct thermal_zone_device *tz,
/* feed-forward the known sustainable dissipatable power */
power_range = tz->tzp->sustainable_power + frac_to_int(power_range);
 
-   return clamp(power_range, (s64)0, (s64)max_allocatable_power);
+   power_range = clamp(power_range, (s64)0, (s64)max_allocatable_power);
+
+   trace_thermal_power_allocator_pid(frac_to_int(err),
+   frac_to_int(params->err_integral),
+   frac_to_int(p), frac_to_int(i),
+   frac_to_int(d), power_range);
+
+   return power_range;
 }
 
 /**
@@ -238,7 +248,7 @@ static int allocate_power(struct thermal_zone_device *tz,
struct thermal_instance *instance;
u32 *req_power, *max_power, *granted_power;
u32 total_req_power, max_allocatable_power;
-   u32 power_range;
+   u32 total_granted_power, power_range;
int i, num_actors, ret = 0;
 
mutex_lock(&tz->lock);
@@ -301,6 +311,7 @@ static int allocate_power(struct thermal_zone_device *tz,
divvy_up_power(req_power, max_power, num_actors, total_req_power,
power_range, granted_power);
 
+   total_granted_power = 0;
i = 0;
list_for_each_entry(instance, &tz->thermal_instances, tz_node) {
if (instance->trip != TRIP_MAX_DESIRED_TEMPERATURE)
@@ -310,10 +321,16 @@ static int allocate_power(struct thermal_zone_device *tz

[RFC PATCH v6 7/9] thermal: introduce the Power Allocator governor

2014-12-05 Thread Javi Merino

The power allocator governor is a thermal governor that controls system
and device power allocation to control temperature.  Conceptually, the
implementation divides the sustainable power of a thermal zone among
all the heat sources in that zone.

This governor relies on "power actors", entities that represent heat
sources.  They can report current and maximum power consumption and
can set a given maximum power consumption, usually via a cooling
device.

The governor uses a Proportional Integral Derivative (PID) controller
driven by the temperature of the thermal zone.  The output of the
controller is a power budget that is then allocated to each power
actor that can have bearing on the temperature we are trying to
control.  It decides how much power to give each cooling device based
on the performance they are requesting.  The PID controller ensures
that the total power budget does not exceed the control temperature.
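
The control law can be sketched in plain C as follows.  This is an
illustration only: it follows the formula given in the documentation
part of this patch (P_max = k_p * e + k_i * err_integral +
k_d * diff_err + sustainable_power), uses whole degrees and plain
integers instead of the governor's fixed-point arithmetic, and assumes
that the current error is added to the integral before it is used.
pid_power_range(), `struct pid_params` and `struct pid_state` are
hypothetical names, not the ones used in power_allocator.c.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical tunables; gains are in mW per degree of error. */
struct pid_params {
	int64_t k_p;
	int64_t k_i;
	int64_t k_d;
	uint32_t sustainable_power;	/* mW */
};

/* State that has to survive between two invocations of the controller. */
struct pid_state {
	int64_t err_integral;
	int64_t prev_err;
};

/* Return the power budget in mW, clamped to [0, max_allocatable_power]. */
static uint32_t pid_power_range(struct pid_state *st,
				const struct pid_params *p,
				int32_t desired_temp, int32_t current_temp,
				uint32_t max_allocatable_power)
{
	int64_t err, diff_err, power;

	assert(st && p);

	err = (int64_t)desired_temp - current_temp;
	diff_err = err - st->prev_err;
	st->err_integral += err;	/* assumption: includes the current error */
	st->prev_err = err;

	power = p->k_p * err + p->k_i * st->err_integral +
		p->k_d * diff_err + p->sustainable_power;

	if (power < 0)
		power = 0;
	if (power > (int64_t)max_allocatable_power)
		power = max_allocatable_power;

	return (uint32_t)power;
}
```

When the zone is below the desired temperature the error is positive
and the budget grows above sustainable_power; when it overshoots, the
error turns negative and the budget shrinks, down to zero if needed.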

Cc: Zhang Rui 
Cc: Eduardo Valentin 
Signed-off-by: Punit Agrawal 
Signed-off-by: Javi Merino 
---
 Documentation/thermal/power_allocator.txt | 196 
 drivers/thermal/Kconfig   |  15 +
 drivers/thermal/Makefile  |   1 +
 drivers/thermal/power_allocator.c | 511 ++
 drivers/thermal/thermal_core.c|   7 +-
 drivers/thermal/thermal_core.h|   8 +
 include/linux/thermal.h   |  40 ++-
 7 files changed, 774 insertions(+), 4 deletions(-)
 create mode 100644 drivers/thermal/power_allocator.c

diff --git a/Documentation/thermal/power_allocator.txt b/Documentation/thermal/power_allocator.txt
index d3bb79050c27..23b684afdc75 100644
--- a/Documentation/thermal/power_allocator.txt
+++ b/Documentation/thermal/power_allocator.txt
@@ -1,3 +1,172 @@
+Power allocator governor tunables
+=================================
+
+Trip points
+-----------
+
+The governor requires the following two passive trip points:
+
+1.  "switch on" trip point: temperature above which the governor
+control loop starts operating.
+2.  "desired temperature" trip point: it should be higher than the
+"switch on" trip point. It is the target temperature the governor
+is controlling for.
+
+PID Controller
+--------------
+
+The power allocator governor implements a
+Proportional-Integral-Derivative controller (PID controller) with
+temperature as the control input and power as the controlled output:
+
+P_max = k_p * e + k_i * err_integral + k_d * diff_err + sustainable_power
+
+where
+e = desired_temperature - current_temperature
+err_integral is the sum of previous errors
+diff_err = e - previous_error
+
+It is similar to the one depicted below:
+
+                                      k_d
+                                       |
+current_temp                           |
+     |                                 v
+     |                +----------+   +---+
+     |         +----->| diff_err |-->| X |------+
+     |         |      +----------+   +---+      |
+     |         |                                |      tdp        actor
+     |         |                      k_i       |       |    get_actual_power()
+     |         |                       |        |       |        |     |
+     |         |                       |        |       |        |     | ...
+     v         |                       v        v       v        v     v
+   +---+       |      +-------+      +---+    +---+   +---+   +----------+
+   | S |-------+----->| sum e |----->| X |--->| S |-->| S |-->|power     |
+   +---+       |      +-------+      +---+    +---+   +---+   |allocation|
+     ^         |                                ^             +----------+
+     |         |                                |                |     |
+     |         |        +---+                   |                |     |
+     |         +------->| X |-------------------+                v     v
+     |                  +---+                                granted performance
+desired_temperature       ^
+                          |
+                          |
+                      k_po/k_pu
+
+Sustainable power
+-----------------
+
+An estimate of the sustainable dissipatable power (in mW) should be
+provided while registering the thermal zone.  This estimates the
+sustained power that can be dissipated at the desired control
+temperature.  This is the maximum sustained power for allocation at
+the desired maximum temperature.  The actual sustained power can vary
+for a number of reasons.  The closed loop controller will take care of
+variations such as environmental conditions, and some factors related
+to the speed-grade of the silicon.  `sustainable_power` is therefore
+simply an estimate, and may be tuned to affect the aggressiveness of
+the thermal ramp.  For reference, this is 2000mW - 4500mW depending on
+screen size (4" phone - 10" tablet).
+
+If you are usi

[RFC PATCH v6 9/9] of: thermal: Introduce sustainable power for a thermal zone

2014-12-05 Thread Javi Merino

From: Punit Agrawal 

Introduce an optional property called sustainable-power, which
represents the power (in mW) that the thermal zone can safely
dissipate.

If provided, the property is parsed and associated with the thermal
zone via the thermal zone parameters.

Cc: Zhang Rui 
Cc: Eduardo Valentin 
Signed-off-by: Punit Agrawal 
---
 Documentation/devicetree/bindings/thermal/thermal.txt | 4 
 drivers/thermal/of-thermal.c  | 4 
 2 files changed, 8 insertions(+)

diff --git a/Documentation/devicetree/bindings/thermal/thermal.txt b/Documentation/devicetree/bindings/thermal/thermal.txt
index f5db6b72a36f..c6eb9a8d2aed 100644
--- a/Documentation/devicetree/bindings/thermal/thermal.txt
+++ b/Documentation/devicetree/bindings/thermal/thermal.txt
@@ -167,6 +167,10 @@ Optional property:
by means of sensor ID. Additional coefficients are
interpreted as constant offset.
 
+- sustainable-power:   An estimate of the sustainable power (in mW) that the
+  Type: unsigned   thermal zone can dissipate.
+  Size: one cell
+
 Note: The delay properties are bound to the maximum dT/dt (temperature
 derivative over time) in two situations for a thermal zone:
 (i)  - when passive cooling is activated (polling-delay-passive); and
diff --git a/drivers/thermal/of-thermal.c b/drivers/thermal/of-thermal.c
index 62143ba31001..e032b9bf4085 100644
--- a/drivers/thermal/of-thermal.c
+++ b/drivers/thermal/of-thermal.c
@@ -794,6 +794,7 @@ int __init of_parse_thermal_zones(void)
for_each_child_of_node(np, child) {
struct thermal_zone_device *zone;
struct thermal_zone_params *tzp;
+   u32 prop;
 
/* Check whether child is enabled or not */
if (!of_device_is_available(child))
@@ -820,6 +821,9 @@ int __init of_parse_thermal_zones(void)
/* No hwmon because there might be hwmon drivers registering */
tzp->no_hwmon = true;
 
+   if (!of_property_read_u32(child, "sustainable-power", &prop))
+   tzp->sustainable_power = prop;
+
zone = thermal_zone_device_register(child->name, tz->ntrips,
0, tz,
ops, tzp,
-- 
1.9.1



[RFC PATCH v6 6/9] thermal: cpu_cooling: implement the power cooling device API

2014-12-05 Thread Javi Merino

Add a basic power model to the cpu cooling device to implement the
power cooling device API.  The power model uses the current frequency,
current load and OPPs for the power calculations.  The cpus must have
registered their OPPs using the OPP library.
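
The dynamic part of the model can be sketched as below.  This is a
user-space illustration, not the code in cpu_cooling.c:
dyn_power_mw() and scale_by_load() are hypothetical helpers, and the
divisor of 10^9 is an assumption of this sketch that fixes the units of
`capacitance` so that the result comes out in mW when frequency is in
MHz and voltage in mV.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Dynamic power at full load for one operating point:
 *   P = capacitance * f[MHz] * V[mV]^2 / 10^9
 * The multiplication is done in MHz and mV with a 64-bit intermediate
 * so that it does not overflow.
 */
static uint32_t dyn_power_mw(uint32_t capacitance, unsigned long freq_hz,
			     unsigned long voltage_uv)
{
	uint64_t freq_mhz = freq_hz / 1000000;
	uint64_t voltage_mv = voltage_uv / 1000;
	uint64_t power;

	power = (uint64_t)capacitance * freq_mhz * voltage_mv * voltage_mv;

	return (uint32_t)(power / 1000000000);
}

/* Scale the full-load power by the utilisation, given in percent. */
static uint32_t scale_by_load(uint32_t power_mw, uint32_t load_pct)
{
	assert(load_pct <= 100);

	return (uint32_t)((uint64_t)power_mw * load_pct / 100);
}
```

For example, with a coefficient of 1024 an operating point of 1.2 GHz
at 900 mV gives 995 mW at full load, or 497 mW at 50% utilisation.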

Cc: Zhang Rui 
Cc: Eduardo Valentin 
Signed-off-by: Punit Agrawal 
Signed-off-by: Javi Merino 
---
 Documentation/thermal/cpu-cooling-api.txt | 144 +-
 drivers/thermal/cpu_cooling.c | 431 +-
 include/linux/cpu_cooling.h   |  49 +++-
 3 files changed, 611 insertions(+), 13 deletions(-)

diff --git a/Documentation/thermal/cpu-cooling-api.txt b/Documentation/thermal/cpu-cooling-api.txt
index fca24c931ec8..d438a900e374 100644
--- a/Documentation/thermal/cpu-cooling-api.txt
+++ b/Documentation/thermal/cpu-cooling-api.txt
@@ -25,8 +25,150 @@ the user. The registration APIs returns the cooling device pointer.
 
clip_cpus: cpumask of cpus where the frequency constraints will happen.
 
-1.1.2 void cpufreq_cooling_unregister(struct thermal_cooling_device *cdev)
+1.1.2 struct thermal_cooling_device *cpufreq_power_cooling_register(
+const struct cpumask *clip_cpus, u32 capacitance,
+get_static_t plat_static_func)
+
+Similar to cpufreq_cooling_register, this function registers a cpufreq
+cooling device.  Using this function, the cooling device will
+implement the power extensions by using a simple cpu power model.  The
+cpus must have registered their OPPs using the OPP library.
+
+The additional parameters are needed for the power model (See 2. Power
+models).  "capacitance" is the dynamic power coefficient (See 2.1
+Dynamic power).  "plat_static_func" is a function to calculate the
+static power consumed by these cpus (See 2.2 Static power).
+
+1.1.3 struct thermal_cooling_device *of_cpufreq_power_cooling_register(
+struct device_node *np, const struct cpumask *clip_cpus, u32 capacitance,
+get_static_t plat_static_func)
+
+Similar to cpufreq_power_cooling_register, this function registers a
+cpufreq cooling device with power extensions using the device tree
+information supplied by the np parameter.
+
+1.1.4 void cpufreq_cooling_unregister(struct thermal_cooling_device *cdev)
 
 This interface function unregisters the "thermal-cpufreq-%x" cooling device.
 
 cdev: Cooling device pointer which has to be unregistered.
+
+2. Power models
+
+The power API registration functions provide a simple power model for
+CPUs.  The current power is calculated as dynamic + (optionally)
+static power.  This power model requires that the operating-points of
+the CPUs are registered using the kernel's opp library and the
+`cpufreq_frequency_table` is assigned to the `struct device` of the
+cpu.  If you are using the `cpufreq-cpu0.c` driver then the
+`cpufreq_frequency_table` should already be assigned to the cpu
+device.
+
+The `plat_static_func` parameter of `cpufreq_power_cooling_register()`
+and `of_cpufreq_power_cooling_register()` is optional.  If you don't
+provide it, only dynamic power will be considered.
+
+2.1 Dynamic power
+
+The dynamic power consumption of a processor depends on many factors.
+For a given processor implementation the primary factors are:
+
+- The time the processor spends running, consuming dynamic power, as
+  compared to the time in idle states where dynamic consumption is
+  negligible.  Herein we refer to this as 'utilisation'.
+- The voltage and frequency levels as a result of DVFS.  The DVFS
+  level is a dominant factor governing power consumption.
+- In running time the 'execution' behaviour (instruction types, memory
+  access patterns and so forth) causes, in most cases, a second order
+  variation.  In pathological cases this variation can be significant,
+  but typically it is of a much lesser impact than the factors above.
+
+A high level dynamic power consumption model may then be represented as:
+
+Pdyn = f(run) * Voltage^2 * Frequency * Utilisation
+
+f(run) here represents the described execution behaviour and its
+result has units of Watts/Hz/Volt^2 (this is often expressed in
+mW/MHz/uVolt^2).
+
+The detailed behaviour for f(run) could be modelled on-line.  However,
+in practice, such an on-line model has dependencies on a number of
+implementation specific processor support and characterisation
+factors.  Therefore, in the initial implementation that contribution is
+represented as a constant coefficient.  This is a simplification
+consistent with the relative contribution to overall power variation.
+
+In this simplified representation our model becomes:
+
+Pdyn = Kd * Voltage^2 * Frequency * Utilisation
+
+Where Kd (capacitance) represents an indicative running time dynamic
+power coefficient in fundamental units of mW/MHz/uVolt^2
+
+2.2 Static power
+
+Static leakage power consumption depends on a number of factors.  For a
+given circuit implementation the primary factors are:
+
+-

[RFC PATCH v6 3/9] tools lib traceevent: Add support for __print_u{8,16,32,64}_array()

2014-12-05 Thread Javi Merino

Trace can now generate traces with u8, u16, u32 and u64 dynamic
arrays.  Add support to parse them.
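
A self-contained sketch of the printing step, assuming the element
size is known at the call site.  format_uint_array() is a hypothetical
helper for illustration and is not part of libtraceevent; it copies
each element out with memcpy() so that the data pointer does not need
to be aligned, and it uses PRIu64 rather than "%lu" so that 64-bit
values print correctly on 32-bit hosts as well.

```c
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Format @count unsigned integers of @el_size bytes each as a
 * space-separated decimal list.  Returns the number of characters
 * written, or -1 on an unsupported element size or a short buffer.
 */
static int format_uint_array(char *out, size_t out_len, const void *data,
			     size_t count, size_t el_size)
{
	const unsigned char *p = data;
	size_t used = 0, i;

	assert(out && out_len > 0);
	out[0] = '\0';

	for (i = 0; i < count; i++, p += el_size) {
		uint8_t v8;
		uint16_t v16;
		uint32_t v32;
		uint64_t val;
		int n;

		switch (el_size) {
		case 1:
			memcpy(&v8, p, sizeof(v8));
			val = v8;
			break;
		case 2:
			memcpy(&v16, p, sizeof(v16));
			val = v16;
			break;
		case 4:
			memcpy(&v32, p, sizeof(v32));
			val = v32;
			break;
		case 8:
			memcpy(&val, p, sizeof(val));
			break;
		default:
			return -1;
		}

		n = snprintf(out + used, out_len - used, "%s%" PRIu64,
			     i ? " " : "", val);
		if (n < 0 || (size_t)n >= out_len - used)
			return -1;
		used += (size_t)n;
	}

	return (int)used;
}
```

Dispatching on the element size in one place keeps the formatting loop
shared between all four widths.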

Cc: Arnaldo Carvalho de Melo 
Cc: Steven Rostedt 
Cc: Jiri Olsa 
Signed-off-by: Javi Merino 
---
 tools/lib/traceevent/event-parse.c | 62 +++---
 tools/lib/traceevent/event-parse.h |  4 +++
 2 files changed, 61 insertions(+), 5 deletions(-)

diff --git a/tools/lib/traceevent/event-parse.c b/tools/lib/traceevent/event-parse.c
index f12ea53cc83b..f67260bddd65 100644
--- a/tools/lib/traceevent/event-parse.c
+++ b/tools/lib/traceevent/event-parse.c
@@ -753,6 +753,10 @@ static void free_arg(struct print_arg *arg)
free_arg(arg->symbol.field);
free_flag_sym(arg->symbol.symbols);
break;
+   case PRINT_U8:
+   case PRINT_U16:
+   case PRINT_U32:
+   case PRINT_U64:
case PRINT_HEX:
free_arg(arg->num.field);
free_arg(arg->num.size);
@@ -2827,6 +2831,22 @@ process_function(struct event_format *event, struct print_arg *arg,
free_token(token);
return process_hex(event, arg, tok);
}
+   if (strcmp(token, "__print_u8_array") == 0) {
+   free_token(token);
+   return process_num(event, arg, tok, PRINT_U8);
+   }
+   if (strcmp(token, "__print_u16_array") == 0) {
+   free_token(token);
+   return process_num(event, arg, tok, PRINT_U16);
+   }
+   if (strcmp(token, "__print_u32_array") == 0) {
+   free_token(token);
+   return process_num(event, arg, tok, PRINT_U32);
+   }
+   if (strcmp(token, "__print_u64_array") == 0) {
+   free_token(token);
+   return process_num(event, arg, tok, PRINT_U64);
+   }
if (strcmp(token, "__get_str") == 0) {
free_token(token);
return process_str(event, arg, tok);
@@ -3355,6 +3375,10 @@ eval_num_arg(void *data, int size, struct event_format *event, struct print_arg
break;
case PRINT_FLAGS:
case PRINT_SYMBOL:
+   case PRINT_U8:
+   case PRINT_U16:
+   case PRINT_U32:
+   case PRINT_U64:
case PRINT_HEX:
break;
case PRINT_TYPE:
@@ -3660,7 +3684,7 @@ static void print_str_arg(struct trace_seq *s, void *data, int size,
unsigned long long val, fval;
unsigned long addr;
char *str;
-   unsigned char *hex;
+   void *num;
int print;
int i, len;
 
@@ -3739,13 +3763,17 @@ static void print_str_arg(struct trace_seq *s, void *data, int size,
}
}
break;
+   case PRINT_U8:
+   case PRINT_U16:
+   case PRINT_U32:
+   case PRINT_U64:
case PRINT_HEX:
if (arg->num.field->type == PRINT_DYNAMIC_ARRAY) {
unsigned long offset;
offset = pevent_read_number(pevent,
data + arg->num.field->dynarray.field->offset,
arg->num.field->dynarray.field->size);
-   hex = data + (offset & 0xffff);
+   num = data + (offset & 0xffff);
} else {
field = arg->num.field->field.field;
if (!field) {
@@ -3755,13 +3783,24 @@ static void print_str_arg(struct trace_seq *s, void *data, int size,
goto out_warning_field;
arg->num.field->field.field = field;
}
-   hex = data + field->offset;
+   num = data + field->offset;
}
len = eval_num_arg(data, size, event, arg->num.size);
for (i = 0; i < len; i++) {
if (i)
trace_seq_putc(s, ' ');
-   trace_seq_printf(s, "%02x", hex[i]);
+   if (arg->type == PRINT_HEX)
+   trace_seq_printf(s, "%02x",
+   ((uint8_t *)num)[i]);
+   else if (arg->type == PRINT_U8)
+   trace_seq_printf(s, "%u", ((uint8_t *)num)[i]);
+   else if (arg->type == PRINT_U16)
+   trace_seq_printf(s, "%u", ((uint16_t *)num)[i]);
+   else if (arg->type == PRINT_U32)
+   trace_seq_printf(s, "%u", ((uint32_t *)num)[i]);
+   else/* PRINT_U64 */
+   trace_seq_printf(s, "%lu",
+   ((uint64_t

[RFC PATCH v6 2/9] tools lib traceevent: Generalize numeric argument

2014-12-05 Thread Javi Merino

Numeric arguments can be printed in different bases, so rename the hex
argument to num so that it can be used for formats other than PRINT_HEX.

Cc: Steven Rostedt 
Cc: Arnaldo Carvalho de Melo 
Cc: Jiri Olsa 
Signed-off-by: Javi Merino 
---
 tools/lib/traceevent/event-parse.c | 26 +-
 tools/lib/traceevent/event-parse.h |  4 ++--
 2 files changed, 15 insertions(+), 15 deletions(-)

diff --git a/tools/lib/traceevent/event-parse.c b/tools/lib/traceevent/event-parse.c
index cf3a44bf1ec3..f12ea53cc83b 100644
--- a/tools/lib/traceevent/event-parse.c
+++ b/tools/lib/traceevent/event-parse.c
@@ -754,8 +754,8 @@ static void free_arg(struct print_arg *arg)
free_flag_sym(arg->symbol.symbols);
break;
case PRINT_HEX:
-   free_arg(arg->hex.field);
-   free_arg(arg->hex.size);
+   free_arg(arg->num.field);
+   free_arg(arg->num.size);
break;
case PRINT_TYPE:
free(arg->typecast.type);
@@ -2503,7 +2503,7 @@ process_hex(struct event_format *event, struct print_arg *arg, char **tok)
if (test_type_token(type, token, EVENT_DELIM, ","))
goto out_free;
 
-   arg->hex.field = field;
+   arg->num.field = field;
 
free_token(token);
 
@@ -2519,7 +2519,7 @@ process_hex(struct event_format *event, struct print_arg *arg, char **tok)
if (test_type_token(type, token, EVENT_DELIM, ")"))
goto out_free;
 
-   arg->hex.size = field;
+   arg->num.size = field;
 
free_token(token);
type = read_token_item(tok);
@@ -3740,24 +3740,24 @@ static void print_str_arg(struct trace_seq *s, void *data, int size,
}
break;
case PRINT_HEX:
-   if (arg->hex.field->type == PRINT_DYNAMIC_ARRAY) {
+   if (arg->num.field->type == PRINT_DYNAMIC_ARRAY) {
unsigned long offset;
offset = pevent_read_number(pevent,
-   data + arg->hex.field->dynarray.field->offset,
-   arg->hex.field->dynarray.field->size);
+   data + arg->num.field->dynarray.field->offset,
+   arg->num.field->dynarray.field->size);
hex = data + (offset & 0xffff);
} else {
-   field = arg->hex.field->field.field;
+   field = arg->num.field->field.field;
if (!field) {
-   str = arg->hex.field->field.name;
+   str = arg->num.field->field.name;
field = pevent_find_any_field(event, str);
if (!field)
goto out_warning_field;
-   arg->hex.field->field.field = field;
+   arg->num.field->field.field = field;
}
hex = data + field->offset;
}
-   len = eval_num_arg(data, size, event, arg->hex.size);
+   len = eval_num_arg(data, size, event, arg->num.size);
for (i = 0; i < len; i++) {
if (i)
trace_seq_putc(s, ' ');
@@ -4923,9 +4923,9 @@ static void print_args(struct print_arg *args)
break;
case PRINT_HEX:
printf("__print_hex(");
-   print_args(args->hex.field);
+   print_args(args->num.field);
printf(", ");
-   print_args(args->hex.size);
+   print_args(args->num.size);
printf(")");
break;
case PRINT_STRING:
diff --git a/tools/lib/traceevent/event-parse.h b/tools/lib/traceevent/event-parse.h
index 7a3873ff9a4f..2bf72e908a74 100644
--- a/tools/lib/traceevent/event-parse.h
+++ b/tools/lib/traceevent/event-parse.h
@@ -240,7 +240,7 @@ struct print_arg_symbol {
struct print_flag_sym   *symbols;
 };
 
-struct print_arg_hex {
+struct print_arg_num {
struct print_arg*field;
struct print_arg*size;
 };
@@ -291,7 +291,7 @@ struct print_arg {
struct print_arg_typecast   typecast;
struct print_arg_flags  flags;
struct print_arg_symbol symbol;
-   struct print_arg_hexhex;
+   struct print_arg_numnum;
struct print_arg_func   func;
struct print_arg_string string;
struct print_arg_bitmaskbitmask;
-- 
1.9.1



Re: [RFC PATCH v6 6/9] thermal: cpu_cooling: implement the power cooling device API

2014-12-08 Thread Javi Merino

On Mon, Dec 08, 2014 at 05:49:00AM +0000, Viresh Kumar wrote:
> Hi Javi,

Hi Viresh,

> Looks like ARM's exchange server screwed up your patch?
> 
> This is how I see it with gmail's show-original option:
> 
> +=09cpufreq_device->dyn_power_table =3D power_table;
> +=09cpufreq_device->dyn_power_table_entries =3D i;
> +
> 
> I have seen this a lot, while I was in ARM. Had to adopt some work-arounds to
> get over it. :)

Sigh.  Care to share them (privately I guess)?
 
> On Sat, Dec 6, 2014 at 12:34 AM, Javi Merino  wrote:
> 
> > diff --git a/drivers/thermal/cpu_cooling.c b/drivers/thermal/cpu_cooling.c
> 
> > +static int build_dyn_power_table(struct cpufreq_cooling_device *cpufreq_device,
> > +   u32 capacitance)
> > +{
> > +   struct power_table *power_table;
> > +   struct dev_pm_opp *opp;
> > +   struct device *dev = NULL;
> > +   int num_opps, cpu, i, ret = 0;
> 
> Why not initialize num_opps and i to 0 here?

ok

> > +   unsigned long freq;
> > +
> > +   num_opps = 0;
> > +
> > +   rcu_read_lock();
> > +
> > +   for_each_cpu(cpu, &cpufreq_device->allowed_cpus) {
> 
> All these CPUs must be sharing the OPPs as they must be supplied
> from a single clock line. But probably you need to iterate over all
> because you don't know which ones share OPP. Right ? Probably
> the work I am doing around getting new OPP bindings might solve
> this..

Is this loop pointless?  I seem to recall that it was needed but I
forgot the details.  If you think it is, I can remove it.

> > +   dev = get_cpu_device(cpu);
> > +   if (!dev)
> 
> Is this allowed? I understand you can continue, but this is not
> possible. Right? So, print an error here?

Ok, now it prints an error.

> > +   continue;
> > +
> > +   num_opps = dev_pm_opp_get_opp_count(dev);
> > +   if (num_opps > 0) {
> > +   break;
> > +   } else if (num_opps < 0) {
> > +   ret = num_opps;
> > +   goto unlock;
> > +   }
> > +   }
> > +
> > +   if (num_opps == 0) {
> > +   ret = -EINVAL;
> > +   goto unlock;
> > +   }
> > +
> > +   power_table = kcalloc(num_opps, sizeof(*power_table), GFP_KERNEL);
> > +
> > +   i = 0;
> 
> Either initialize i at the beginning or in the initialization part of
> for loop below.

As part of the for loop.
 
> > +   for (freq = 0;
> > +opp = dev_pm_opp_find_freq_ceil(dev, &freq), !IS_ERR(opp);
> > +freq++) {
> > +   u32 freq_mhz, voltage_mv;
> > +   u64 power;
> > +
> > +   freq_mhz = freq / 1000000;
> > +   voltage_mv = dev_pm_opp_get_voltage(opp) / 1000;
> > +
> > +   /*
> > +* Do the multiplication with MHz and millivolt so as
> > +* to not overflow.
> > +*/
> > +   power = (u64)capacitance * freq_mhz * voltage_mv * voltage_mv;
> > +   do_div(power, 1000000000);
> > +
> > +   /* frequency is stored in power_table in KHz */
> > +   power_table[i].frequency = freq / 1000;
> > +   power_table[i].power = power;
> > +
> > +   i++;
> 
> Why here and not with freq++?

As part of the for loop as well.
 
> > +   }
> > +
> > +   if (i == 0) {
> > +   ret = PTR_ERR(opp);
> > +   goto unlock;
> > +   }
> > +
> > +   cpufreq_device->dyn_power_table = power_table;
> > +   cpufreq_device->dyn_power_table_entries = i;
> > +
> > +unlock:
> > +   rcu_read_unlock();
> > +   return ret;
> > +}
> > +
> > +static u32 cpu_freq_to_power(struct cpufreq_cooling_device *cpufreq_device,
> > +   u32 freq)
> 
> Because the patch is screwed up a bit, I really can't see if the 'u'
> of u32 is directly below the 's' of struct cpufreq_cooling_device.
> Running checkpatch with --strict will probably take care of that.
> Sorry if you have already taken care of that.

It wasn't.  I'll run checkpatch with --strict on next submission.

> > +{
> > +   int i;
> > +   struct power_table *pt = cpufreq_device->dyn_power_table;
> > +
> > +   for (i = 1;

Re: [RFC PATCH v6 6/9] thermal: cpu_cooling: implement the power cooling device API

2014-12-08 Thread Javi Merino

On Mon, Dec 08, 2014 at 01:31:35PM +0000, Viresh Kumar wrote:
> On 8 December 2014 at 18:20, Javi Merino  wrote:
> > Is this loop pointless?  I seem to recall that it was needed but I
> > forgot the details.  If you think it is, I can remove it.
> 
> Yes it is pointless. The CPUs you are iterating on, share clock lines
> and so they will have same set of OPPs. Just do this for the cpu
> we are registering the cooling device.

Ok, changed it into:

cpu = cpumask_any(&cpufreq_device->allowed_cpus);
dev = get_cpu_device(cpu);
if (!dev) {
dev_warn(&cpufreq_device->cool_dev->device,
"No cpu device for cpu %d\n", cpu);
ret = -EINVAL;
goto unlock;
}

num_opps = dev_pm_opp_get_opp_count(dev);
if (num_opps <= 0) {
ret = (num_opps < 0)? num_opps : -EINVAL;
goto unlock;
}

Thanks!
Javi


Re: [RFC PATCH v6 4/9] thermal: let governors have private data for each thermal zone

2015-01-23 Thread Javi Merino

Hi Rui,

On Mon, Dec 08, 2014 at 04:11:32AM +0000, Zhang Rui wrote:
> On Fri, 2014-12-05 at 19:04 +0000, Javi Merino wrote:
> > A governor may need to store its current state between calls to
> > throttle().  That state depends on the thermal zone, so store it as
> > private data in struct thermal_zone_device.
> > 
> > The governors may have two new ops: bind_to_tz() and unbind_from_tz().
> > When provided, these functions let governors do some initialization
> > and teardown when they are bound/unbound to a tz and possibly store that
> > information in the governor_data field of the struct
> > thermal_zone_device.
> > 
> > Cc: Zhang Rui 
> > Cc: Eduardo Valentin 
> > Signed-off-by: Javi Merino 
> 
> applied.
> 
> thanks,
> rui

Where can I find it?  Your next branch in git.kernel.org doesn't have
it.  I'm preparing an update of this series and I wanted to base it
on a branch that had this commit applied.  Cheers,
Javi
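
The governor_data idea quoted above can be illustrated outside the kernel.
What follows is only a minimal userspace sketch: struct zone, struct
governor, the demo_* functions and zone_set_governor() are hypothetical
stand-ins invented for this illustration, not the thermal framework's
actual definitions.  It shows optional bind/unbind hooks allocating and
freeing per-zone state that throttle() then reuses between calls.

```c
#include <assert.h>
#include <stdlib.h>

struct zone;

/* Hypothetical, simplified governor: both hooks are optional. */
struct governor {
	const char *name;
	int (*bind_to_tz)(struct zone *tz);
	void (*unbind_from_tz)(struct zone *tz);
	int (*throttle)(struct zone *tz, int temp);
};

/* Hypothetical, simplified thermal zone. */
struct zone {
	const struct governor *governor;
	void *governor_data;	/* owned by the bound governor */
};

/* State the demo governor keeps between calls to throttle(). */
struct demo_state {
	int last_temp;
	int calls;
};

static int demo_bind(struct zone *tz)
{
	struct demo_state *st = calloc(1, sizeof(*st));

	if (!st)
		return -1;
	tz->governor_data = st;
	return 0;
}

static void demo_unbind(struct zone *tz)
{
	free(tz->governor_data);
	tz->governor_data = NULL;
}

/* Returns the temperature change since the previous call (0 at first). */
static int demo_throttle(struct zone *tz, int temp)
{
	struct demo_state *st = tz->governor_data;
	int delta;

	assert(st != NULL);
	delta = st->calls ? temp - st->last_temp : 0;
	st->last_temp = temp;
	st->calls++;
	return delta;
}

static const struct governor demo_governor = {
	.name = "demo",
	.bind_to_tz = demo_bind,
	.unbind_from_tz = demo_unbind,
	.throttle = demo_throttle,
};

/* Unbind the old governor (if any), then bind the new one (if any). */
static int zone_set_governor(struct zone *tz, const struct governor *gov)
{
	int ret;

	if (tz->governor && tz->governor->unbind_from_tz)
		tz->governor->unbind_from_tz(tz);
	tz->governor = NULL;

	if (gov && gov->bind_to_tz) {
		ret = gov->bind_to_tz(tz);
		if (ret)
			return ret;
	}
	tz->governor = gov;
	return 0;
}
```

Binding demo_governor to a zeroed struct zone and calling throttle() with
50 and then 57 yields 0 and then 7; passing NULL to zone_set_governor()
runs the teardown hook and clears governor_data.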

[PATCH v4 0/3] Add array printing helpers to ftrace

2015-01-26 Thread Javi Merino

This series adds a helper to the tracing framework to trace arrays.
Patch 1 adds them and patches 2 and 3 update the traceevent library to
parse them.  They've been tested with trace-cmd.

Changes since v3[0]:
  - use %zu to print size_t

Changes since v2:
  - Changed BUG() into a trace_seq_printf()
  - Add BUILD_BUG_ON() to chase mistakes in element sizes on build
  - Add patch 2 to avoid repeating code in patch 3.
  - print a warning in traceevent if the size of the array is not valid

[0] http://thread.gmane.org/gmane.linux.kernel/1869110

Dave Martin (1):
  tracing: Add array printing helpers

Javi Merino (2):
  tools lib traceevent: factor out allocating and processing args
  tools lib traceevent: Add support for __print_array()

 include/linux/ftrace_event.h   |   4 +
 include/trace/ftrace.h |   9 +++
 kernel/trace/trace_output.c|  44 ++
 tools/lib/traceevent/event-parse.c | 159 +
 tools/lib/traceevent/event-parse.h |   8 ++
 5 files changed, 193 insertions(+), 31 deletions(-)

-- 
1.9.1


[PATCH v4 2/3] tools lib traceevent: factor out allocating and processing args

2015-01-26 Thread Javi Merino

The sequence of allocating the print_arg field, calling process_arg()
and verifying the next event delimiter appears twice in process_hex()
and will also be needed in process_int_array().  Factor it out into a
function to avoid writing the same code again and again.

Cc: Namhyung Kim 
Cc: Arnaldo Carvalho de Melo 
Cc: Steven Rostedt 
Cc: Jiri Olsa 
Signed-off-by: Javi Merino 
---
 tools/lib/traceevent/event-parse.c | 77 --
 1 file changed, 40 insertions(+), 37 deletions(-)

diff --git a/tools/lib/traceevent/event-parse.c b/tools/lib/traceevent/event-parse.c
index cf3a44bf1ec3..dabd8f5c6398 100644
--- a/tools/lib/traceevent/event-parse.c
+++ b/tools/lib/traceevent/event-parse.c
@@ -2013,6 +2013,38 @@ process_entry(struct event_format *event __maybe_unused, struct print_arg *arg,
return EVENT_ERROR;
 }
 
+static int alloc_and_process_arg(struct event_format *event, char *next_token,
+struct print_arg **print_arg)
+{
+   struct print_arg *field;
+   enum event_type type;
+   char *token;
+   int ret = 0;
+
+   field = alloc_arg();
+   if (!field) {
+   do_warning_event(event, "%s: not enough memory!", __func__);
+   errno = ENOMEM;
+   return -1;
+   }
+
+   type = process_arg(event, field, &token);
+
+   if (test_type_token(type, token, EVENT_DELIM, next_token)) {
+   errno = EINVAL;
+   ret = -1;
+   free_arg(field);
+   goto out_free_token;
+   }
+
+   *print_arg = field;
+
+out_free_token:
+   free_token(token);
+
+   return ret;
+}
+
 static char *arg_eval (struct print_arg *arg);
 
 static unsigned long long
@@ -2485,49 +2517,20 @@ out_free:
 static enum event_type
 process_hex(struct event_format *event, struct print_arg *arg, char **tok)
 {
-   struct print_arg *field;
-   enum event_type type;
-   char *token = NULL;
-
memset(arg, 0, sizeof(*arg));
arg->type = PRINT_HEX;
 
-   field = alloc_arg();
-   if (!field) {
-   do_warning_event(event, "%s: not enough memory!", __func__);
-   goto out_free;
-   }
-
-   type = process_arg(event, field, &token);
-
-   if (test_type_token(type, token, EVENT_DELIM, ","))
-   goto out_free;
-
-   arg->hex.field = field;
-
-   free_token(token);
-
-   field = alloc_arg();
-   if (!field) {
-   do_warning_event(event, "%s: not enough memory!", __func__);
-   *tok = NULL;
-   return EVENT_ERROR;
-   }
-
-   type = process_arg(event, field, &token);
-
-   if (test_type_token(type, token, EVENT_DELIM, ")"))
-   goto out_free;
+   if (alloc_and_process_arg(event, ",", &arg->hex.field))
+   goto out;
 
-   arg->hex.size = field;
+   if (alloc_and_process_arg(event, ")", &arg->hex.size))
+   goto free_field;
 
-   free_token(token);
-   type = read_token_item(tok);
-   return type;
+   return read_token_item(tok);
 
- out_free:
-   free_arg(field);
-   free_token(token);
+free_field:
+   free_arg(arg->hex.field);
+out:
*tok = NULL;
return EVENT_ERROR;
 }
-- 
1.9.1
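
The refactoring in this patch follows a common pattern: when "read one
argument, then check the delimiter that follows it" is repeated, pull it
into one helper and call it once per argument.  Below is a minimal
standalone sketch of that pattern, assuming a toy string format
"field,size)" rather than the traceevent tokenizer; read_arg_until(),
struct hex_args and parse_hex_args() are hypothetical names made up for
the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: copy one argument from *cursor into out and check
 * that it is followed by the expected delimiter.  On success advance
 * *cursor past the delimiter and return 0; otherwise return -1.
 */
static int read_arg_until(const char **cursor, char delim,
			  char *out, size_t out_len)
{
	const char *end;
	size_t len;

	assert(cursor != NULL && *cursor != NULL && out != NULL);

	end = strchr(*cursor, delim);
	if (!end)
		return -1;

	len = (size_t)(end - *cursor);
	if (len == 0 || len >= out_len)
		return -1;

	memcpy(out, *cursor, len);
	out[len] = '\0';
	*cursor = end + 1;
	return 0;
}

struct hex_args {
	char field[32];
	char size[32];
};

/* Parse "field,size)" with one helper call per argument. */
static int parse_hex_args(const char *s, struct hex_args *args)
{
	if (read_arg_until(&s, ',', args->field, sizeof(args->field)))
		return -1;
	if (read_arg_until(&s, ')', args->size, sizeof(args->size)))
		return -1;
	return 0;
}
```

Adding a third argument, as process_int_array() needs, then costs one more
helper call instead of another copy of the parse-and-check sequence.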


[PATCH v4 1/3] tracing: Add array printing helpers

2015-01-26 Thread Javi Merino

From: Dave Martin 

If a trace event contains an array, there is currently no standard
way to format this for text output.  Drivers are currently hacking
around this by a) local hacks that use the trace_seq functionality
directly, or b) just not printing that information.  For fixed size
arrays, formatting of the elements can be open-coded, but this gets
cumbersome for arrays of non-trivial size.

These approaches result in non-standard content of the event format
description delivered to userspace, so userland tools need to be
taught to understand and parse each array printing method
individually.

This patch implements common __print__array() helpers that
tracepoint implementations can use instead of reinventing them.  A
simple C-style syntax is used to delimit the array and its elements
{like,this}.

So that the helpers can be used with large static arrays as well as
dynamic arrays, they take a pointer and element count: they can be
used with __get_dynamic_array() for use with dynamic arrays.

Cc: Steven Rostedt 
Cc: Ingo Molnar 
Signed-off-by: Dave Martin 
Signed-off-by: Javi Merino 
---
 include/linux/ftrace_event.h |  4 
 include/trace/ftrace.h   |  9 +
 kernel/trace/trace_output.c  | 44 
 3 files changed, 57 insertions(+)

diff --git a/include/linux/ftrace_event.h b/include/linux/ftrace_event.h
index 0bebb5c348b8..5aa4a9269547 100644
--- a/include/linux/ftrace_event.h
+++ b/include/linux/ftrace_event.h
@@ -44,6 +44,10 @@ const char *ftrace_print_bitmask_seq(struct trace_seq *p, void *bitmask_ptr,
 const char *ftrace_print_hex_seq(struct trace_seq *p,
 const unsigned char *buf, int len);
 
+const char *ftrace_print_array_seq(struct trace_seq *p,
+  const void *buf, int buf_len,
+  size_t el_size);
+
 struct trace_iterator;
 struct trace_event;
 
diff --git a/include/trace/ftrace.h b/include/trace/ftrace.h
index 139b5067345b..36afd0ed3458 100644
--- a/include/trace/ftrace.h
+++ b/include/trace/ftrace.h
@@ -263,6 +263,14 @@
 #undef __print_hex
 #define __print_hex(buf, buf_len) ftrace_print_hex_seq(p, buf, buf_len)
 
+#undef __print_array
+#define __print_array(array, count, el_size)   \
+   ({  \
+   BUILD_BUG_ON(el_size != 8 && el_size != 16 &&   \
+   el_size != 32 && el_size != 64);\
+   ftrace_print_array_seq(p, array, count, el_size);   \
+   })
+
 #undef DECLARE_EVENT_CLASS
 #define DECLARE_EVENT_CLASS(call, proto, args, tstruct, assign, print) \
 static notrace enum print_line_t   \
@@ -674,6 +682,7 @@ static inline void ftrace_test_probe_##call(void) \
 #undef __get_dynamic_array_len
 #undef __get_str
 #undef __get_bitmask
+#undef __print_array
 
 #undef TP_printk
 #define TP_printk(fmt, args...) "\"" fmt "\", "  __stringify(args)
diff --git a/kernel/trace/trace_output.c b/kernel/trace/trace_output.c
index b77b9a697619..c06bd6e4ae0e 100644
--- a/kernel/trace/trace_output.c
+++ b/kernel/trace/trace_output.c
@@ -177,6 +177,50 @@ ftrace_print_hex_seq(struct trace_seq *p, const unsigned char *buf, int buf_len)
 }
 EXPORT_SYMBOL(ftrace_print_hex_seq);
 
+const char *
+ftrace_print_array_seq(struct trace_seq *p, const void *buf, int buf_len,
+  size_t el_size)
+{
+   const char *ret = trace_seq_buffer_ptr(p);
+   const char *prefix = "";
+   void *ptr = (void *)buf;
+
+   trace_seq_putc(p, '{');
+
+   while (ptr < buf + buf_len) {
+   switch (el_size) {
+   case 8:
+   trace_seq_printf(p, "%s0x%x", prefix,
+*(u8 *)ptr);
+   break;
+   case 16:
+   trace_seq_printf(p, "%s0x%x", prefix,
+*(u16 *)ptr);
+   break;
+   case 32:
+   trace_seq_printf(p, "%s0x%x", prefix,
+*(u32 *)ptr);
+   break;
+   case 64:
+   trace_seq_printf(p, "%s0x%llx", prefix,
+*(u64 *)ptr);
+   break;
+   default:
+   trace_seq_printf(p, "BAD SIZE:%zu 0x%x", el_size,
+*(u8 *)ptr);
+   el_size = 8;
+   }
+   prefix = ",";
+   ptr += el_size / 8;
+   }
+
+   trace_seq_putc(p, '}');
+   trace_seq_putc(p, 0);
+
+   return ret;
+}
+EXPORT_SYMBOL(ftrace_print_arr
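
The {like,this} output format described in the commit message can be tried
in userspace.  The following is a rough sketch, not the kernel function:
format_array() and load_element() are hypothetical names, snprintf() stands
in for trace_seq, and the sketch assumes the second size argument is an
element count (it multiplies by the element width itself) and rejects bad
element sizes instead of printing them.

```c
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Load one element of el_bits bits without assuming alignment. */
static uint64_t load_element(const unsigned char *ptr, unsigned int el_bits)
{
	uint8_t v8;
	uint16_t v16;
	uint32_t v32;
	uint64_t v64;

	switch (el_bits) {
	case 8:
		memcpy(&v8, ptr, sizeof(v8));
		return v8;
	case 16:
		memcpy(&v16, ptr, sizeof(v16));
		return v16;
	case 32:
		memcpy(&v32, ptr, sizeof(v32));
		return v32;
	default:
		memcpy(&v64, ptr, sizeof(v64));
		return v64;
	}
}

/*
 * Format count elements of el_bits bits each as "{0x1,0x2}" into out.
 * Returns 0 on success, -1 on a bad element size or a too-small buffer.
 */
static int format_array(char *out, size_t out_len, const void *buf,
			size_t count, unsigned int el_bits)
{
	const unsigned char *ptr = buf;
	const char *prefix = "";
	size_t el_bytes = el_bits / 8;
	size_t used;
	size_t i;
	int n;

	assert(out != NULL && buf != NULL);

	if (el_bits != 8 && el_bits != 16 && el_bits != 32 && el_bits != 64)
		return -1;

	n = snprintf(out, out_len, "{");
	if (n < 0 || (size_t)n >= out_len)
		return -1;
	used = (size_t)n;

	for (i = 0; i < count; i++, ptr += el_bytes) {
		n = snprintf(out + used, out_len - used, "%s0x%" PRIx64,
			     prefix, load_element(ptr, el_bits));
		if (n < 0 || (size_t)n >= out_len - used)
			return -1;
		used += (size_t)n;
		prefix = ",";	/* separator before every later element */
	}

	n = snprintf(out + used, out_len - used, "}");
	if (n < 0 || (size_t)n >= out_len - used)
		return -1;
	return 0;
}
```

The empty prefix for the first element followed by "," for the rest is the
same trick the patch uses to avoid a trailing separator.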

[PATCH v4 3/3] tools lib traceevent: Add support for __print_array()

2015-01-26 Thread Javi Merino

Trace can now generate traces with variable element size arrays.  Add
support to parse them.

Cc: Namhyung Kim 
Cc: Arnaldo Carvalho de Melo 
Cc: Steven Rostedt 
Cc: Jiri Olsa 
Signed-off-by: Javi Merino 
---
 tools/lib/traceevent/event-parse.c | 94 ++
 tools/lib/traceevent/event-parse.h |  8 
 2 files changed, 102 insertions(+)

diff --git a/tools/lib/traceevent/event-parse.c b/tools/lib/traceevent/event-parse.c
index dabd8f5c6398..9cb05c440821 100644
--- a/tools/lib/traceevent/event-parse.c
+++ b/tools/lib/traceevent/event-parse.c
@@ -757,6 +757,11 @@ static void free_arg(struct print_arg *arg)
free_arg(arg->hex.field);
free_arg(arg->hex.size);
break;
+   case PRINT_INT_ARRAY:
+   free_arg(arg->int_array.field);
+   free_arg(arg->int_array.size);
+   free_arg(arg->int_array.el_size);
+   break;
case PRINT_TYPE:
free(arg->typecast.type);
free_arg(arg->typecast.item);
@@ -2536,6 +2541,32 @@ out:
 }
 
 static enum event_type
+process_int_array(struct event_format *event, struct print_arg *arg, char **tok)
+{
+   memset(arg, 0, sizeof(*arg));
+   arg->type = PRINT_INT_ARRAY;
+
+   if (alloc_and_process_arg(event, ",", &arg->int_array.field))
+   goto out;
+
+   if (alloc_and_process_arg(event, ",", &arg->int_array.size))
+   goto free_field;
+
+   if (alloc_and_process_arg(event, ")", &arg->int_array.el_size))
+   goto free_size;
+
+   return read_token_item(tok);
+
+free_size:
+   free_arg(arg->int_array.size);
+free_field:
+   free_arg(arg->int_array.field);
+out:
+   *tok = NULL;
+   return EVENT_ERROR;
+}
+
+static enum event_type
 process_dynamic_array(struct event_format *event, struct print_arg *arg, char **tok)
 {
struct format_field *field;
@@ -2830,6 +2861,10 @@ process_function(struct event_format *event, struct print_arg *arg,
free_token(token);
return process_hex(event, arg, tok);
}
+   if (strcmp(token, "__print_array") == 0) {
+   free_token(token);
+   return process_int_array(event, arg, tok);
+   }
if (strcmp(token, "__get_str") == 0) {
free_token(token);
return process_str(event, arg, tok);
@@ -3358,6 +3393,7 @@ eval_num_arg(void *data, int size, struct event_format *event, struct print_arg
break;
case PRINT_FLAGS:
case PRINT_SYMBOL:
+   case PRINT_INT_ARRAY:
case PRINT_HEX:
break;
case PRINT_TYPE:
@@ -3768,6 +3804,55 @@ static void print_str_arg(struct trace_seq *s, void *data, int size,
}
break;
 
+   case PRINT_INT_ARRAY: {
+   void *num;
+   int el_size;
+
+   if (arg->int_array.field->type == PRINT_DYNAMIC_ARRAY) {
+   unsigned long offset;
+   struct format_field *field =
+   arg->int_array.field->dynarray.field;
+   offset = pevent_read_number(pevent,
+   data + field->offset,
+   field->size);
+   num = data + (offset & 0xffff);
+   } else {
+   field = arg->int_array.field->field.field;
+   if (!field) {
+   str = arg->int_array.field->field.name;
+   field = pevent_find_any_field(event, str);
+   if (!field)
+   goto out_warning_field;
+   arg->int_array.field->field.field = field;
+   }
+   num = data + field->offset;
+   }
+   len = eval_num_arg(data, size, event, arg->int_array.size);
+   el_size = eval_num_arg(data, size, event,
+  arg->int_array.el_size);
+   el_size /= 8;
+   for (i = 0; i < len; i++) {
+   if (i)
+   trace_seq_putc(s, ' ');
+
+   if (el_size == 1) {
+   trace_seq_printf(s, "%u", *(uint8_t *)num);
+   } else if (el_size == 2) {
+   trace_seq_printf(s, "%u", *(uint16_t *)num);
+   } else if (el_size == 4) {
+   trace_seq_printf(s, "%u", *(uint32_t *)num);
+   } else if (el_s

[PATCH] sysfs: fix warning when creating a sysfs group without attributes

2015-01-15 Thread Javi Merino

When attempting to create a group without attrs, the warning prints the
name of the group.  However, the check for name being a NULL pointer is
wrong: it uses the pointer to the name when it's NULL.  Fix it to use
the name if present, otherwise just put an empty string.

Cc: Bruno Prémont 
Cc: Greg Kroah-Hartman 
Signed-off-by: Javi Merino 
---
 fs/sysfs/group.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/sysfs/group.c b/fs/sysfs/group.c
index 7d2a860ba788..2554d8835b48 100644
--- a/fs/sysfs/group.c
+++ b/fs/sysfs/group.c
@@ -99,7 +99,7 @@ static int internal_create_group(struct kobject *kobj, int update,
 		return -EINVAL;
 	if (!grp->attrs && !grp->bin_attrs) {
 		WARN(1, "sysfs: (bin_)attrs not set by subsystem for group: %s/%s\n",
-			kobj->name, grp->name ? "" : grp->name);
+			kobj->name, grp->name ?: "");
 		return -EINVAL;
 	}
 	if (grp->name) {
-- 
1.9.1
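
The inverted conditional is easy to reproduce in a few lines of userspace
C.  This is only a sketch: group_name_buggy(), group_name_fixed() and
format_warning() are hypothetical helpers written for the illustration, and
the fixed form is spelled in standard C as name ? name : "" because the
two-operand ?: used in the patch is a GNU C extension.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Pre-patch logic: yields "" when a name exists, NULL when it does not. */
static const char *group_name_buggy(const char *name)
{
	return name ? "" : name;
}

/* Post-patch logic: the name if present, otherwise an empty string. */
static const char *group_name_fixed(const char *name)
{
	return name ? name : "";
}

/* Build the warning text with the fixed selection logic. */
static int format_warning(char *out, size_t out_len,
			  const char *kobj_name, const char *grp_name)
{
	assert(out != NULL && kobj_name != NULL);

	return snprintf(out, out_len,
			"sysfs: (bin_)attrs not set by subsystem for group: %s/%s\n",
			kobj_name, group_name_fixed(grp_name));
}
```

With the buggy form a named group never has its name printed, and an
unnamed group hands a NULL pointer to the %s conversion.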


[RESEND PATCH v2 2/2] tools lib traceevent: Add support for __print_array()

2015-01-15 Thread Javi Merino

Trace can now generate traces with variable element size arrays.  Add
support to parse them.

Cc: Namhyung Kim 
Cc: Arnaldo Carvalho de Melo 
Cc: Steven Rostedt 
Cc: Jiri Olsa 
Signed-off-by: Javi Merino 
---
 tools/lib/traceevent/event-parse.c | 127 +
 tools/lib/traceevent/event-parse.h |   8 +++
 2 files changed, 135 insertions(+)

diff --git a/tools/lib/traceevent/event-parse.c b/tools/lib/traceevent/event-parse.c
index cf3a44bf1ec3..00dd6213449c 100644
--- a/tools/lib/traceevent/event-parse.c
+++ b/tools/lib/traceevent/event-parse.c
@@ -757,6 +757,11 @@ static void free_arg(struct print_arg *arg)
free_arg(arg->hex.field);
free_arg(arg->hex.size);
break;
+   case PRINT_INT_ARRAY:
+   free_arg(arg->int_array.field);
+   free_arg(arg->int_array.size);
+   free_arg(arg->int_array.el_size);
+   break;
case PRINT_TYPE:
free(arg->typecast.type);
free_arg(arg->typecast.item);
@@ -2533,6 +2538,71 @@ process_hex(struct event_format *event, struct print_arg *arg, char **tok)
 }
 
 static enum event_type
+process_int_array(struct event_format *event, struct print_arg *arg, char **tok)
+{
+   struct print_arg *field;
+   enum event_type type;
+   char *token;
+
+   memset(arg, 0, sizeof(*arg));
+   arg->type = PRINT_INT_ARRAY;
+
+   field = alloc_arg();
+   if (!field) {
+   do_warning_event(event, "%s: not enough memory!", __func__);
+   goto out;
+   }
+
+   type = process_arg(event, field, &token);
+
+   if (test_type_token(type, token, EVENT_DELIM, ","))
+   goto out_free;
+
+   arg->int_array.field = field;
+
+   free_token(token);
+
+   field = alloc_arg();
+   if (!field) {
+   do_warning_event(event, "%s: not enough memory!", __func__);
+   goto out;
+   }
+
+   type = process_arg(event, field, &token);
+
+   if (test_type_token(type, token, EVENT_DELIM, ","))
+   goto out_free;
+
+   arg->int_array.size = field;
+
+   free_token(token);
+
+   field = alloc_arg();
+   if (!field) {
+   do_warning_event(event, "%s: not enough memory!", __func__);
+   goto out;
+   }
+
+   type = process_arg(event, field, &token);
+
+   if (test_type_token(type, token, EVENT_DELIM, ")"))
+   goto out_free;
+
+   arg->int_array.el_size = field;
+
+   free_token(token);
+   type = read_token_item(tok);
+   return type;
+
+ out_free:
+   free_arg(field);
+   free_token(token);
+out:
+   *tok = NULL;
+   return EVENT_ERROR;
+}
+
+static enum event_type
 process_dynamic_array(struct event_format *event, struct print_arg *arg, char **tok)
 {
struct format_field *field;
@@ -2827,6 +2897,10 @@ process_function(struct event_format *event, struct print_arg *arg,
free_token(token);
return process_hex(event, arg, tok);
}
+   if (strcmp(token, "__print_array") == 0) {
+   free_token(token);
+   return process_int_array(event, arg, tok);
+   }
if (strcmp(token, "__get_str") == 0) {
free_token(token);
return process_str(event, arg, tok);
@@ -3355,6 +3429,7 @@ eval_num_arg(void *data, int size, struct event_format *event, struct print_arg
break;
case PRINT_FLAGS:
case PRINT_SYMBOL:
+   case PRINT_INT_ARRAY:
case PRINT_HEX:
break;
case PRINT_TYPE:
@@ -3765,6 +3840,49 @@ static void print_str_arg(struct trace_seq *s, void *data, int size,
}
break;
 
+   case PRINT_INT_ARRAY: {
+   void *num;
+   int el_size;
+
+   if (arg->int_array.field->type == PRINT_DYNAMIC_ARRAY) {
+   unsigned long offset;
+
+   offset = pevent_read_number(pevent,
+   data + arg->int_array.field->dynarray.field->offset,
+   arg->int_array.field->dynarray.field->size);
+   num = data + (offset & 0xffff);
+   } else {
+   field = arg->int_array.field->field.field;
+   if (!field) {
+   str = arg->int_array.field->field.name;
+   field = pevent_find_any_field(event, str);
+   if (!field)
+   goto out_warning_field;
+   arg->int_array.field->field.field = field;
+   }
+   num = d

[RESEND PATCH v2 1/2] tracing: Add array printing helpers

2015-01-15 Thread Javi Merino

From: Dave Martin 

If a trace event contains an array, there is currently no standard
way to format this for text output.  Drivers are currently hacking
around this by a) local hacks that use the trace_seq functionality
directly, or b) just not printing that information.  For fixed size
arrays, formatting of the elements can be open-coded, but this gets
cumbersome for arrays of non-trivial size.

These approaches result in non-standard content of the event format
description delivered to userspace, so userland tools need to be
taught to understand and parse each array printing method
individually.

This patch implements common __print__array() helpers that
tracepoint implementations can use instead of reinventing them.  A
simple C-style syntax is used to delimit the array and its elements
{like,this}.

So that the helpers can be used with large static arrays as well as
dynamic arrays, they take a pointer and element count: they can be
used with __get_dynamic_array() for use with dynamic arrays.

Cc: Steven Rostedt 
Cc: Ingo Molnar 
Signed-off-by: Dave Martin 
Signed-off-by: Javi Merino 
---
Changes since v1[0]

- Replaced the DEFINE_PRINT_ARRAY macros with a single
  ftrace_print_array_seq() function.

[0] http://thread.gmane.org/gmane.linux.kernel/1845418/focus=54110

 include/linux/ftrace_event.h |  4 
 include/trace/ftrace.h   |  5 +
 kernel/trace/trace_output.c  | 42 ++
 3 files changed, 51 insertions(+)

diff --git a/include/linux/ftrace_event.h b/include/linux/ftrace_event.h
index 0bebb5c348b8..5aa4a9269547 100644
--- a/include/linux/ftrace_event.h
+++ b/include/linux/ftrace_event.h
@@ -44,6 +44,10 @@ const char *ftrace_print_bitmask_seq(struct trace_seq *p, void *bitmask_ptr,
 const char *ftrace_print_hex_seq(struct trace_seq *p,
 const unsigned char *buf, int len);
 
+const char *ftrace_print_array_seq(struct trace_seq *p,
+  const void *buf, int buf_len,
+  size_t el_size);
+
 struct trace_iterator;
 struct trace_event;
 
diff --git a/include/trace/ftrace.h b/include/trace/ftrace.h
index 139b5067345b..da911289a8dd 100644
--- a/include/trace/ftrace.h
+++ b/include/trace/ftrace.h
@@ -263,6 +263,10 @@
 #undef __print_hex
 #define __print_hex(buf, buf_len) ftrace_print_hex_seq(p, buf, buf_len)
 
+#undef __print_array
+#define __print_array(array, count, el_size)   \
+   ftrace_print_array_seq(p, array, count, el_size)
+
 #undef DECLARE_EVENT_CLASS
 #define DECLARE_EVENT_CLASS(call, proto, args, tstruct, assign, print) \
 static notrace enum print_line_t   \
@@ -674,6 +678,7 @@ static inline void ftrace_test_probe_##call(void) \
 #undef __get_dynamic_array_len
 #undef __get_str
 #undef __get_bitmask
+#undef __print_array
 
 #undef TP_printk
 #define TP_printk(fmt, args...) "\"" fmt "\", "  __stringify(args)
diff --git a/kernel/trace/trace_output.c b/kernel/trace/trace_output.c
index b77b9a697619..6cee7c36a669 100644
--- a/kernel/trace/trace_output.c
+++ b/kernel/trace/trace_output.c
@@ -177,6 +177,48 @@ ftrace_print_hex_seq(struct trace_seq *p, const unsigned char *buf, int buf_len)
 }
 EXPORT_SYMBOL(ftrace_print_hex_seq);
 
+const char *
+ftrace_print_array_seq(struct trace_seq *p, const void *buf, int buf_len,
+  size_t el_size)
+{
+   const char *ret = trace_seq_buffer_ptr(p);
+   const char *prefix = "";
+   void *ptr = (void *)buf;
+
+   trace_seq_putc(p, '{');
+
+   while (ptr < buf + buf_len) {
+   switch (el_size) {
+   case 8:
+   trace_seq_printf(p, "%s0x%x", prefix,
+*(u8 *)ptr);
+   break;
+   case 16:
+   trace_seq_printf(p, "%s0x%x", prefix,
+*(u16 *)ptr);
+   break;
+   case 32:
+   trace_seq_printf(p, "%s0x%x", prefix,
+*(u32 *)ptr);
+   break;
+   case 64:
+   trace_seq_printf(p, "%s0x%llx", prefix,
+*(u64 *)ptr);
+   break;
+   default:
+   BUG();
+   }
+   prefix = ",";
+   ptr += el_size / 8;
+   }
+
+   trace_seq_putc(p, '}');
+   trace_seq_putc(p, 0);
+
+   return ret;
+}
+EXPORT_SYMBOL(ftrace_print_array_seq);
+
 int ftrace_raw_output_prep(struct trace_iterator *iter,
   struct trace_event *trace_event)
 {
-- 
1.9.1


Re: [RESEND PATCH v2 1/2] tracing: Add array printing helpers

2015-01-16 Thread Javi Merino

On Fri, Jan 16, 2015 at 02:22:02AM +, Steven Rostedt wrote:
> On Thu, 15 Jan 2015 16:50:58 +
> Javi Merino  wrote:
>  
> > +const char *
> > +ftrace_print_array_seq(struct trace_seq *p, const void *buf, int
> > buf_len,
> > +  size_t el_size)
> > +{
> > +   const char *ret = trace_seq_buffer_ptr(p);
> > +   const char *prefix = "";
> > +   void *ptr = (void *)buf;
> > +
> > +   trace_seq_putc(p, '{');
> > +
> > +   while (ptr < buf + buf_len) {
> > +   switch (el_size) {
> > +   case 8:
> > +   trace_seq_printf(p, "%s0x%x", prefix,
> > +*(u8 *)ptr);
> > +   break;
> > +   case 16:
> > +   trace_seq_printf(p, "%s0x%x", prefix,
> > +*(u16 *)ptr);
> > +   break;
> > +   case 32:
> > +   trace_seq_printf(p, "%s0x%x", prefix,
> > +*(u32 *)ptr);
> > +   break;
> > +   case 64:
> > +   trace_seq_printf(p, "%s0x%llx", prefix,
> > +*(u64 *)ptr);
> > +   break;
> > +   default:
> > +   BUG();
> 
> BUG() is a bit extreme don't you think? I'm not sure it even deserves a
> WARN_ON().

Ok, I used BUG() because that's what you suggested:

http://article.gmane.org/gmane.linux.kernel/1846749

The only way I could think of turning it into a BUILD_BUG was by
moving it to the __print_array macro, but I think it's ugly.

> I would suggest doing:
> 
>   trace_seq_printf(p, "BAD SIZE:%d 0x%x", el_size,
>   *(u8 *)ptr);
>   el_size = 8;
> 
> No need to go crashing the kernel or even messing with dmesg over
> somebody's tracepoint mistake.

Ok, I'll change it to that.

> The rest looks fine.
> 
> > +   }
> > +   prefix = ",";
> > +   ptr += el_size / 8;
> > +   }
> > +
> > +   trace_seq_putc(p, '}');
> > +   trace_seq_putc(p, 0);
> 
> I need to add a trace_seq_terminate() for this.

That would make it more readable.  Cheers,
Javi
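
The build-time check discussed above (rejecting a bad element size in the
macro rather than at run time) can be sketched in plain C11.  This is an
illustration under assumptions, not the kernel's BUILD_BUG_ON():
array_bytes() is a hypothetical stand-in for the runtime formatter and
ARRAY_BYTES_CHECKED() is a made-up macro that uses static_assert from
<assert.h>, so el_size has to be an integer constant expression.

```c
#include <assert.h>	/* C11: provides the static_assert macro */
#include <stddef.h>

/* Hypothetical runtime part: bytes covered by count elements. */
static size_t array_bytes(size_t count, unsigned int el_bits)
{
	return count * (el_bits / 8);
}

/*
 * Refuse to compile unless el_size is 8, 16, 32 or 64, then call the
 * runtime function and store its return value in result.
 */
#define ARRAY_BYTES_CHECKED(result, count, el_size)			\
	do {								\
		static_assert((el_size) == 8 || (el_size) == 16 ||	\
			      (el_size) == 32 || (el_size) == 64,	\
			      "element size must be 8, 16, 32 or 64");	\
		(result) = array_bytes((count), (el_size));		\
	} while (0)
```

A call such as ARRAY_BYTES_CHECKED(n, 4, 12) fails at compile time, so a
tracepoint author's mistake never reaches the runtime fallback path.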

Re: [RESEND PATCH v2 2/2] tools lib traceevent: Add support for __print_array()

2015-01-16 Thread Javi Merino

On Fri, Jan 16, 2015 at 02:35:19AM +, Steven Rostedt wrote:
> On Thu, 15 Jan 2015 12:05:52 -0500
> Javi Merino  wrote:
> 
> > Trace can now generate traces with variable element size arrays.  Add
> > support to parse them.
> > 
> > Cc: Namhyung Kim 
> > Cc: Arnaldo Carvalho de Melo 
> > Cc: Steven Rostedt 
> > Cc: Jiri Olsa 
> > Signed-off-by: Javi Merino 
> > ---
> >  tools/lib/traceevent/event-parse.c | 127
> > +
> > tools/lib/traceevent/event-parse.h |   8 +++ 2 files changed, 135
> > insertions(+)
> > 
> > diff --git a/tools/lib/traceevent/event-parse.c
> > b/tools/lib/traceevent/event-parse.c index cf3a44bf1ec3..00dd6213449c
> > 100644 --- a/tools/lib/traceevent/event-parse.c
> > +++ b/tools/lib/traceevent/event-parse.c
> > @@ -757,6 +757,11 @@ static void free_arg(struct print_arg *arg)
> > free_arg(arg->hex.field);
> > free_arg(arg->hex.size);
> > break;
> > +   case PRINT_INT_ARRAY:
> > +   free_arg(arg->int_array.field);
> > +   free_arg(arg->int_array.size);
> > +   free_arg(arg->int_array.el_size);
> > +   break;
> > case PRINT_TYPE:
> > free(arg->typecast.type);
> > free_arg(arg->typecast.item);
> > @@ -2533,6 +2538,71 @@ process_hex(struct event_format *event, struct
> > print_arg *arg, char **tok) }
> >  
> >  static enum event_type
> > +process_int_array(struct event_format *event, struct print_arg *arg,
> > char **tok) +{
> > +   struct print_arg *field;
> > +   enum event_type type;
> > +   char *token;
> > +
> > +   memset(arg, 0, sizeof(*arg));
> > +   arg->type = PRINT_INT_ARRAY;
> > +
> > +   field = alloc_arg();
> > +   if (!field) {
> > +   do_warning_event(event, "%s: not enough memory!",
> > __func__);
> > +   goto out;
> > +   }
> > +
> > +   type = process_arg(event, field, &token);
> > +
> > +   if (test_type_token(type, token, EVENT_DELIM, ","))
> > +   goto out_free;
> > +
> > +   arg->int_array.field = field;
> > +
> > +   free_token(token);
> > +
> > +   field = alloc_arg();
> > +   if (!field) {
> > +   do_warning_event(event, "%s: not enough memory!",
> > __func__);
> > +   goto out;
> > +   }
> > +
> > +   type = process_arg(event, field, &token);
> > +
> > +   if (test_type_token(type, token, EVENT_DELIM, ","))
> > +   goto out_free;
> > +
> > +   arg->int_array.size = field;
> > +
> > +   free_token(token);
> > +
> > +   field = alloc_arg();
> > +   if (!field) {
> > +   do_warning_event(event, "%s: not enough memory!",
> > __func__);
> > +   goto out;
> > +   }
> 
> Hmm, perhaps we should make a helper function to allocate the field and
> show the warning for the event instead of duplicating the code three
> times.

Ok, I'll also use it for the two similar allocations in
process_hex().

> > +
> > +   type = process_arg(event, field, &token);
> > +
> > +   if (test_type_token(type, token, EVENT_DELIM, ")"))
> > +   goto out_free;
> > +
> > +   arg->int_array.el_size = field;
> > +
> > +   free_token(token);
> > +   type = read_token_item(tok);
> > +   return type;
> > +
> > + out_free:
> > +   free_arg(field);
> > +   free_token(token);
> > +out:
> > +   *tok = NULL;
> > +   return EVENT_ERROR;
> > +}
> > +
> > +static enum event_type
> >  process_dynamic_array(struct event_format *event, struct print_arg
> > *arg, char **tok) {
> > struct format_field *field;
> > @@ -2827,6 +2897,10 @@ process_function(struct event_format *event,
> > struct print_arg *arg, free_token(token);
> > return process_hex(event, arg, tok);
> > }
> > +   if (strcmp(token, "__print_array") == 0) {
> > +   free_token(token);
> > +   return process_int_array(event, arg, tok);
> > +   }
> > if (strcmp(token, "__get_str") == 0) {
> > free_token(token);
> > return process_str(event, arg, tok);
> > @@ -3355,6 +3429,7 @@ eval_num_arg(void *data, int size, struct
> > event_format *event, struct print_arg break;
> > case PRINT_FLAGS:
> > case

[PATCH v3 1/3] tracing: Add array printing helpers

2015-01-19 Thread Javi Merino

From: Dave Martin 

If a trace event contains an array, there is currently no standard
way to format this for text output.  Drivers are currently hacking
around this by a) local hacks that use the trace_seq functionality
directly, or b) just not printing that information.  For fixed size
arrays, formatting of the elements can be open-coded, but this gets
cumbersome for arrays of non-trivial size.

These approaches result in non-standard content of the event format
description delivered to userspace, so userland tools need to be
taught to understand and parse each array printing method
individually.

This patch implements common __print__array() helpers that
tracepoint implementations can use instead of reinventing them.  A
simple C-style syntax is used to delimit the array and its elements
{like,this}.

So that the helpers can be used with large static arrays as well as
dynamic arrays, they take a pointer and element count: they can be
used with __get_dynamic_array() for use with dynamic arrays.

Cc: Steven Rostedt 
Cc: Ingo Molnar 
Signed-off-by: Dave Martin 
Signed-off-by: Javi Merino 
---

Changes since v2[0]:
  - Changed BUG() into a trace_seq_printf()
  - Add BUILD_BUG_ON() to chase mistakes in element sizes on build

[0] http://article.gmane.org/gmane.linux.kernel/1867165

 include/linux/ftrace_event.h |  4 
 include/trace/ftrace.h   |  9 +
 kernel/trace/trace_output.c  | 44 
 3 files changed, 57 insertions(+)

diff --git a/include/linux/ftrace_event.h b/include/linux/ftrace_event.h
index 0bebb5c348b8..5aa4a9269547 100644
--- a/include/linux/ftrace_event.h
+++ b/include/linux/ftrace_event.h
@@ -44,6 +44,10 @@ const char *ftrace_print_bitmask_seq(struct trace_seq *p, void *bitmask_ptr,
 const char *ftrace_print_hex_seq(struct trace_seq *p,
 const unsigned char *buf, int len);
 
+const char *ftrace_print_array_seq(struct trace_seq *p,
+  const void *buf, int buf_len,
+  size_t el_size);
+
 struct trace_iterator;
 struct trace_event;
 
diff --git a/include/trace/ftrace.h b/include/trace/ftrace.h
index 139b5067345b..36afd0ed3458 100644
--- a/include/trace/ftrace.h
+++ b/include/trace/ftrace.h
@@ -263,6 +263,14 @@
 #undef __print_hex
 #define __print_hex(buf, buf_len) ftrace_print_hex_seq(p, buf, buf_len)
 
+#undef __print_array
+#define __print_array(array, count, el_size)   \
+   ({  \
+   BUILD_BUG_ON(el_size != 8 && el_size != 16 &&   \
+   el_size != 32 && el_size != 64);\
+   ftrace_print_array_seq(p, array, count, el_size);   \
+   })
+
 #undef DECLARE_EVENT_CLASS
 #define DECLARE_EVENT_CLASS(call, proto, args, tstruct, assign, print) \
 static notrace enum print_line_t   \
@@ -674,6 +682,7 @@ static inline void ftrace_test_probe_##call(void) \
 #undef __get_dynamic_array_len
 #undef __get_str
 #undef __get_bitmask
+#undef __print_array
 
 #undef TP_printk
 #define TP_printk(fmt, args...) "\"" fmt "\", "  __stringify(args)
diff --git a/kernel/trace/trace_output.c b/kernel/trace/trace_output.c
index b77b9a697619..8955e1da83ce 100644
--- a/kernel/trace/trace_output.c
+++ b/kernel/trace/trace_output.c
@@ -177,6 +177,50 @@ ftrace_print_hex_seq(struct trace_seq *p, const unsigned char *buf, int buf_len)
 }
 EXPORT_SYMBOL(ftrace_print_hex_seq);
 
+const char *
+ftrace_print_array_seq(struct trace_seq *p, const void *buf, int buf_len,
+  size_t el_size)
+{
+   const char *ret = trace_seq_buffer_ptr(p);
+   const char *prefix = "";
+   void *ptr = (void *)buf;
+
+   trace_seq_putc(p, '{');
+
+   while (ptr < buf + buf_len) {
+   switch (el_size) {
+   case 8:
+   trace_seq_printf(p, "%s0x%x", prefix,
+*(u8 *)ptr);
+   break;
+   case 16:
+   trace_seq_printf(p, "%s0x%x", prefix,
+*(u16 *)ptr);
+   break;
+   case 32:
+   trace_seq_printf(p, "%s0x%x", prefix,
+*(u32 *)ptr);
+   break;
+   case 64:
+   trace_seq_printf(p, "%s0x%llx", prefix,
+*(u64 *)ptr);
+   break;
+   default:
+   trace_seq_printf(p, "BAD SIZE:%lu 0x%x", el_size,
+   *(u8 *)ptr);
+   el_size = 8;
+   }
+   prefix = &quo

[PATCH v3 2/3] tools lib traceevent: factor out allocating and processing args

2015-01-19 Thread Javi Merino

process_hex() repeats the same sequence twice: allocate the print_arg
field, call process_arg() and verify the next event delimiter.
process_int_array() will need the same sequence, so factor it out into
a function to avoid writing the same code again and again.

Cc: Namhyung Kim 
Cc: Arnaldo Carvalho de Melo 
Cc: Steven Rostedt 
Cc: Jiri Olsa 
Signed-off-by: Javi Merino 
---

This patch wasn't part of v2.  It avoids repeating code in patch 3.

 tools/lib/traceevent/event-parse.c | 77 --
 1 file changed, 40 insertions(+), 37 deletions(-)

diff --git a/tools/lib/traceevent/event-parse.c b/tools/lib/traceevent/event-parse.c
index cf3a44bf1ec3..9d063b829907 100644
--- a/tools/lib/traceevent/event-parse.c
+++ b/tools/lib/traceevent/event-parse.c
@@ -2013,6 +2013,38 @@ process_entry(struct event_format *event __maybe_unused, struct print_arg *arg,
return EVENT_ERROR;
 }
 
+static int alloc_and_process_arg(struct event_format *event, char *next_token,
+   struct print_arg **print_arg)
+{
+   struct print_arg *field;
+   enum event_type type;
+   char *token;
+   int ret = 0;
+
+   field = alloc_arg();
+   if (!field) {
+   do_warning_event(event, "%s: not enough memory!", __func__);
+   errno = ENOMEM;
+   return -1;
+   }
+
+   type = process_arg(event, field, &token);
+
+   if (test_type_token(type, token, EVENT_DELIM, next_token)) {
+   errno = EINVAL;
+   ret = -1;
+   free_arg(field);
+   goto out_free_token;
+   }
+
+   *print_arg = field;
+
+out_free_token:
+   free_token(token);
+
+   return ret;
+}
+
 static char *arg_eval (struct print_arg *arg);
 
 static unsigned long long
@@ -2485,49 +2517,20 @@ out_free:
 static enum event_type
 process_hex(struct event_format *event, struct print_arg *arg, char **tok)
 {
-   struct print_arg *field;
-   enum event_type type;
-   char *token = NULL;
-
memset(arg, 0, sizeof(*arg));
arg->type = PRINT_HEX;
 
-   field = alloc_arg();
-   if (!field) {
-   do_warning_event(event, "%s: not enough memory!", __func__);
-   goto out_free;
-   }
-
-   type = process_arg(event, field, &token);
-
-   if (test_type_token(type, token, EVENT_DELIM, ","))
-   goto out_free;
-
-   arg->hex.field = field;
-
-   free_token(token);
-
-   field = alloc_arg();
-   if (!field) {
-   do_warning_event(event, "%s: not enough memory!", __func__);
-   *tok = NULL;
-   return EVENT_ERROR;
-   }
-
-   type = process_arg(event, field, &token);
-
-   if (test_type_token(type, token, EVENT_DELIM, ")"))
-   goto out_free;
+   if (alloc_and_process_arg(event, ",", &arg->hex.field))
+   goto out;
 
-   arg->hex.size = field;
+   if (alloc_and_process_arg(event, ")", &arg->hex.size))
+   goto free_field;
 
-   free_token(token);
-   type = read_token_item(tok);
-   return type;
+   return read_token_item(tok);
 
- out_free:
-   free_arg(field);
-   free_token(token);
+free_field:
+   free_arg(arg->hex.field);
+out:
*tok = NULL;
return EVENT_ERROR;
 }
-- 
1.9.1
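
The refactoring above relies on a common C idiom: each step that acquires
something gets its own cleanup label, so a failure only unwinds what was
already acquired.  A minimal, self-contained sketch of that idiom follows.
The names struct triple, alloc_part(), build_triple() and the fail_at test
hook are hypothetical stand-ins for illustration and are not part of
libtraceevent.

```c
#include <stdlib.h>

/* Hypothetical container with three separately allocated parts. */
struct triple {
	int *field;
	int *size;
	int *el_size;
};

/* Test hook: make the Nth allocation fail (0 = never fail). */
int fail_at;
int alloc_count;

/* Allocate one part; returns 0 on success, -1 on failure. */
int alloc_part(int **out)
{
	int *p;

	if (++alloc_count == fail_at)
		return -1;

	p = calloc(1, sizeof(*p));
	if (!p)
		return -1;

	*out = p;
	return 0;
}

/* Build all three parts; on failure free what was built, newest first. */
int build_triple(struct triple *t)
{
	t->field = t->size = t->el_size = NULL;

	if (alloc_part(&t->field))
		goto out;

	if (alloc_part(&t->size))
		goto free_field;

	if (alloc_part(&t->el_size))
		goto free_size;

	return 0;

free_size:
	free(t->size);
	t->size = NULL;
free_field:
	free(t->field);
	t->field = NULL;
out:
	return -1;
}
```

The labels are ordered so that jumping to a later failure point falls
through every earlier cleanup step, which is the same shape process_hex()
has after this patch.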


[PATCH v3 3/3] tools lib traceevent: Add support for __print_array()

2015-01-19 Thread Javi Merino

Trace can now generate traces with variable element size arrays.  Add
support to parse them.

Cc: Namhyung Kim 
Cc: Arnaldo Carvalho de Melo 
Cc: Steven Rostedt 
Cc: Jiri Olsa 
Signed-off-by: Javi Merino 
---

Changes since v2[0]:
  - Avoid repeating the alloc and process of fields in process_int_array()
  - print a warning if the size of the array is not valid

[0] http://thread.gmane.org/gmane.linux.kernel/1867165/focus=1867166

 tools/lib/traceevent/event-parse.c | 91 ++
 tools/lib/traceevent/event-parse.h |  8 
 2 files changed, 99 insertions(+)

diff --git a/tools/lib/traceevent/event-parse.c b/tools/lib/traceevent/event-parse.c
index 9d063b829907..8626c57cd769 100644
--- a/tools/lib/traceevent/event-parse.c
+++ b/tools/lib/traceevent/event-parse.c
@@ -757,6 +757,11 @@ static void free_arg(struct print_arg *arg)
free_arg(arg->hex.field);
free_arg(arg->hex.size);
break;
+   case PRINT_INT_ARRAY:
+   free_arg(arg->int_array.field);
+   free_arg(arg->int_array.size);
+   free_arg(arg->int_array.el_size);
+   break;
case PRINT_TYPE:
free(arg->typecast.type);
free_arg(arg->typecast.item);
@@ -2536,6 +2541,32 @@ out:
 }
 
 static enum event_type
+process_int_array(struct event_format *event, struct print_arg *arg, char **tok)
+{
+   memset(arg, 0, sizeof(*arg));
+   arg->type = PRINT_INT_ARRAY;
+
+   if (alloc_and_process_arg(event, ",", &arg->int_array.field))
+   goto out;
+
+   if (alloc_and_process_arg(event, ",", &arg->int_array.size))
+   goto free_field;
+
+   if (alloc_and_process_arg(event, ")", &arg->int_array.el_size))
+   goto free_size;
+
+   return read_token_item(tok);
+
+free_size:
+   free_arg(arg->int_array.size);
+free_field:
+   free_arg(arg->int_array.field);
+out:
+   *tok = NULL;
+   return EVENT_ERROR;
+}
+
+static enum event_type
 process_dynamic_array(struct event_format *event, struct print_arg *arg, char **tok)
 {
struct format_field *field;
@@ -2830,6 +2861,10 @@ process_function(struct event_format *event, struct print_arg *arg,
free_token(token);
return process_hex(event, arg, tok);
}
+   if (strcmp(token, "__print_array") == 0) {
+   free_token(token);
+   return process_int_array(event, arg, tok);
+   }
if (strcmp(token, "__get_str") == 0) {
free_token(token);
return process_str(event, arg, tok);
@@ -3358,6 +3393,7 @@ eval_num_arg(void *data, int size, struct event_format *event, struct print_arg
break;
case PRINT_FLAGS:
case PRINT_SYMBOL:
+   case PRINT_INT_ARRAY:
case PRINT_HEX:
break;
case PRINT_TYPE:
@@ -3768,6 +3804,52 @@ static void print_str_arg(struct trace_seq *s, void *data, int size,
}
break;
 
+   case PRINT_INT_ARRAY: {
+   void *num;
+   int el_size;
+
+   if (arg->int_array.field->type == PRINT_DYNAMIC_ARRAY) {
+   unsigned long offset;
+   offset = pevent_read_number(pevent,
+   data + arg->int_array.field->dynarray.field->offset,
+   arg->int_array.field->dynarray.field->size);
+   num = data + (offset & 0xffff);
+   } else {
+   field = arg->int_array.field->field.field;
+   if (!field) {
+   str = arg->int_array.field->field.name;
+   field = pevent_find_any_field(event, str);
+   if (!field)
+   goto out_warning_field;
+   arg->int_array.field->field.field = field;
+   }
+   num = data + field->offset;
+   }
+   len = eval_num_arg(data, size, event, arg->int_array.size);
+   el_size = eval_num_arg(data, size, event, arg->int_array.el_size);
+   el_size /= 8;
+   for (i = 0; i < len; i++) {
+   if (i)
+   trace_seq_putc(s, ' ');
+
+   if (el_size == 1) {
+   trace_seq_printf(s, "%u", *(uint8_t *)num);
+   } else if (el_size == 2) {
+   trace_seq_printf(s, "%u", *(uint16_t *)num);
+   } else if (el_size == 4) {
+   trace_seq_printf

Re: [PATCH v4 1/3] tracing: Add array printing helpers

2015-01-28 Thread Javi Merino

On Wed, Jan 28, 2015 at 03:35:57AM +, Steven Rostedt wrote:
> On Mon, 26 Jan 2015 12:11:49 +
> Javi Merino  wrote:
> 
> > From: Dave Martin 
> > 
> > If a trace event contains an array, there is currently no standard
> > way to format this for text output.  Drivers are currently hacking
> > around this by a) local hacks that use the trace_seq functionailty
> > directly, or b) just not printing that information.  For fixed size
> > arrays, formatting of the elements can be open-coded, but this gets
> > cumbersome for arrays of non-trivial size.
> > 
> > These approaches result in non-standard content of the event format
> > description delivered to userspace, so userland tools needs to be
> > taught to understand and parse each array printing method
> > individually.
> > 
> > This patch implements common __print__array() helpers that
> > tracepoint implementations can use instead of reinventing them.  A
> > simple C-style syntax is used to delimit the array and its elements
> > {like,this}.
> > 
> > So that the helpers can be used with large static arrays as well as
> > dynamic arrays, they take a pointer and element count: they can be
> > used with __get_dynamic_array() for use with dynamic arrays.
> > 
> > Cc: Steven Rostedt 
> > Cc: Ingo Molnar 
> > Signed-off-by: Dave Martin 
> > Signed-off-by: Javi Merino 
> > ---
> >  include/linux/ftrace_event.h |  4 
> >  include/trace/ftrace.h   |  9 +
> >  kernel/trace/trace_output.c  | 44 
> > 
> >  3 files changed, 57 insertions(+)
> > 
> > diff --git a/include/linux/ftrace_event.h b/include/linux/ftrace_event.h
> > index 0bebb5c348b8..5aa4a9269547 100644
> > --- a/include/linux/ftrace_event.h
> > +++ b/include/linux/ftrace_event.h
> > @@ -44,6 +44,10 @@ const char *ftrace_print_bitmask_seq(struct trace_seq 
> > *p, void *bitmask_ptr,
> >  const char *ftrace_print_hex_seq(struct trace_seq *p,
> >  const unsigned char *buf, int len);
> >  
> > +const char *ftrace_print_array_seq(struct trace_seq *p,
> > +  const void *buf, int buf_len,
> > +  size_t el_size);
> > +
> >  struct trace_iterator;
> >  struct trace_event;
> >  
> > diff --git a/include/trace/ftrace.h b/include/trace/ftrace.h
> > index 139b5067345b..36afd0ed3458 100644
> > --- a/include/trace/ftrace.h
> > +++ b/include/trace/ftrace.h
> > @@ -263,6 +263,14 @@
> >  #undef __print_hex
> >  #define __print_hex(buf, buf_len) ftrace_print_hex_seq(p, buf, buf_len)
> >  
> > +#undef __print_array
> > +#define __print_array(array, count, el_size)   
> > \
> > +   ({  \
> > +   BUILD_BUG_ON(el_size != 8 && el_size != 16 &&   \
> > +   el_size != 32 && el_size != 64);\
> 
> I tried testing this patch by writing a print_array myself, and I kept
> hitting this BUILD_BUG_ON, and was wondering WTF? Then it dawned on me.
> 
> el_size should not be based on bits, it should be based on bytes. I
> passed in "sizeof()" which doesn't work with bits.
> 
> Please update to "el_size < 1 || el_size > 8".
> 
> and adjust the rest accordingly.

Done, I'm testing it now.  I'll send a v5 later today.

Cheers,
Javi
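
For readers following the bits-versus-bytes point above, here is a minimal
userspace sketch of an array formatter that takes the element size in
bytes, so a caller can pass sizeof() directly.  format_array() is a
hypothetical helper written for illustration; it is not the kernel's
ftrace_print_array_seq() and not the code that went into v5.  It accepts
only 1, 2, 4 and 8 byte elements, which is stricter than the
"el_size < 1 || el_size > 8" check suggested in the review, and it emits
the {0x1,0x2} style described in the commit message.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: format `count` elements of `el_size` bytes each
 * as "{0x..,0x..}".  Returns 0 on success, -1 if el_size is not 1, 2,
 * 4 or 8 or if the output buffer is too small.
 */
int format_array(char *out, size_t out_len, const void *buf,
		 size_t count, size_t el_size)
{
	const unsigned char *ptr = buf;
	const char *prefix = "";
	size_t used;
	size_t i;
	int n;

	if (el_size != 1 && el_size != 2 && el_size != 4 && el_size != 8)
		return -1;

	n = snprintf(out, out_len, "{");
	if (n < 0 || (size_t)n >= out_len)
		return -1;
	used = n;

	for (i = 0; i < count; i++, ptr += el_size) {
		uint8_t v8;
		uint16_t v16;
		uint32_t v32;
		uint64_t val;

		/* memcpy into a typed variable avoids unaligned loads. */
		switch (el_size) {
		case 1:
			memcpy(&v8, ptr, sizeof(v8));
			val = v8;
			break;
		case 2:
			memcpy(&v16, ptr, sizeof(v16));
			val = v16;
			break;
		case 4:
			memcpy(&v32, ptr, sizeof(v32));
			val = v32;
			break;
		default:
			memcpy(&val, ptr, sizeof(val));
			break;
		}

		n = snprintf(out + used, out_len - used, "%s0x%llx",
			     prefix, (unsigned long long)val);
		if (n < 0 || (size_t)n >= out_len - used)
			return -1;
		used += n;
		prefix = ",";
	}

	n = snprintf(out + used, out_len - used, "}");
	if (n < 0 || (size_t)n >= out_len - used)
		return -1;

	return 0;
}
```

With a byte-based element size a call site can be written as
format_array(out, sizeof(out), a, n, sizeof(a[0])), which is the usage
that tripped the bit-based BUILD_BUG_ON.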

Re: [PATCH v4 1/3] tracing: Add array printing helpers

2015-01-28 Thread Javi Merino

On Wed, Jan 28, 2015 at 11:26:09AM +, Javi Merino wrote:
> On Wed, Jan 28, 2015 at 03:35:57AM +, Steven Rostedt wrote:
> > On Mon, 26 Jan 2015 12:11:49 +
> > Javi Merino  wrote:
> > 
> > > From: Dave Martin 
> > > 
> > > If a trace event contains an array, there is currently no standard
> > > way to format this for text output.  Drivers are currently hacking
> > > around this by a) local hacks that use the trace_seq functionailty
> > > directly, or b) just not printing that information.  For fixed size
> > > arrays, formatting of the elements can be open-coded, but this gets
> > > cumbersome for arrays of non-trivial size.
> > > 
> > > These approaches result in non-standard content of the event format
> > > description delivered to userspace, so userland tools needs to be
> > > taught to understand and parse each array printing method
> > > individually.
> > > 
> > > This patch implements common __print__array() helpers that
> > > tracepoint implementations can use instead of reinventing them.  A
> > > simple C-style syntax is used to delimit the array and its elements
> > > {like,this}.
> > > 
> > > So that the helpers can be used with large static arrays as well as
> > > dynamic arrays, they take a pointer and element count: they can be
> > > used with __get_dynamic_array() for use with dynamic arrays.
> > > 
> > > Cc: Steven Rostedt 
> > > Cc: Ingo Molnar 
> > > Signed-off-by: Dave Martin 
> > > Signed-off-by: Javi Merino 
> > > ---
> > >  include/linux/ftrace_event.h |  4 
> > >  include/trace/ftrace.h   |  9 +
> > >  kernel/trace/trace_output.c  | 44 
> > > 
> > >  3 files changed, 57 insertions(+)
> > > 
> > > diff --git a/include/linux/ftrace_event.h b/include/linux/ftrace_event.h
> > > index 0bebb5c348b8..5aa4a9269547 100644
> > > --- a/include/linux/ftrace_event.h
> > > +++ b/include/linux/ftrace_event.h
> > > @@ -44,6 +44,10 @@ const char *ftrace_print_bitmask_seq(struct trace_seq 
> > > *p, void *bitmask_ptr,
> > >  const char *ftrace_print_hex_seq(struct trace_seq *p,
> > >const unsigned char *buf, int len);
> > >  
> > > +const char *ftrace_print_array_seq(struct trace_seq *p,
> > > +const void *buf, int buf_len,
> > > +size_t el_size);
> > > +
> > >  struct trace_iterator;
> > >  struct trace_event;
> > >  
> > > diff --git a/include/trace/ftrace.h b/include/trace/ftrace.h
> > > index 139b5067345b..36afd0ed3458 100644
> > > --- a/include/trace/ftrace.h
> > > +++ b/include/trace/ftrace.h
> > > @@ -263,6 +263,14 @@
> > >  #undef __print_hex
> > >  #define __print_hex(buf, buf_len) ftrace_print_hex_seq(p, buf, buf_len)
> > >  
> > > +#undef __print_array
> > > +#define __print_array(array, count, el_size) 
> > > \
> > > + ({  \
> > > + BUILD_BUG_ON(el_size != 8 && el_size != 16 &&   \
> > > + el_size != 32 && el_size != 64);\
> > 
> > I tried testing this patch by writing a print_array myself, and I kept
> > hitting this BUILD_BUG_ON, and was wondering WTF? Then it dawned on me.
> > 
> > el_size should not be based on bits, it should be based on bytes. I
> > passed in "sizeof()" which doesn't work with bits.

Ugh.  If you use sizeof() in print_array() then trace-cmd won't be
able to parse it since it doesn't have an implementation of sizeof().

  [thermal_power_allocator:thermal_power_allocator] function sizeof not defined
  Error: expected type 5 but read 0
*** Error in `./trace-cmd': double free or corruption (fasttop): 0x00e04980 ***

Implementing a fully functional sizeof() in trace-cmd will be a huge
beast; I guess you will have to limit it to only some types.
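
As a rough illustration of what "only some types" could look like, here is
a sketch of a lookup table restricted to fixed-width type names, whose
size does not depend on the machine the trace was recorded on.  The table
contents and the lookup_sizeof() name are assumptions made for this
example; nothing like this exists in trace-cmd as far as this thread
shows.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical table of type names with a machine-independent size. */
struct type_size {
	const char *name;
	size_t size;
};

static const struct type_size known_types[] = {
	{ "u8",  1 }, { "s8",  1 },
	{ "u16", 2 }, { "s16", 2 },
	{ "u32", 4 }, { "s32", 4 },
	{ "u64", 8 }, { "s64", 8 },
};

/*
 * Look up the size of a type by name.  Returns 0 and stores the size
 * in *size for a known type, -1 for anything else (for example "long",
 * whose size depends on the traced machine).
 */
int lookup_sizeof(const char *type_name, size_t *size)
{
	size_t i;

	for (i = 0; i < sizeof(known_types) / sizeof(known_types[0]); i++) {
		if (strcmp(type_name, known_types[i].name) == 0) {
			*size = known_types[i].size;
			return 0;
		}
	}

	return -1;
}
```

Rejecting unknown names outright keeps the parser from guessing a size
for types such as int or long.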

> > Please update to "el_size < 1 || el_size > 8".
> > 
> > and adjust the rest accordingly.
> 
> Done, I'm testing it now.  I'll send a v5 later today.
> 
> Cheers,
> Javi

[PATCH v1 2/7] thermal: extend the cooling device API to include power information

2015-01-28 Thread Javi Merino

Add three optional callbacks to the cooling device interface to allow
them to express power.  In addition to the callbacks, add helpers to
identify cooling devices that implement the power cooling device API.

Cc: Zhang Rui 
Cc: Eduardo Valentin 
Signed-off-by: Javi Merino 
---
 drivers/thermal/thermal_core.c | 52 ++
 include/linux/thermal.h| 18 +++
 2 files changed, 70 insertions(+)

diff --git a/drivers/thermal/thermal_core.c b/drivers/thermal/thermal_core.c
index bf230c64e016..a01d4a72bd93 100644
--- a/drivers/thermal/thermal_core.c
+++ b/drivers/thermal/thermal_core.c
@@ -868,6 +868,58 @@ emul_temp_store(struct device *dev, struct device_attribute *attr,
 static DEVICE_ATTR(emul_temp, S_IWUSR, NULL, emul_temp_store);
 #endif/*CONFIG_THERMAL_EMULATION*/
 
+/**
+ * power_actor_get_max_power() - get the maximum power that a cdev can consume
+ * @cdev:  pointer to &thermal_cooling_device
+ * @tz:a valid thermal zone device pointer
+ * @max_power: pointer in which to store the maximum power
+ *
+ * Calculate the maximum power consumption in milliwatts that the
+ * cooling device can currently consume and store it in @max_power.
+ *
+ * Return: 0 on success, -EINVAL if @cdev doesn't support the
+ * power_actor API or -E* on other error.
+ */
+int power_actor_get_max_power(struct thermal_cooling_device *cdev,
+ struct thermal_zone_device *tz, u32 *max_power)
+{
+   if (!cdev_is_power_actor(cdev))
+   return -EINVAL;
+
+   return cdev->ops->state2power(cdev, tz, 0, max_power);
+}
+
+/**
+ * power_actor_set_power() - limit the maximum power that a cooling device can consume
+ * @cdev:  pointer to &thermal_cooling_device
+ * @instance:  thermal instance to update
+ * @power: the power in milliwatts
+ *
+ * Set the cooling device to consume at most @power milliwatts.
+ *
+ * Return: 0 on success, -EINVAL if the cooling device does not
+ * implement the power actor API or -E* for other failures.
+ */
+int power_actor_set_power(struct thermal_cooling_device *cdev,
+ struct thermal_instance *instance, u32 power)
+{
+   unsigned long state;
+   int ret;
+
+   if (!cdev_is_power_actor(cdev))
+   return -EINVAL;
+
+   ret = cdev->ops->power2state(cdev, instance->tz, power, &state);
+   if (ret)
+   return ret;
+
+   instance->target = state;
+   cdev->updated = false;
+   thermal_cdev_update(cdev);
+
+   return 0;
+}
+
 static DEVICE_ATTR(type, 0444, type_show, NULL);
 static DEVICE_ATTR(temp, 0444, temp_show, NULL);
 static DEVICE_ATTR(mode, 0644, mode_show, mode_store);
diff --git a/include/linux/thermal.h b/include/linux/thermal.h
index 03dec86abc79..288ac6fd743d 100644
--- a/include/linux/thermal.h
+++ b/include/linux/thermal.h
@@ -60,6 +60,7 @@
 
 struct thermal_zone_device;
 struct thermal_cooling_device;
+struct thermal_instance;
 
 enum thermal_device_mode {
THERMAL_DEVICE_DISABLED = 0,
@@ -113,6 +114,12 @@ struct thermal_cooling_device_ops {
int (*get_max_state) (struct thermal_cooling_device *, unsigned long *);
int (*get_cur_state) (struct thermal_cooling_device *, unsigned long *);
int (*set_cur_state) (struct thermal_cooling_device *, unsigned long);
+   int (*get_requested_power)(struct thermal_cooling_device *,
+  struct thermal_zone_device *, u32 *);
+   int (*state2power)(struct thermal_cooling_device *,
+  struct thermal_zone_device *, unsigned long, u32 *);
+   int (*power2state)(struct thermal_cooling_device *,
+  struct thermal_zone_device *, u32, unsigned long *);
 };
 
 struct thermal_cooling_device {
@@ -323,6 +330,17 @@ void thermal_zone_of_sensor_unregister(struct device *dev,
 }
 
 #endif
+
+static inline bool cdev_is_power_actor(struct thermal_cooling_device *cdev)
+{
+   return cdev->ops->get_requested_power && cdev->ops->state2power &&
+   cdev->ops->power2state;
+}
+
+int power_actor_get_max_power(struct thermal_cooling_device *,
+ struct thermal_zone_device *tz, u32 *max_power);
+int power_actor_set_power(struct thermal_cooling_device *,
+ struct thermal_instance *, u32);
 struct thermal_zone_device *thermal_zone_device_register(const char *, int, int,
void *, struct thermal_zone_device_ops *,
const struct thermal_zone_params *, int, int);
-- 
1.9.1
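
To make the shape of the three callbacks easier to picture, here is a
simplified userspace model of a cooling device that implements them with
a per-state power table.  Everything in it is an assumption made for
illustration: struct cooling_dev, struct cooling_ops, the table_* helpers,
get_max_power() and set_power() are hypothetical, the thermal zone
argument is left out, and the "pick the least throttled state that fits
the budget" policy is one possible choice rather than what any in-tree
driver does.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct cooling_dev;

/* Simplified callbacks: the thermal zone argument is left out. */
struct cooling_ops {
	int (*get_requested_power)(struct cooling_dev *cdev, uint32_t *power);
	int (*state2power)(struct cooling_dev *cdev, unsigned long state,
			   uint32_t *power);
	int (*power2state)(struct cooling_dev *cdev, uint32_t power,
			   unsigned long *state);
};

struct cooling_dev {
	const struct cooling_ops *ops;
	const uint32_t *power_table;	/* mW per state, state 0 = unthrottled */
	unsigned long nstates;
	unsigned long cur_state;
};

/* A device is a power actor only if all three callbacks are present. */
bool is_power_actor(const struct cooling_dev *cdev)
{
	return cdev->ops->get_requested_power && cdev->ops->state2power &&
		cdev->ops->power2state;
}

static int table_state2power(struct cooling_dev *cdev, unsigned long state,
			     uint32_t *power)
{
	if (state >= cdev->nstates)
		return -EINVAL;

	*power = cdev->power_table[state];
	return 0;
}

static int table_get_requested_power(struct cooling_dev *cdev, uint32_t *power)
{
	return table_state2power(cdev, cdev->cur_state, power);
}

/* Pick the least throttled state whose power fits in the budget. */
static int table_power2state(struct cooling_dev *cdev, uint32_t power,
			     unsigned long *state)
{
	unsigned long i;

	if (!cdev->nstates)
		return -EINVAL;

	for (i = 0; i < cdev->nstates - 1; i++)
		if (cdev->power_table[i] <= power)
			break;

	*state = i;	/* the deepest state if nothing fits */
	return 0;
}

const struct cooling_ops table_ops = {
	.get_requested_power = table_get_requested_power,
	.state2power = table_state2power,
	.power2state = table_power2state,
};

/* Maximum power is the power of state 0; results come back via pointer. */
int get_max_power(struct cooling_dev *cdev, uint32_t *max_power)
{
	if (!is_power_actor(cdev))
		return -EINVAL;

	return cdev->ops->state2power(cdev, 0, max_power);
}

/* Translate a power budget into a state and apply it. */
int set_power(struct cooling_dev *cdev, uint32_t power)
{
	unsigned long state;
	int ret;

	if (!is_power_actor(cdev))
		return -EINVAL;

	ret = cdev->ops->power2state(cdev, power, &state);
	if (ret)
		return ret;

	cdev->cur_state = state;
	return 0;
}
```

As in the patch, every callback returns 0 or a negative errno and hands
its result back through a pointer, and a device missing any of the three
callbacks is rejected with -EINVAL.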


[PATCH v1 0/7] The power allocator thermal governor

2015-01-28 Thread Javi Merino

Hi linux-pm,

The power allocator governor allocates device power to control
temperature.  This requires transforming performance requests into
requested power, which we do with an extended cooling device API
introduced in patch 2 (thermal: extend the cooling device API to
include power information).  Patch 3 (thermal: cpu_cooling: implement
the power cooling device API) extends the cpu cooling device using a
simple power model.
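
As a rough picture of what "allocating device power" means, the sketch
below splits a power budget among cooling devices in proportion to what
each one requested, capped at what each one can consume.  split_power()
and its parameters are hypothetical names chosen for this example, and
the sketch deliberately leaves any surplus from capped devices
unallocated; it is not the governor's divvy_up_power().

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: share `power_range` (mW) among `n` actors in
 * proportion to their requested power, never granting an actor more
 * than its maximum.
 */
void split_power(const uint32_t *req, const uint32_t *max, size_t n,
		 uint32_t power_range, uint32_t *granted)
{
	uint64_t total_req = 0;
	size_t i;

	for (i = 0; i < n; i++)
		total_req += req[i];

	for (i = 0; i < n; i++) {
		uint64_t share;

		if (!total_req) {
			/* Nobody asked for power, so nobody gets any. */
			granted[i] = 0;
			continue;
		}

		/* 64-bit intermediate so req * power_range cannot overflow. */
		share = (uint64_t)req[i] * power_range / total_req;
		granted[i] = share > max[i] ? max[i] : (uint32_t)share;
	}
}
```

For example, with requests of 300 mW and 100 mW and a budget of 200 mW,
the two actors are granted 150 mW and 50 mW.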

This series depends on the array printing helper v5[0] and the weight
fixes[1].  Rui said that he had applied patch 1 (thermal: let
governors have private data for each thermal zone) but I can't see it
in his tree, so I'm reposting it.

[0] http://mid.gmane.org/1422449335-8289-1-git-send-email-javi.merino%40arm.com
[1] http://thread.gmane.org/gmane.linux.power-management.general/55730

Changes since RFC v6:
  - Addressed Eduardo's review
+ Pass the interval to the static power function as suggested by
  Eduardo
+ Make the cooling device ops return 0 or -E* and put the
  calculation in a parameter, like the rest of the cooling device
  ops
+ Documentation improvements
  - Use thermal_cdev_update() to change cooling device states
  - Add a patch to export the power allocator governor's tzp
parameters to sysfs

Changes since RFC v5:
  - Addressed Stephen's review of the trace patches.
  - Removed power actors and extended the cooling device interface
instead.
  - Let platforms override the power allocator governor parameters in
their thermal zone parameters

Changes since RFC v4:
  - Add more tracing
  - Document some of the limitations of the power allocator governor
  - Export the power_actor API and move power_actor.h to include/linux

Changes since RFC v3:
  - Use tz->passive to poll faster when the first trip point is hit.
  - Don't make a special directory for power_actors
  - Add a DT property for sustainable-power
  - Simplify the static power interface and pass the current thermal
zone in every power_actor_ops to remove the controversial
enum power_actor_types
  - Use locks with the actor_list list
  - Use cpufreq_get() to get the frequency of the cpu instead of
using the notifiers.
  - Remove the prompt for THERMAL_POWER_ACTOR_CPU when configuring
the kernel

Changes since RFC v2:
  - Changed the PI controller into a PID controller
  - Added static power to the cpu power model
  - tz parameter max_dissipatable_power renamed to sustainable_power
  - Register the cpufreq cooling device as part of the
power_cpu_actor registration.

Changes since RFC v1:
  - Fixed finding cpufreq cooling devices in cpufreq_frequency_change()
  - Replaced the cooling device interface with a separate power actor
API
  - Addressed most of Eduardo's comments
  - Incorporated ftrace support for bitmask to trace cpumasks

Cheers,
Javi & Punit

Javi Merino (6):
  thermal: let governors have private data for each thermal zone
  thermal: extend the cooling device API to include power information
  thermal: cpu_cooling: implement the power cooling device API
  thermal: introduce the Power Allocator governor
  thermal: add trace events to the power allocator governor
  thermal: export thermal_zone_parameters to sysfs

Punit Agrawal (1):
  of: thermal: Introduce sustainable power for a thermal zone

 .../devicetree/bindings/thermal/thermal.txt|   9 +
 Documentation/thermal/cpu-cooling-api.txt  | 156 ++-
 Documentation/thermal/power_allocator.txt  | 241 ++
 Documentation/thermal/sysfs-api.txt|  52 +++
 drivers/thermal/Kconfig|  15 +
 drivers/thermal/Makefile   |   1 +
 drivers/thermal/cpu_cooling.c  | 507 -
 drivers/thermal/of-thermal.c   |   4 +
 drivers/thermal/power_allocator.c  | 496 
 drivers/thermal/thermal_core.c | 254 ++-
 drivers/thermal/thermal_core.h |   8 +
 include/linux/cpu_cooling.h|  39 ++
 include/linux/thermal.h|  64 ++-
 include/trace/events/thermal.h |  58 +++
 include/trace/events/thermal_power_allocator.h |  80 
 15 files changed, 1964 insertions(+), 20 deletions(-)
 create mode 100644 Documentation/thermal/power_allocator.txt
 create mode 100644 drivers/thermal/power_allocator.c
 create mode 100644 include/trace/events/thermal_power_allocator.h

-- 
1.9.1


[PATCH v1 6/7] of: thermal: Introduce sustainable power for a thermal zone

2015-01-28 Thread Javi Merino

From: Punit Agrawal 

Introduce an optional property called sustainable-power, which
represents the power (in mW) which the thermal zone can safely
dissipate.

If provided, the property is parsed and associated with the thermal
zone via the thermal zone parameters.

Cc: Zhang Rui 
Cc: Eduardo Valentin 
Signed-off-by: Punit Agrawal 
---
 Documentation/devicetree/bindings/thermal/thermal.txt | 9 +
 drivers/thermal/of-thermal.c  | 4 
 2 files changed, 13 insertions(+)

diff --git a/Documentation/devicetree/bindings/thermal/thermal.txt b/Documentation/devicetree/bindings/thermal/thermal.txt
index f5db6b72a36f..99d6608c9d5f 100644
--- a/Documentation/devicetree/bindings/thermal/thermal.txt
+++ b/Documentation/devicetree/bindings/thermal/thermal.txt
@@ -167,6 +167,13 @@ Optional property:
by means of sensor ID. Additional coefficients are
interpreted as constant offset.
 
+- sustainable-power:   An estimate of the sustainable power (in mW) that the
+  Type: unsigned   thermal zone can dissipate at the desired
+  Size: one cell   control temperature.  For reference, the
+   sustainable power of a 4'' phone is typically
+   2000mW, while on a 10'' tablet it is around
+   4500mW.
+
 Note: The delay properties are bound to the maximum dT/dt (temperature
 derivative over time) in two situations for a thermal zone:
 (i)  - when passive cooling is activated (polling-delay-passive); and
@@ -546,6 +553,8 @@ thermal-zones {
 */
coefficients =  <1200   -345890>;
 
+   sustainable-power = <2500>;
+
trips {
/* Trips are based on resulting linear equation */
cpu-trip: cpu-trip {
diff --git a/drivers/thermal/of-thermal.c b/drivers/thermal/of-thermal.c
index d717f3dab6f1..b44296541938 100644
--- a/drivers/thermal/of-thermal.c
+++ b/drivers/thermal/of-thermal.c
@@ -862,6 +862,7 @@ int __init of_parse_thermal_zones(void)
for_each_child_of_node(np, child) {
struct thermal_zone_device *zone;
struct thermal_zone_params *tzp;
+   u32 prop;
 
/* Check whether child is enabled or not */
if (!of_device_is_available(child))
@@ -888,6 +889,9 @@ int __init of_parse_thermal_zones(void)
/* No hwmon because there might be hwmon drivers registering */
tzp->no_hwmon = true;
 
+   if (!of_property_read_u32(child, "sustainable-power", &prop))
+   tzp->sustainable_power = prop;
+
zone = thermal_zone_device_register(child->name, tz->ntrips,
0, tz,
ops, tzp,
-- 
1.9.1


[PATCH v1 5/7] thermal: add trace events to the power allocator governor

2015-01-28 Thread Javi Merino

Add trace events for the power allocator governor and the power actor
interface of the cpu cooling device.

Cc: Zhang Rui 
Cc: Eduardo Valentin 
Cc: Steven Rostedt 
Cc: Frederic Weisbecker 
Cc: Ingo Molnar 
Signed-off-by: Javi Merino 
---
 drivers/thermal/cpu_cooling.c  | 31 +-
 drivers/thermal/power_allocator.c  | 22 ++-
 include/trace/events/thermal.h | 58 +++
 include/trace/events/thermal_power_allocator.h | 80 ++
 4 files changed, 187 insertions(+), 4 deletions(-)
 create mode 100644 include/trace/events/thermal_power_allocator.h

diff --git a/drivers/thermal/cpu_cooling.c b/drivers/thermal/cpu_cooling.c
index a639aaf228f5..1ca8fe580721 100644
--- a/drivers/thermal/cpu_cooling.c
+++ b/drivers/thermal/cpu_cooling.c
@@ -31,6 +31,8 @@
 #include 
 #include 
 
+#include 
+
 /*
  * Cooling state <-> CPUFreq frequency
  *
@@ -534,12 +536,20 @@ static int cpufreq_get_requested_power(struct thermal_cooling_device *cdev,
   u32 *power)
 {
unsigned long freq;
-   int cpu, ret;
+   int i = 0, cpu, ret;
u32 static_power, dynamic_power, total_load = 0;
struct cpufreq_cooling_device *cpufreq_device = cdev->devdata;
+   u32 *load_cpu = NULL;
 
freq = cpufreq_quick_get(cpumask_any(&cpufreq_device->allowed_cpus));
 
+   if (trace_thermal_power_cpu_get_power_enabled()) {
+   u32 ncpus = cpumask_weight(&cpufreq_device->allowed_cpus);
+
+   load_cpu = devm_kcalloc(&cdev->device, ncpus, sizeof(*load_cpu),
+   GFP_KERNEL);
+   }
+
for_each_cpu(cpu, &cpufreq_device->allowed_cpus) {
u32 load;
 
@@ -549,14 +559,29 @@ static int cpufreq_get_requested_power(struct thermal_cooling_device *cdev,
load = 0;
 
total_load += load;
+   if (trace_thermal_power_cpu_limit_enabled() && load_cpu)
+   load_cpu[i] = load;
+
+   i++;
}
 
cpufreq_device->last_load = total_load;
 
dynamic_power = get_dynamic_power(cpufreq_device, freq);
ret = get_static_power(cpufreq_device, tz, freq, &static_power);
-   if (ret)
+   if (ret) {
+   if (load_cpu)
+   devm_kfree(&cdev->device, load_cpu);
return ret;
+   }
+
+   if (trace_thermal_power_cpu_limit_enabled() && load_cpu) {
+   trace_thermal_power_cpu_get_power(
+   &cpufreq_device->allowed_cpus,
+   freq, load_cpu, i, dynamic_power, static_power);
+
+   devm_kfree(&cdev->device, load_cpu);
+   }
 
*power = static_power + dynamic_power;
return 0;
@@ -664,6 +689,8 @@ static int cpufreq_power2state(struct thermal_cooling_device *cdev,
return -EINVAL;
}
 
+   trace_thermal_power_cpu_limit(&cpufreq_device->allowed_cpus,
+ target_freq, *state, power);
return 0;
 }
 
diff --git a/drivers/thermal/power_allocator.c b/drivers/thermal/power_allocator.c
index c929143aee67..34c9a9025c54 100644
--- a/drivers/thermal/power_allocator.c
+++ b/drivers/thermal/power_allocator.c
@@ -19,6 +19,9 @@
 #include 
 #include 
 
+#define CREATE_TRACE_POINTS
+#include 
+
 #include "thermal_core.h"
 
 #define FRAC_BITS 10
@@ -124,7 +127,14 @@ static u32 pid_controller(struct thermal_zone_device *tz,
/* feed-forward the known sustainable dissipatable power */
power_range = tz->tzp->sustainable_power + frac_to_int(power_range);
 
-   return clamp(power_range, (s64)0, (s64)max_allocatable_power);
+   power_range = clamp(power_range, (s64)0, (s64)max_allocatable_power);
+
+   trace_thermal_power_allocator_pid(frac_to_int(err),
+ frac_to_int(params->err_integral),
+ frac_to_int(p), frac_to_int(i),
+ frac_to_int(d), power_range);
+
+   return power_range;
 }
 
 /**
@@ -201,7 +211,7 @@ static int allocate_power(struct thermal_zone_device *tz,
struct thermal_instance *instance;
u32 *req_power, *max_power, *granted_power;
u32 total_req_power, max_allocatable_power;
-   u32 power_range;
+   u32 total_granted_power, power_range;
int i, num_actors, ret = 0;
 
mutex_lock(&tz->lock);
@@ -266,6 +276,7 @@ static int allocate_power(struct thermal_zone_device *tz,
divvy_up_power(req_power, max_power, num_actors, total_req_power,
   power_range, granted_power);
 
+   total_granted_power = 0;
i = 0;
list_for_each_entry(instance, &tz->thermal_instances, tz_node) {
if (insta

[PATCH v1 7/7] thermal: export thermal_zone_parameters to sysfs

2015-01-28 Thread Javi Merino

It's useful for tuning to be able to edit thermal_zone_parameters from
userspace.  Export them to the thermal_zone sysfs so that they can be
easily changed.
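
To show how the exported parameters interact, here is a simplified
floating-point sketch of one PID step built from their descriptions in
the documentation added by this patch.  It is an illustration under
assumptions, not the governor's code: struct pid_params, struct pid_state
and pid_step() are hypothetical names, temperatures are plain numbers
rather than millicelsius, and the fixed-point arithmetic and integral
saturation handling of the real controller are left out.

```c
/* Hypothetical tunables, named after the sysfs files. */
struct pid_params {
	double k_po;			/* proportional gain, overshoot */
	double k_pu;			/* proportional gain, undershoot */
	double k_i;			/* integral gain */
	double k_d;			/* derivative gain */
	double integral_cutoff;		/* error below which we integrate */
	double sustainable_power;	/* mW */
};

struct pid_state {
	double err_integral;
	double prev_err;
};

/* One controller step; returns the power budget clamped to [0, max_power]. */
double pid_step(const struct pid_params *p, struct pid_state *s,
		double control_temp, double current_temp, double max_power)
{
	/* Positive error means we are below the desired temperature. */
	double err = control_temp - current_temp;
	double prop, integ, deriv, out;

	/* Overshoot (err < 0) and undershoot use different gains. */
	prop = (err < 0 ? p->k_po : p->k_pu) * err;

	/* Only accumulate error once it drops below the cutoff. */
	if (err < p->integral_cutoff)
		s->err_integral += err;
	integ = p->k_i * s->err_integral;

	deriv = p->k_d * (err - s->prev_err);
	s->prev_err = err;

	/* Feed forward the sustainable power, then clamp. */
	out = p->sustainable_power + prop + integ + deriv;
	if (out < 0)
		out = 0;
	if (out > max_power)
		out = max_power;

	return out;
}
```

With integral_cutoff set to 0 the integral term in this sketch only grows
while the temperature is above the desired trip point, which matches the
example given for integral_cutoff in the documentation.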

Cc: Zhang Rui 
Cc: Eduardo Valentin 
Signed-off-by: Javi Merino 
---
 Documentation/thermal/sysfs-api.txt |  52 +
 drivers/thermal/thermal_core.c  | 110 
 2 files changed, 162 insertions(+)

diff --git a/Documentation/thermal/sysfs-api.txt b/Documentation/thermal/sysfs-api.txt
index 87519cb379ee..a95aabfad014 100644
--- a/Documentation/thermal/sysfs-api.txt
+++ b/Documentation/thermal/sysfs-api.txt
@@ -176,6 +176,12 @@ Thermal zone device sys I/F, created once it's registered:
 |---trip_point_[0-*]_type: Trip point type
 |---trip_point_[0-*]_hyst: Hysteresis value for this trip point
 |---emul_temp: Emulated temperature set node
+|---sustainable_power:  Sustainable dissipatable power
+|---k_po:   Proportional term during temperature overshoot
+|---k_pu:   Proportional term during temperature undershoot
+|---k_i:PID's integral term in the power allocator gov
+|---k_d:PID's derivative term in the power allocator
+|---integral_cutoff:Offset above which errors are accumulated
 
 Thermal cooling device sys I/F, created once it's registered:
 /sys/class/thermal/cooling_device[0-*]:
@@ -289,6 +295,52 @@ emul_temp
  because userland can easily disable the thermal policy by simply
  flooding this sysfs node with low temperature values.
 
+sustainable_power
+   An estimate of the sustained power that can be dissipated by
+   the thermal zone. Used by the power allocator governor. For
+   more information see Documentation/thermal/power_allocator.txt
+   Unit: milliwatts
+   RW, Optional
+
+k_po
+   The proportional term of the power allocator governor's PID
+   controller during temperature overshoot. Temperature overshoot
+   is when the current temperature is above the "desired
+   temperature" trip point. For more information see
+   Documentation/thermal/power_allocator.txt
+   RW, Optional
+
+k_pu
+   The proportional term of the power allocator governor's PID
+   controller during temperature undershoot. Temperature undershoot
+   is when the current temperature is below the "desired
+   temperature" trip point. For more information see
+   Documentation/thermal/power_allocator.txt
+   RW, Optional
+
+k_i
+   The integral term of the power allocator governor's PID
+   controller. This term allows the PID controller to compensate
+   for long term drift. For more information see
+   Documentation/thermal/power_allocator.txt
+   RW, Optional
+
+k_d
+   The derivative term of the power allocator governor's PID
+   controller. For more information see
+   Documentation/thermal/power_allocator.txt
+   RW, Optional
+
+integral_cutoff
+   Temperature offset from the desired temperature trip point
+   above which the integral term of the power allocator
+   governor's PID controller starts accumulating errors. For
+   example, if integral_cutoff is 0, then the integral term only
+   accumulates error when temperature is above the desired
+   temperature trip point. For more information see
+   Documentation/thermal/power_allocator.txt
+   RW, Optional
+
 *****************************
 * Cooling device attributes *
 *****************************
diff --git a/drivers/thermal/thermal_core.c b/drivers/thermal/thermal_core.c
index b77b5416929c..bde5e0442321 100644
--- a/drivers/thermal/thermal_core.c
+++ b/drivers/thermal/thermal_core.c
@@ -868,6 +868,111 @@ emul_temp_store(struct device *dev, struct device_attribute *attr,
 static DEVICE_ATTR(emul_temp, S_IWUSR, NULL, emul_temp_store);
 #endif/*CONFIG_THERMAL_EMULATION*/
 
+#ifdef CONFIG_THERMAL_GOV_POWER_ALLOCATOR
+
+static ssize_t
+sustainable_power_show(struct device *dev, struct device_attribute *devattr,
+  char *buf)
+{
+   struct thermal_zone_device *tz = to_thermal_zone(dev);
+
+   if (tz->tzp)
+   return sprintf(buf, "%u\n", tz->tzp->sustainable_power);
+   else
+   return -EIO;
+}
+
+static ssize_t
+sustainable_power_store(struct device *dev, struct device_attribute *devattr,
+   const char *buf, size_t count)
+{
+   struct thermal_zone_device *tz = to_thermal_zone(dev);
+   u32 sustainable_power;
+
+   if (!tz->tzp)
+   return -EIO;
+
+   if (kstrtou32(buf, 10, &sustainable_power))
+   return -EINVAL;
+
+   tz->tzp->sustainable_power = sustainable_power;
+
+   return count;
+}
+static DEVICE_ATTR(sustainable_power, S_IWUSR | S_IRUGO,

[PATCH v1 4/7] thermal: introduce the Power Allocator governor

2015-01-28 Thread Javi Merino

The power allocator governor is a thermal governor that controls system
and device power allocation to control temperature.  Conceptually, the
implementation divides the sustainable power of a thermal zone among
all the heat sources in that zone.

This governor relies on "power actors", entities that represent heat
sources.  They can report current and maximum power consumption and
can set a given maximum power consumption, usually via a cooling
device.

The governor uses a Proportional Integral Derivative (PID) controller
driven by the temperature of the thermal zone.  The output of the
controller is a power budget that is then allocated to each power
actor that can have bearing on the temperature we are trying to
control.  It decides how much power to give each cooling device based
on the performance they are requesting.  The PID controller ensures
that the total power budget does not exceed the control temperature.
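
As a rough sketch of the controller described above (not the code in
this patch), the power budget can be computed from the formula in the
documentation added below: P_max = k_p * e + k_i * err_integral +
k_d * diff_err + sustainable_power.  The struct and function names
here are hypothetical.  The sketch also assumes plain integer
arithmetic and omits integral_cutoff.

```c
#include <stdint.h>

/* Hypothetical tunables, mirroring the names used in the documentation. */
struct pid_params {
	int32_t k_po;			/* proportional gain, overshoot */
	int32_t k_pu;			/* proportional gain, undershoot */
	int32_t k_i;			/* integral gain */
	int32_t k_d;			/* derivative gain */
	uint32_t sustainable_power;	/* mW */
};

/* State kept between two consecutive controller runs. */
struct pid_state {
	int64_t err_integral;	/* sum of previous errors */
	int32_t prev_err;	/* error seen on the previous run */
};

/*
 * Return the power budget in mW, clamped to [0, max_power].
 * e = desired_temp - current_temp, so e < 0 means overshoot.
 */
uint32_t pid_power_budget(const struct pid_params *p, struct pid_state *s,
			  int32_t desired_temp, int32_t current_temp,
			  uint32_t max_power)
{
	int32_t err = desired_temp - current_temp;
	int32_t k_p = (err < 0) ? p->k_po : p->k_pu;
	int64_t budget;

	s->err_integral += err;

	budget = (int64_t)k_p * err;
	budget += (int64_t)p->k_i * s->err_integral;
	budget += (int64_t)p->k_d * (err - s->prev_err);
	budget += p->sustainable_power;

	s->prev_err = err;

	if (budget < 0)
		budget = 0;
	if (budget > max_power)
		budget = max_power;

	return (uint32_t)budget;
}
```

With k_po = 2, k_i = k_d = 0 and sustainable_power = 1000, a zone that
is 5 units above the desired temperature gets a budget of
2 * (-5) + 1000 = 990.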

Cc: Zhang Rui 
Cc: Eduardo Valentin 
Signed-off-by: Punit Agrawal 
Signed-off-by: Javi Merino 
---
 Documentation/thermal/power_allocator.txt | 241 +++
 drivers/thermal/Kconfig   |  15 +
 drivers/thermal/Makefile  |   1 +
 drivers/thermal/power_allocator.c | 478 ++
 drivers/thermal/thermal_core.c|   9 +-
 drivers/thermal/thermal_core.h|   8 +
 include/linux/thermal.h   |  37 ++-
 7 files changed, 782 insertions(+), 7 deletions(-)
 create mode 100644 Documentation/thermal/power_allocator.txt
 create mode 100644 drivers/thermal/power_allocator.c

diff --git a/Documentation/thermal/power_allocator.txt b/Documentation/thermal/power_allocator.txt
new file mode 100644
index 000000000000..c9604e76c544
--- /dev/null
+++ b/Documentation/thermal/power_allocator.txt
@@ -0,0 +1,241 @@
+Power allocator governor tunables
+=================================
+
+Trip points
+-----------
+
+The governor requires the following two passive trip points:
+
+1.  "switch on" trip point: temperature above which the governor
+    control loop starts operating.
+2.  "desired temperature" trip point: it should be higher than the
+    "switch on" trip point.  This is the target temperature the governor
+    is controlling for.
+
+PID Controller
+--------------
+
+The power allocator governor implements a
+Proportional-Integral-Derivative controller (PID controller) with
+temperature as the control input and power as the controlled output:
+
+P_max = k_p * e + k_i * err_integral + k_d * diff_err + sustainable_power
+
+where
+e = desired_temperature - current_temperature
+err_integral is the sum of previous errors
+diff_err = e - previous_error
+
+It is similar to the one depicted below:
+
+                                      k_d
+                                       |
+current_temp                           |
+     |                                 v
+     |                +----------+   +---+
+     |         +----->| diff_err |-->| X |------+
+     |         |      +----------+   +---+      |
+     |         |                                |      tdp        actor
+     |         |                      k_i       |       |  get_requested_power()
+     |         |                       |        |       |        |     |
+     |         |                       |        |       |        |     | ...
+     v         |                       v        v       v        v     v
+   +---+       |      +-------+      +---+    +---+   +---+   +----------+
+   | S |-------+----->| sum e |----->| X |--->| S |-->| S |-->|power     |
+   +---+       |      +-------+      +---+    +---+   +---+   |allocation|
+     ^         |                                ^             +----------+
+     |         |                                |                |     |
+     |         |        +---+                   |                |     |
+     |         +------->| X |-------------------+                v     v
+     |                  +---+                              granted performance
+desired_temperature       ^
+                          |
+                          |
+                      k_po/k_pu
+
+Sustainable power
+-----------------
+
+An estimate of the sustainable dissipatable power (in mW) should be
+provided while registering the thermal zone.  This estimates the
+sustained power that can be dissipated at the desired control
+temperature.  This is the maximum sustained power for allocation at
+the desired maximum temperature.  The actual sustained power can vary
+for a number of reasons.  The closed loop controller will take care of
+variations such as environmental conditions, and some factors related
+to the speed-grade of the silicon.  `sustainable_power` is therefore
+simply an estimate, and may be tuned to affect the aggressiveness of
+the thermal ramp. For reference, the sustainable power of a 4" phone
+is typically 2

[PATCH v1 1/7] thermal: let governors have private data for each thermal zone

2015-01-28 Thread Javi Merino

A governor may need to store its current state between calls to
throttle().  That state depends on the thermal zone, so store it as
private data in struct thermal_zone_device.

The governors may have two new ops: bind_to_tz() and unbind_from_tz().
When provided, these functions let governors do some initialization
and teardown when they are bound/unbound to a tz and possibly store that
information in the governor_data field of the struct
thermal_zone_device.
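
To illustrate the intended use (a userspace sketch, not the kernel
code), a governor could allocate its per-zone state in bind_to_tz()
and release it in unbind_from_tz().  The types below are simplified
stand-ins for struct thermal_zone_device and struct thermal_governor.
calloc()/free() stand in for the kernel allocators, and example_state
is a hypothetical payload.  Unlike the patch, this sketch does not try
to rebind the previous governor when binding fails.

```c
#include <stdlib.h>

struct thermal_zone;

/* Simplified stand-in for the governor ops added by the patch. */
struct governor_ops {
	const char *name;
	int (*bind_to_tz)(struct thermal_zone *tz);
	void (*unbind_from_tz)(struct thermal_zone *tz);
};

/* Simplified stand-in for the thermal zone. */
struct thermal_zone {
	const struct governor_ops *governor;
	void *governor_data;	/* private to the bound governor */
};

/* Hypothetical state a governor keeps between throttle() calls. */
struct example_state {
	int last_temperature;
};

int example_bind(struct thermal_zone *tz)
{
	struct example_state *st = calloc(1, sizeof(*st));

	if (!st)
		return -1;

	tz->governor_data = st;
	return 0;
}

void example_unbind(struct thermal_zone *tz)
{
	free(tz->governor_data);
	tz->governor_data = NULL;
}

const struct governor_ops example_governor = {
	.name = "example",
	.bind_to_tz = example_bind,
	.unbind_from_tz = example_unbind,
};

/* Unbind the current governor, then bind the new one (which may be NULL). */
int set_governor(struct thermal_zone *tz, const struct governor_ops *new_gov)
{
	int ret = 0;

	if (tz->governor && tz->governor->unbind_from_tz)
		tz->governor->unbind_from_tz(tz);

	if (new_gov && new_gov->bind_to_tz) {
		ret = new_gov->bind_to_tz(tz);
		if (ret) {
			/* simplification: leave the zone without a governor */
			tz->governor = NULL;
			return ret;
		}
	}

	tz->governor = new_gov;
	return ret;
}
```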

Cc: Zhang Rui 
Cc: Eduardo Valentin 
Signed-off-by: Javi Merino 
---

Hi Rui,

You said[0] that you had applied it but I can't see it in any of your
trees, so I'm sending it again.

[0] http://thread.gmane.org/gmane.linux.kernel/1845418/focus=54163
---
 drivers/thermal/thermal_core.c | 83 ++
 include/linux/thermal.h|  9 +
 2 files changed, 84 insertions(+), 8 deletions(-)

diff --git a/drivers/thermal/thermal_core.c b/drivers/thermal/thermal_core.c
index 87e0b0782023..bf230c64e016 100644
--- a/drivers/thermal/thermal_core.c
+++ b/drivers/thermal/thermal_core.c
@@ -75,6 +75,58 @@ static struct thermal_governor *__find_governor(const char *name)
return NULL;
 }
 
+/**
+ * bind_previous_governor() - bind the previous governor of the thermal zone
+ * @tz:a valid pointer to a struct thermal_zone_device
+ * @failed_gov_name:   the name of the governor that failed to register
+ *
+ * Register the previous governor of the thermal zone after a new
+ * governor has failed to be bound.
+ */
+static void bind_previous_governor(struct thermal_zone_device *tz,
+  const char *failed_gov_name)
+{
+   if (tz->governor && tz->governor->bind_to_tz) {
+   if (tz->governor->bind_to_tz(tz)) {
+   dev_err(&tz->device,
+   "governor %s failed to bind and the previous one (%s) failed to bind again, thermal zone %s has no governor\n",
+   failed_gov_name, tz->governor->name, tz->type);
+   tz->governor = NULL;
+   }
+   }
+}
+
+/**
+ * thermal_set_governor() - Switch to another governor
+ * @tz:a valid pointer to a struct thermal_zone_device
+ * @new_gov:   pointer to the new governor
+ *
+ * Change the governor of thermal zone @tz.
+ *
+ * Return: 0 on success, an error if the new governor's bind_to_tz() failed.
+ */
+static int thermal_set_governor(struct thermal_zone_device *tz,
+   struct thermal_governor *new_gov)
+{
+   int ret = 0;
+
+   if (tz->governor && tz->governor->unbind_from_tz)
+   tz->governor->unbind_from_tz(tz);
+
+   if (new_gov && new_gov->bind_to_tz) {
+   ret = new_gov->bind_to_tz(tz);
+   if (ret) {
+   bind_previous_governor(tz, new_gov->name);
+
+   return ret;
+   }
+   }
+
+   tz->governor = new_gov;
+
+   return ret;
+}
+
 int thermal_register_governor(struct thermal_governor *governor)
 {
int err;
@@ -107,8 +159,15 @@ int thermal_register_governor(struct thermal_governor *governor)
 
name = pos->tzp->governor_name;
 
-   if (!strncasecmp(name, governor->name, THERMAL_NAME_LENGTH))
-   pos->governor = governor;
+   if (!strncasecmp(name, governor->name, THERMAL_NAME_LENGTH)) {
+   int ret;
+
+   ret = thermal_set_governor(pos, governor);
+   if (ret)
+   dev_err(&pos->device,
+   "Failed to set governor %s for thermal zone %s: %d\n",
+   governor->name, pos->type, ret);
+   }
}
 
mutex_unlock(&thermal_list_lock);
@@ -134,7 +193,7 @@ void thermal_unregister_governor(struct thermal_governor *governor)
list_for_each_entry(pos, &thermal_tz_list, node) {
if (!strncasecmp(pos->governor->name, governor->name,
THERMAL_NAME_LENGTH))
-   pos->governor = NULL;
+   thermal_set_governor(pos, NULL);
}
 
mutex_unlock(&thermal_list_lock);
@@ -763,8 +822,9 @@ policy_store(struct device *dev, struct device_attribute *attr,
if (!gov)
goto exit;
 
-   tz->governor = gov;
-   ret = count;
+   ret = thermal_set_governor(tz, gov);
+   if (!ret)
+   ret = count;
 
 exit:
mutex_unlock(&tz->lock);
@@ -1463,6 +1523,7 @@ struct thermal_zone_device *thermal_zone_device_register(const char *type,
int result;
int count;
int passive = 0;
+   st

[PATCH v1 3/7] thermal: cpu_cooling: implement the power cooling device API

2015-01-28 Thread Javi Merino

Add a basic power model to the cpu cooling device to implement the
power cooling device API.  The power model uses the current frequency,
current load and OPPs for the power calculations.  The cpus must have
registered their OPPs using the OPP library.
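
As a hedged illustration of the dynamic part of the model documented
below (Pdyn = Capacitance * Voltage^2 * Frequency * Utilisation), the
calculation could look like the following sketch.  The function name,
the choice of MHz and mV as input units, the percentage load and the
10^9 scaling divisor are assumptions made for this example and are
not taken from the patch.

```c
#include <stdint.h>

/*
 * Estimate dynamic CPU power in mW.
 *
 * capacitance: dynamic power coefficient (assumed scaled so that
 *              capacitance * MHz * mV^2 / 10^9 yields mW)
 * freq_mhz:    current frequency in MHz
 * voltage_mv:  voltage of the current OPP in mV
 * load_pct:    utilisation in percent, clamped to 100
 */
uint32_t cpu_dynamic_power_mw(uint32_t capacitance, uint32_t freq_mhz,
			      uint32_t voltage_mv, uint32_t load_pct)
{
	uint64_t power;

	if (load_pct > 100)
		load_pct = 100;

	/* 64-bit intermediate: the product overflows 32 bits easily */
	power = (uint64_t)capacitance * freq_mhz * voltage_mv * voltage_mv;
	power /= 1000000000ULL;

	return (uint32_t)(power * load_pct / 100);
}
```

Under these assumptions, a coefficient of 300 at 1000 MHz and 1000 mV
gives 300 mW when fully loaded and 150 mW at 50% load.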

Cc: Zhang Rui 
Cc: Eduardo Valentin 
Signed-off-by: Punit Agrawal 
Signed-off-by: Javi Merino 
---
 Documentation/thermal/cpu-cooling-api.txt | 156 +-
 drivers/thermal/cpu_cooling.c | 480 +-
 include/linux/cpu_cooling.h   |  39 +++
 3 files changed, 670 insertions(+), 5 deletions(-)

diff --git a/Documentation/thermal/cpu-cooling-api.txt b/Documentation/thermal/cpu-cooling-api.txt
index 753e47cc2e20..71653584cd03 100644
--- a/Documentation/thermal/cpu-cooling-api.txt
+++ b/Documentation/thermal/cpu-cooling-api.txt
@@ -36,8 +36,162 @@ the user. The registration APIs returns the cooling device pointer.
 np: pointer to the cooling device device tree node
 clip_cpus: cpumask of cpus where the frequency constraints will happen.
 
-1.1.3 void cpufreq_cooling_unregister(struct thermal_cooling_device *cdev)
+1.1.3 struct thermal_cooling_device *cpufreq_power_cooling_register(
+const struct cpumask *clip_cpus, u32 capacitance,
+get_static_t plat_static_func)
+
+Similar to cpufreq_cooling_register, this function registers a cpufreq
+cooling device.  Using this function, the cooling device will
+implement the power extensions by using a simple cpu power model.  The
+cpus must have registered their OPPs using the OPP library.
+
+The additional parameters are needed for the power model (See 2. Power
+models).  "capacitance" is the dynamic power coefficient (See 2.1
+Dynamic power).  "plat_static_func" is a function to calculate the
+static power consumed by these cpus (See 2.2 Static power).
+
+1.1.4 struct thermal_cooling_device *of_cpufreq_power_cooling_register(
+struct device_node *np, const struct cpumask *clip_cpus, u32 capacitance,
+get_static_t plat_static_func)
+
+Similar to cpufreq_power_cooling_register, this function registers a
+cpufreq cooling device with power extensions using the device tree
+information supplied by the np parameter.
+
+1.1.5 void cpufreq_cooling_unregister(struct thermal_cooling_device *cdev)
 
 This interface function unregisters the "thermal-cpufreq-%x" cooling device.
 
 cdev: Cooling device pointer which has to be unregistered.
+
+2. Power models
+
+The power API registration functions provide a simple power model for
+CPUs.  The current power is calculated as dynamic + (optionally)
+static power.  This power model requires that the operating-points of
+the CPUs are registered using the kernel's opp library and the
+`cpufreq_frequency_table` is assigned to the `struct device` of the
+cpu.  If you are using CONFIG_CPUFREQ_DT then the
+`cpufreq_frequency_table` should already be assigned to the cpu
+device.
+
+The `plat_static_func` parameter of `cpufreq_power_cooling_register()`
+and `of_cpufreq_power_cooling_register()` is optional.  If you don't
+provide it, only dynamic power will be considered.
+
+2.1 Dynamic power
+
+The dynamic power consumption of a processor depends on many factors.
+For a given processor implementation the primary factors are:
+
+- The time the processor spends running, consuming dynamic power, as
+  compared to the time in idle states where dynamic consumption is
+  negligible.  Herein we refer to this as 'utilisation'.
+- The voltage and frequency levels as a result of DVFS.  The DVFS
+  level is a dominant factor governing power consumption.
+- In running time the 'execution' behaviour (instruction types, memory
+  access patterns and so forth) causes, in most cases, a second order
+  variation.  In pathological cases this variation can be significant,
+  but typically it is of a much lesser impact than the factors above.
+
+A high level dynamic power consumption model may then be represented as:
+
+Pdyn = f(run) * Voltage^2 * Frequency * Utilisation
+
+f(run) here represents the described execution behaviour and its
+result has units of Watts/Hz/Volt^2 (this is often expressed in
+mW/MHz/uVolt^2).
+
+The detailed behaviour for f(run) could be modelled on-line.  However,
+in practice, such an on-line model has dependencies on a number of
+implementation specific processor support and characterisation
+factors.  Therefore, in the initial implementation that contribution is
+represented as a constant coefficient.  This is a simplification
+consistent with the relative contribution to overall power variation.
+
+In this simplified representation our model becomes:
+
+Pdyn = Capacitance * Voltage^2 * Frequency * Utilisation
+
+Where `capacitance` is a constant that represents an indicative
+running time dynamic power coefficient in fundamental units of
+mW/MHz/uVolt^2.  Typical values for mobile CPUs might lie in the range
+from 100 to 500.  For reference, the approx

[PATCH v5 1/3] tracing: Add array printing helper

2015-01-28 Thread Javi Merino

From: Dave Martin 

If a trace event contains an array, there is currently no standard
way to format this for text output.  Drivers are currently hacking
around this by a) local hacks that use the trace_seq functionality
directly, or b) just not printing that information.  For fixed size
arrays, formatting of the elements can be open-coded, but this gets
cumbersome for arrays of non-trivial size.

These approaches result in non-standard content of the event format
description delivered to userspace, so userland tools need to be
taught to understand and parse each array printing method
individually.

This patch implements a __print_array() helper that tracepoint
implementations can use instead of reinventing it.  A simple C-style
syntax is used to delimit the array and its elements {like,this}.

So that the helpers can be used with large static arrays as well as
dynamic arrays, they take a pointer and element count: they can be
used with __get_dynamic_array() for use with dynamic arrays.
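
For readers who want to see the output format outside the tracing
code, here is a userspace sketch of the same idea.  It is not the
helper added by this patch: format_array() is a hypothetical name and
snprintf() stands in for the trace_seq API.  It takes a pointer, an
element count and an element size, and produces {like,this}.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Format count elements of el_size bytes as "{0x1,0x2,...}".
 * Returns the string length, or -1 if el_size is not 1, 2, 4 or 8
 * or if the output buffer is too small.
 */
int format_array(char *out, size_t out_len, const void *buf,
		 size_t count, size_t el_size)
{
	const unsigned char *ptr = buf;
	const char *prefix = "";
	size_t used;
	size_t i;
	int n;

	if (el_size != 1 && el_size != 2 && el_size != 4 && el_size != 8)
		return -1;

	n = snprintf(out, out_len, "{");
	if (n < 0 || (size_t)n >= out_len)
		return -1;
	used = (size_t)n;

	for (i = 0; i < count; i++, ptr += el_size) {
		uint64_t val = 0;
		uint8_t v8;
		uint16_t v16;
		uint32_t v32;

		/* memcpy into a correctly sized temporary avoids unaligned reads */
		switch (el_size) {
		case 1: memcpy(&v8, ptr, 1); val = v8; break;
		case 2: memcpy(&v16, ptr, 2); val = v16; break;
		case 4: memcpy(&v32, ptr, 4); val = v32; break;
		default: memcpy(&val, ptr, 8); break;
		}

		n = snprintf(out + used, out_len - used, "%s0x%llx",
			     prefix, (unsigned long long)val);
		if (n < 0 || (size_t)n >= out_len - used)
			return -1;
		used += (size_t)n;
		prefix = ",";
	}

	n = snprintf(out + used, out_len - used, "}");
	if (n < 0 || (size_t)n >= out_len - used)
		return -1;

	return (int)(used + (size_t)n);
}
```

For a u32 array holding 1, 0x2a and 255 this produces {0x1,0x2a,0xff}.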

Cc: Steven Rostedt 
Cc: Ingo Molnar 
Signed-off-by: Dave Martin 
Signed-off-by: Javi Merino 
---
 include/linux/ftrace_event.h |  4 
 include/trace/ftrace.h   |  9 +
 kernel/trace/trace_output.c  | 44 
 3 files changed, 57 insertions(+)

diff --git a/include/linux/ftrace_event.h b/include/linux/ftrace_event.h
index 0bebb5c348b8..5aa4a9269547 100644
--- a/include/linux/ftrace_event.h
+++ b/include/linux/ftrace_event.h
@@ -44,6 +44,10 @@ const char *ftrace_print_bitmask_seq(struct trace_seq *p, void *bitmask_ptr,
 const char *ftrace_print_hex_seq(struct trace_seq *p,
 const unsigned char *buf, int len);
 
+const char *ftrace_print_array_seq(struct trace_seq *p,
+  const void *buf, int buf_len,
+  size_t el_size);
+
 struct trace_iterator;
 struct trace_event;
 
diff --git a/include/trace/ftrace.h b/include/trace/ftrace.h
index 139b5067345b..304901fc5f34 100644
--- a/include/trace/ftrace.h
+++ b/include/trace/ftrace.h
@@ -263,6 +263,14 @@
 #undef __print_hex
 #define __print_hex(buf, buf_len) ftrace_print_hex_seq(p, buf, buf_len)
 
+#undef __print_array
+#define __print_array(array, count, el_size)   \
+   ({  \
+   BUILD_BUG_ON(el_size != 1 && el_size != 2 &&\
+el_size != 4 && el_size != 8); \
+   ftrace_print_array_seq(p, array, count, el_size);   \
+   })
+
 #undef DECLARE_EVENT_CLASS
 #define DECLARE_EVENT_CLASS(call, proto, args, tstruct, assign, print) \
 static notrace enum print_line_t   \
@@ -674,6 +682,7 @@ static inline void ftrace_test_probe_##call(void)   \
 #undef __get_dynamic_array_len
 #undef __get_str
 #undef __get_bitmask
+#undef __print_array
 
 #undef TP_printk
 #define TP_printk(fmt, args...) "\"" fmt "\", "  __stringify(args)
diff --git a/kernel/trace/trace_output.c b/kernel/trace/trace_output.c
index b77b9a697619..692bf7184c8c 100644
--- a/kernel/trace/trace_output.c
+++ b/kernel/trace/trace_output.c
@@ -177,6 +177,50 @@ ftrace_print_hex_seq(struct trace_seq *p, const unsigned char *buf, int buf_len)
 }
 EXPORT_SYMBOL(ftrace_print_hex_seq);
 
+const char *
+ftrace_print_array_seq(struct trace_seq *p, const void *buf, int buf_len,
+  size_t el_size)
+{
+   const char *ret = trace_seq_buffer_ptr(p);
+   const char *prefix = "";
+   void *ptr = (void *)buf;
+
+   trace_seq_putc(p, '{');
+
+   while (ptr < buf + buf_len) {
+   switch (el_size) {
+   case 1:
+   trace_seq_printf(p, "%s0x%x", prefix,
+*(u8 *)ptr);
+   break;
+   case 2:
+   trace_seq_printf(p, "%s0x%x", prefix,
+*(u16 *)ptr);
+   break;
+   case 4:
+   trace_seq_printf(p, "%s0x%x", prefix,
+*(u32 *)ptr);
+   break;
+   case 8:
+   trace_seq_printf(p, "%s0x%llx", prefix,
+*(u64 *)ptr);
+   break;
+   default:
+   trace_seq_printf(p, "BAD SIZE:%zu 0x%x", el_size,
+*(u8 *)ptr);
+   el_size = 1;
+   }
+   prefix = ",";
+   ptr += el_size;
+   }
+
+   trace_seq_putc(p, '}');
+   trace_seq_putc(p, 0);
+
+   return ret;
+}
+EXPORT_SYMBOL(ftrace_print_array_seq);
+
 int f

[PATCH 0/4] Consolidate DIV_ROUND_CLOSEST_ULL()

2015-03-20 Thread Javi Merino

The kernel has grown a number of different implementations of
DIV_ROUND_CLOSEST_ULL().  That is, a macro that does the same as
DIV_ROUND_CLOSEST() but with the first operand being an unsigned long
long.  That means that you have to do the division using do_div()
instead of using the C division operator '/'.
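
As a minimal userspace sketch of the rounding itself (an illustration,
not the kernel macro): adding half of the divisor before dividing
turns truncating division into round-to-closest.  The function name is
hypothetical, and plain '/' is used because 64-bit division is
available in ordinary C.  The kernel version goes through do_div()
instead.

```c
#include <stdint.h>

/*
 * Divide a 64-bit unsigned dividend by a 32-bit divisor, rounding to
 * the closest integer (halves round up).  The caller must make sure
 * that divisor is non-zero and that x + divisor / 2 does not overflow.
 */
uint64_t div_round_closest_ull(uint64_t x, uint32_t divisor)
{
	return (x + divisor / 2) / divisor;
}
```

For example, 7 / 2 rounds to 4, 9 / 4 rounds to 2 and 10 / 4 rounds
to 3.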

This series moves the implementation in
drivers/gpu/drm/i915/intel_drv.h to linux/kernel.h and then removes
the other similar implementations of the same code in
drivers/clk/bcm/clk-kona.h, drivers/cpuidle/governors/menu.c and
drivers/media/dvb-frontends/cxd2820r_priv.h in favor of the one in
kernel.h.

Javi Merino (4):
  kernel.h: Implement DIV_ROUND_CLOSEST_ULL
  clk: bcm/kona: use DIV_ROUND_CLOSEST_ULL()
  cpuidle: menu: use DIV_ROUND_CLOSEST_ULL()
  media: cxd2820r: use DIV_ROUND_CLOSEST_ULL()

 drivers/clk/bcm/clk-kona.c  | 28 +++-
 drivers/clk/bcm/clk-kona.h  |  1 -
 drivers/cpuidle/governors/menu.c|  8 +---
 drivers/gpu/drm/i915/intel_drv.h|  4 +---
 drivers/media/dvb-frontends/cxd2820r_c.c|  2 +-
 drivers/media/dvb-frontends/cxd2820r_core.c |  6 --
 drivers/media/dvb-frontends/cxd2820r_priv.h |  2 --
 drivers/media/dvb-frontends/cxd2820r_t.c|  2 +-
 drivers/media/dvb-frontends/cxd2820r_t2.c   |  2 +-
 include/linux/kernel.h  | 11 +++
 10 files changed, 23 insertions(+), 43 deletions(-)

-- 
1.9.1

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[PATCH 2/4] clk: bcm/kona: use DIV_ROUND_CLOSEST_ULL()

2015-03-20 Thread Javi Merino

Now that the kernel provides DIV_ROUND_CLOSEST_ULL(), drop the internal
implementation and use the kernel one.

Cc: Mike Turquette 
Cc: Stephen Boyd 
Cc: Alex Elder 
Signed-off-by: Javi Merino 
---
I've only compile-tested this, I don't have the hardware to test it.

 drivers/clk/bcm/clk-kona.c | 28 +++-
 drivers/clk/bcm/clk-kona.h |  1 -
 2 files changed, 7 insertions(+), 22 deletions(-)

diff --git a/drivers/clk/bcm/clk-kona.c b/drivers/clk/bcm/clk-kona.c
index 05abae89262e..a0ef4f75d457 100644
--- a/drivers/clk/bcm/clk-kona.c
+++ b/drivers/clk/bcm/clk-kona.c
@@ -15,6 +15,7 @@
 #include "clk-kona.h"
 
 #include 
+#include 
 
 /*
  * "Policies" affect the frequencies of bus clocks provided by a
@@ -51,21 +52,6 @@ static inline u32 bitfield_replace(u32 reg_val, u32 shift, u32 width, u32 val)
 
 /* Divider and scaling helpers */
 
-/*
- * Implement DIV_ROUND_CLOSEST() for 64-bit dividend and both values
- * unsigned.  Note that unlike do_div(), the remainder is discarded
- * and the return value is the quotient (not the remainder).
- */
-u64 do_div_round_closest(u64 dividend, unsigned long divisor)
-{
-   u64 result;
-
-   result = dividend + ((u64)divisor >> 1);
-   (void)do_div(result, divisor);
-
-   return result;
-}
-
 /* Convert a divider into the scaled divisor value it represents. */
 static inline u64 scaled_div_value(struct bcm_clk_div *div, u32 reg_div)
 {
@@ -87,7 +73,7 @@ u64 scaled_div_build(struct bcm_clk_div *div, u32 div_value, u32 billionths)
combined = (u64)div_value * BILLION + billionths;
combined <<= div->u.s.frac_width;
 
-   return do_div_round_closest(combined, BILLION);
+   return DIV_ROUND_CLOSEST_ULL(combined, BILLION);
 }
 
 /* The scaled minimum divisor representable by a divider */
@@ -731,7 +717,7 @@ static unsigned long clk_recalc_rate(struct ccu_data *ccu,
scaled_rate = scale_rate(pre_div, parent_rate);
scaled_rate = scale_rate(div, scaled_rate);
scaled_div = divider_read_scaled(ccu, pre_div);
-   scaled_parent_rate = do_div_round_closest(scaled_rate,
+   scaled_parent_rate = DIV_ROUND_CLOSEST_ULL(scaled_rate,
scaled_div);
} else  {
scaled_parent_rate = scale_rate(div, parent_rate);
@@ -743,7 +729,7 @@ static unsigned long clk_recalc_rate(struct ccu_data *ccu,
 * rate.
 */
scaled_div = divider_read_scaled(ccu, div);
-   result = do_div_round_closest(scaled_parent_rate, scaled_div);
+   result = DIV_ROUND_CLOSEST_ULL(scaled_parent_rate, scaled_div);
 
return (unsigned long)result;
 }
@@ -790,7 +776,7 @@ static long round_rate(struct ccu_data *ccu, struct bcm_clk_div *div,
scaled_rate = scale_rate(pre_div, parent_rate);
scaled_rate = scale_rate(div, scaled_rate);
scaled_pre_div = divider_read_scaled(ccu, pre_div);
-   scaled_parent_rate = do_div_round_closest(scaled_rate,
+   scaled_parent_rate = DIV_ROUND_CLOSEST_ULL(scaled_rate,
scaled_pre_div);
} else {
scaled_parent_rate = scale_rate(div, parent_rate);
@@ -802,7 +788,7 @@ static long round_rate(struct ccu_data *ccu, struct bcm_clk_div *div,
 * the best we can do.
 */
if (!divider_is_fixed(div)) {
-   best_scaled_div = do_div_round_closest(scaled_parent_rate,
+   best_scaled_div = DIV_ROUND_CLOSEST_ULL(scaled_parent_rate,
rate);
min_scaled_div = scaled_div_min(div);
max_scaled_div = scaled_div_max(div);
@@ -815,7 +801,7 @@ static long round_rate(struct ccu_data *ccu, struct bcm_clk_div *div,
}
 
/* OK, figure out the resulting rate */
-   result = do_div_round_closest(scaled_parent_rate, best_scaled_div);
+   result = DIV_ROUND_CLOSEST_ULL(scaled_parent_rate, best_scaled_div);
 
if (scaled_div)
*scaled_div = best_scaled_div;
diff --git a/drivers/clk/bcm/clk-kona.h b/drivers/clk/bcm/clk-kona.h
index 2537b3072910..6849a64baf6d 100644
--- a/drivers/clk/bcm/clk-kona.h
+++ b/drivers/clk/bcm/clk-kona.h
@@ -503,7 +503,6 @@ extern struct clk_ops kona_peri_clk_ops;
 
 /* Externally visible functions */
 
-extern u64 do_div_round_closest(u64 dividend, unsigned long divisor);
 extern u64 scaled_div_max(struct bcm_clk_div *div);
 extern u64 scaled_div_build(struct bcm_clk_div *div, u32 div_value,
u32 billionths);
-- 
1.9.1


[PATCH 4/4] media: cxd2820r: use DIV_ROUND_CLOSEST_ULL()

2015-03-20 Thread Javi Merino

Now that the kernel provides DIV_ROUND_CLOSEST_ULL(), drop the internal
implementation and use the kernel one.

Cc: Antti Palosaari 
Cc: Mauro Carvalho Chehab 
Signed-off-by: Javi Merino 
---
I've only compile-tested it, I don't have the hardware to run it.

 drivers/media/dvb-frontends/cxd2820r_c.c| 2 +-
 drivers/media/dvb-frontends/cxd2820r_core.c | 6 --
 drivers/media/dvb-frontends/cxd2820r_priv.h | 2 --
 drivers/media/dvb-frontends/cxd2820r_t.c| 2 +-
 drivers/media/dvb-frontends/cxd2820r_t2.c   | 2 +-
 5 files changed, 3 insertions(+), 11 deletions(-)

diff --git a/drivers/media/dvb-frontends/cxd2820r_c.c b/drivers/media/dvb-frontends/cxd2820r_c.c
index 149fdca3fb44..72b0e2db3aab 100644
--- a/drivers/media/dvb-frontends/cxd2820r_c.c
+++ b/drivers/media/dvb-frontends/cxd2820r_c.c
@@ -79,7 +79,7 @@ int cxd2820r_set_frontend_c(struct dvb_frontend *fe)
 
num = if_freq / 1000; /* Hz => kHz */
num *= 0x4000;
-   if_ctl = 0x4000 - cxd2820r_div_u64_round_closest(num, 41000);
+   if_ctl = 0x4000 - DIV_ROUND_CLOSEST_ULL(num, 41000);
buf[0] = (if_ctl >> 8) & 0x3f;
buf[1] = (if_ctl >> 0) & 0xff;
 
diff --git a/drivers/media/dvb-frontends/cxd2820r_core.c b/drivers/media/dvb-frontends/cxd2820r_core.c
index 422e84bbb008..490e090048ef 100644
--- a/drivers/media/dvb-frontends/cxd2820r_core.c
+++ b/drivers/media/dvb-frontends/cxd2820r_core.c
@@ -244,12 +244,6 @@ error:
return ret;
 }
 
-/* 64 bit div with round closest, like DIV_ROUND_CLOSEST but 64 bit */
-u32 cxd2820r_div_u64_round_closest(u64 dividend, u32 divisor)
-{
-   return div_u64(dividend + (divisor / 2), divisor);
-}
-
 static int cxd2820r_set_frontend(struct dvb_frontend *fe)
 {
struct cxd2820r_priv *priv = fe->demodulator_priv;
diff --git a/drivers/media/dvb-frontends/cxd2820r_priv.h b/drivers/media/dvb-frontends/cxd2820r_priv.h
index 7ff5f60c83e1..4b428959b16e 100644
--- a/drivers/media/dvb-frontends/cxd2820r_priv.h
+++ b/drivers/media/dvb-frontends/cxd2820r_priv.h
@@ -64,8 +64,6 @@ int cxd2820r_wr_reg_mask(struct cxd2820r_priv *priv, u32 reg, u8 val,
 int cxd2820r_wr_regs(struct cxd2820r_priv *priv, u32 reginfo, u8 *val,
int len);
 
-u32 cxd2820r_div_u64_round_closest(u64 dividend, u32 divisor);
-
 int cxd2820r_wr_regs(struct cxd2820r_priv *priv, u32 reginfo, u8 *val,
int len);
 
diff --git a/drivers/media/dvb-frontends/cxd2820r_t.c b/drivers/media/dvb-frontends/cxd2820r_t.c
index 51401d036530..008cb2ac8480 100644
--- a/drivers/media/dvb-frontends/cxd2820r_t.c
+++ b/drivers/media/dvb-frontends/cxd2820r_t.c
@@ -103,7 +103,7 @@ int cxd2820r_set_frontend_t(struct dvb_frontend *fe)
 
num = if_freq / 1000; /* Hz => kHz */
num *= 0x100;
-   if_ctl = cxd2820r_div_u64_round_closest(num, 41000);
+   if_ctl = DIV_ROUND_CLOSEST_ULL(num, 41000);
buf[0] = ((if_ctl >> 16) & 0xff);
buf[1] = ((if_ctl >>  8) & 0xff);
buf[2] = ((if_ctl >>  0) & 0xff);
diff --git a/drivers/media/dvb-frontends/cxd2820r_t2.c b/drivers/media/dvb-frontends/cxd2820r_t2.c
index 9c0c4f42175c..35fe364c7182 100644
--- a/drivers/media/dvb-frontends/cxd2820r_t2.c
+++ b/drivers/media/dvb-frontends/cxd2820r_t2.c
@@ -120,7 +120,7 @@ int cxd2820r_set_frontend_t2(struct dvb_frontend *fe)
 
num = if_freq / 1000; /* Hz => kHz */
num *= 0x100;
-   if_ctl = cxd2820r_div_u64_round_closest(num, 41000);
+   if_ctl = DIV_ROUND_CLOSEST_ULL(num, 41000);
buf[0] = ((if_ctl >> 16) & 0xff);
buf[1] = ((if_ctl >>  8) & 0xff);
buf[2] = ((if_ctl >>  0) & 0xff);
-- 
1.9.1


[PATCH 3/4] cpuidle: menu: use DIV_ROUND_CLOSEST_ULL()

2015-03-20 Thread Javi Merino

Now that the kernel provides DIV_ROUND_CLOSEST_ULL(), drop the internal
implementation and use the kernel one.

Cc: "Rafael J. Wysocki" 
Cc: Mel Gorman 
Cc: Stephen Hemminger 
Signed-off-by: Javi Merino 
---
 drivers/cpuidle/governors/menu.c | 8 +---
 1 file changed, 1 insertion(+), 7 deletions(-)

diff --git a/drivers/cpuidle/governors/menu.c b/drivers/cpuidle/governors/menu.c
index 40580794e23d..b8a5fa15ca24 100644
--- a/drivers/cpuidle/governors/menu.c
+++ b/drivers/cpuidle/governors/menu.c
@@ -190,12 +190,6 @@ static DEFINE_PER_CPU(struct menu_device, menu_devices);
 
 static void menu_update(struct cpuidle_driver *drv, struct cpuidle_device *dev);
 
-/* This implements DIV_ROUND_CLOSEST but avoids 64 bit division */
-static u64 div_round64(u64 dividend, u32 divisor)
-{
-   return div_u64(dividend + (divisor / 2), divisor);
-}
-
 /*
  * Try detecting repeating patterns by keeping track of the last 8
  * intervals, and checking if the standard deviation of that set
@@ -317,7 +311,7 @@ static int menu_select(struct cpuidle_driver *drv, struct cpuidle_device *dev)
 * operands are 32 bits.
 * Make sure to round up for half microseconds.
 */
-   data->predicted_us = div_round64((uint64_t)data->next_timer_us *
+   data->predicted_us = DIV_ROUND_CLOSEST_ULL((uint64_t)data->next_timer_us *
 data->correction_factor[data->bucket],
 RESOLUTION * DECAY);
 
-- 
1.9.1


[PATCH 1/4] kernel.h: Implement DIV_ROUND_CLOSEST_ULL

2015-03-20 Thread Javi Merino

We have grown a number of different implementations of
DIV_ROUND_CLOSEST_ULL throughout the kernel.  Move the i915 one to
kernel.h so that it can be reused.

Cc: Daniel Vetter 
Cc: Jani Nikula 
Cc: David Airlie 
Cc: Darrick J. Wong 
Cc: Guenter Roeck 
Cc: Andrew Morton 
Signed-off-by: Javi Merino 
---
 drivers/gpu/drm/i915/intel_drv.h |  4 +---
 include/linux/kernel.h   | 11 +++
 2 files changed, 12 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
index eef79ccd0b7c..346e28fdd7dd 100644
--- a/drivers/gpu/drm/i915/intel_drv.h
+++ b/drivers/gpu/drm/i915/intel_drv.h
@@ -28,6 +28,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include "i915_drv.h"
 #include 
@@ -36,9 +37,6 @@
 #include 
 #include 
 
-#define DIV_ROUND_CLOSEST_ULL(ll, d)   \
-({ unsigned long long _tmp = (ll)+(d)/2; do_div(_tmp, d); _tmp; })
-
 /**
  * _wait_for - magic (register) wait macro
  *
diff --git a/include/linux/kernel.h b/include/linux/kernel.h
index d6d630d31ef3..f7d744e9d275 100644
--- a/include/linux/kernel.h
+++ b/include/linux/kernel.h
@@ -103,6 +103,17 @@
(((__x) - ((__d) / 2)) / (__d));\
 }  \
 )
+/*
+ * Same as above but for u64 dividends.  divisor must be a 32-bit
+ * number.
+ */
+#define DIV_ROUND_CLOSEST_ULL(x, divisor)( \
+{  \
+   unsigned long long _tmp = (x) + (divisor) / 2;  \
+   do_div(_tmp, divisor);  \
+   _tmp;   \
+}  \
+)
 
 /*
  * Multiplies an integer by a fraction, while avoiding unnecessary
-- 
1.9.1


Re: [PATCH 4/4] media: cxd2820r: use DIV_ROUND_CLOSEST_ULL()

2015-03-20 Thread Javi Merino

On Fri, Mar 20, 2015 at 01:51:36PM +0000, Alex Elder wrote:
> On 03/20/2015 06:14 AM, Javi Merino wrote:
> > Now that the kernel provides DIV_ROUND_CLOSEST_ULL(), drop the internal
> > implementation and use the kernel one.
> > 
> > Cc: Antti Palosaari 
> > Cc: Mauro Carvalho Chehab 
> > Signed-off-by: Javi Merino 
> > ---
> > I've only compile-tested it, I don't have the hardware to run it.
> > 
> >  drivers/media/dvb-frontends/cxd2820r_c.c| 2 +-
> >  drivers/media/dvb-frontends/cxd2820r_core.c | 6 --
> >  drivers/media/dvb-frontends/cxd2820r_priv.h | 2 --
> >  drivers/media/dvb-frontends/cxd2820r_t.c| 2 +-
> >  drivers/media/dvb-frontends/cxd2820r_t2.c   | 2 +-
> >  5 files changed, 3 insertions(+), 11 deletions(-)
> > 
> > diff --git a/drivers/media/dvb-frontends/cxd2820r_c.c b/drivers/media/dvb-frontends/cxd2820r_c.c
> > index 149fdca3fb44..72b0e2db3aab 100644
> > --- a/drivers/media/dvb-frontends/cxd2820r_c.c
> > +++ b/drivers/media/dvb-frontends/cxd2820r_c.c
> > @@ -79,7 +79,7 @@ int cxd2820r_set_frontend_c(struct dvb_frontend *fe)
> >  
> > num = if_freq / 1000; /* Hz => kHz */
> > num *= 0x4000;
> > -   if_ctl = 0x4000 - cxd2820r_div_u64_round_closest(num, 41000);
> > +   if_ctl = 0x4000 - DIV_ROUND_CLOSEST_ULL(num, 41000);
> > buf[0] = (if_ctl >> 8) & 0x3f;
> > buf[1] = (if_ctl >> 0) & 0xff;
> >  
> > diff --git a/drivers/media/dvb-frontends/cxd2820r_core.c b/drivers/media/dvb-frontends/cxd2820r_core.c
> > index 422e84bbb008..490e090048ef 100644
> > --- a/drivers/media/dvb-frontends/cxd2820r_core.c
> > +++ b/drivers/media/dvb-frontends/cxd2820r_core.c
> > @@ -244,12 +244,6 @@ error:
> > return ret;
> >  }
> >  
> > -/* 64 bit div with round closest, like DIV_ROUND_CLOSEST but 64 bit */
> > -u32 cxd2820r_div_u64_round_closest(u64 dividend, u32 divisor)
> > -{
> > -   return div_u64(dividend + (divisor / 2), divisor);
> > -}
> 
> Technically, I'd say this has a bug, because the result
> needs to be 64 bits wide or your results might be much
> different from what might be desired.
> 
> Practically though, I'm pretty sure all callers provide
> values that ensure the result is valid.

All the callers are substituted in this patch so we can make sure that
they are all correct.

> I only call attention because this patch changes the return
> type of the function that gets called to do the calculation.

I'm not sure I follow.  Do you mean that this:

if_ctl = 0x4000 - DIV_ROUND_CLOSEST_ULL(num, 41000);

Should actually be:

if_ctl = 0x4000 - (u32)DIV_ROUND_CLOSEST_ULL(num, 41000);

?

if_ctl is a u16 so I don't think you gain anything by doing that.
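
To illustrate the point about the cast, here is a minimal userspace
sketch.  The helper names if_ctl_plain() and if_ctl_with_cast() are made
up for this example and are not part of the driver; the only assumption
taken from the code above is that if_ctl is a u16.  Because the final
conversion keeps only the low 16 bits of the difference, an intermediate
(u32) cast cannot change the value that ends up in if_ctl.

```c
#include <stdint.h>

/* Hypothetical helper: subtract the u64 quotient directly, then narrow. */
static uint16_t if_ctl_plain(uint64_t quotient)
{
	/* Arithmetic is done modulo 2^64, the result is reduced modulo 2^16. */
	return (uint16_t)(0x4000 - quotient);
}

/* Hypothetical helper: narrow the quotient to u32 first, then to u16. */
static uint16_t if_ctl_with_cast(uint64_t quotient)
{
	/* Arithmetic is done modulo 2^32, the result is reduced modulo 2^16. */
	return (uint16_t)(0x4000 - (uint32_t)quotient);
}
```

Both helpers return the same value for any quotient, including ones
that do not fit in 32 bits, since each is (0x4000 - quotient) modulo
2^16.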

> > -
> >  static int cxd2820r_set_frontend(struct dvb_frontend *fe)
> >  {
> > struct cxd2820r_priv *priv = fe->demodulator_priv;
> > diff --git a/drivers/media/dvb-frontends/cxd2820r_priv.h b/drivers/media/dvb-frontends/cxd2820r_priv.h
> > index 7ff5f60c83e1..4b428959b16e 100644
> > --- a/drivers/media/dvb-frontends/cxd2820r_priv.h
> > +++ b/drivers/media/dvb-frontends/cxd2820r_priv.h
> > @@ -64,8 +64,6 @@ int cxd2820r_wr_reg_mask(struct cxd2820r_priv *priv, u32 reg, u8 val,
> >  int cxd2820r_wr_regs(struct cxd2820r_priv *priv, u32 reginfo, u8 *val,
> > int len);
> >  
> > -u32 cxd2820r_div_u64_round_closest(u64 dividend, u32 divisor);
> > -
> >  int cxd2820r_wr_regs(struct cxd2820r_priv *priv, u32 reginfo, u8 *val,
> > int len);
> >  
> > diff --git a/drivers/media/dvb-frontends/cxd2820r_t.c b/drivers/media/dvb-frontends/cxd2820r_t.c
> > index 51401d036530..008cb2ac8480 100644
> > --- a/drivers/media/dvb-frontends/cxd2820r_t.c
> > +++ b/drivers/media/dvb-frontends/cxd2820r_t.c
> > @@ -103,7 +103,7 @@ int cxd2820r_set_frontend_t(struct dvb_frontend *fe)
> >  
> > num = if_freq / 1000; /* Hz => kHz */
> > num *= 0x100;
> > -   if_ctl = cxd2820r_div_u64_round_closest(num, 41000);
> > +   if_ctl = DIV_ROUND_CLOSEST_ULL(num, 41000);

if_ctl is a u32, so you get the same behavior that you were getting
before: the downcasting of u64 to u32 happened in
cxd2820r_div_u64_round_closest(), now it happens here.
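
As a rough userspace sketch of that equivalence (plain functions stand
in for the removed helper and for the DIV_ROUND_CLOSEST_ULL() macro; the
names below are invented for the example and the kernel versions use
div_u64()/do_div() rather than a plain division):

```c
#include <stdint.h>

/* Stand-in for the removed driver helper: rounds to closest, returns u32. */
static uint32_t div_u64_round_closest_u32(uint64_t dividend, uint32_t divisor)
{
	/* The narrowing from u64 to u32 happens inside the helper. */
	return (uint32_t)((dividend + divisor / 2) / divisor);
}

/* Stand-in for DIV_ROUND_CLOSEST_ULL(): rounds to closest, returns u64. */
static uint64_t div_round_closest_ull(uint64_t dividend, uint32_t divisor)
{
	return (dividend + divisor / 2) / divisor;
}

/* Mirrors "if_ctl = DIV_ROUND_CLOSEST_ULL(num, 41000);" with a u32 if_ctl. */
static uint32_t if_ctl_from_macro(uint64_t num)
{
	/* The narrowing from u64 to u32 now happens at the assignment. */
	uint32_t if_ctl = div_round_closest_ull(num, 41000);

	return if_ctl;
}
```

For every num, if_ctl_from_macro(num) matches
div_u64_round_closest_u32(num, 41000); only the place where the upper
32 bits are dropped has moved.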

> > buf[0] = ((if_ctl >> 16) & 0xff);
> > buf[1] = ((if_ctl >>  8) & 0xff);
> > buf[2] = ((if_ctl >>  0) & 0xff);
> > diff --git a/drivers/media/dvb-frontends/cxd2820r_t2.c b/drivers/media/dvb-frontends/cxd2820r_t2.c
> > index 9c0c4f42175c..35fe364c7182

[RESEND PATCH v8 0/2] Add array printing support to libtraceevent

2015-03-20 Thread Javi Merino

This series adds support to libtraceevent for dynamic arrays in traces.
The kernel learned to create these traces in 6ea22486ba46 ("tracing: Add
array printing helper"), which was merged for v4.0-rc1.

Changes since v7[0]:
  - Call the fields of the struct print_arg_int_array "field", "count"
and "el_size" to match the definition of __print_array as Namhyung
Kim suggests.  Incorporate his Acked-by.

Changes since v6[1]:
  - s/alloc_and_process_arg/alloc_and_process_delim/ as Steven
Rostedt suggests

[0] http://thread.gmane.org/gmane.linux.kernel/1897935
[1] http://thread.gmane.org/gmane.linux.kernel/1896232

Javi Merino (2):
  tools lib traceevent: factor out allocating and processing args
  tools lib traceevent: Add support for __print_array()

 tools/lib/traceevent/event-parse.c | 158 +
 tools/lib/traceevent/event-parse.h |   8 ++
 2 files changed, 135 insertions(+), 31 deletions(-)

-- 
1.9.1


[RESEND PATCH v8 1/2] tools lib traceevent: factor out allocating and processing args

2015-03-20 Thread Javi Merino

The sequence of allocating the print_arg field, calling process_arg()
and verifying that the next event delimiter is repeated twice in
process_hex() and will also be used for process_int_array().  Factor it
out to a function to avoid writing the same code again and again.

Cc: Arnaldo Carvalho de Melo 
Cc: Jiri Olsa 
Acked-by: Steven Rostedt 
Acked-by: Namhyung Kim 
Signed-off-by: Javi Merino 
---
 tools/lib/traceevent/event-parse.c | 77 --
 1 file changed, 40 insertions(+), 37 deletions(-)

diff --git a/tools/lib/traceevent/event-parse.c b/tools/lib/traceevent/event-parse.c
index afe20ed9fac8..ac20601257de 100644
--- a/tools/lib/traceevent/event-parse.c
+++ b/tools/lib/traceevent/event-parse.c
@@ -2014,6 +2014,38 @@ process_entry(struct event_format *event __maybe_unused, struct print_arg *arg,
return EVENT_ERROR;
 }
 
+static int alloc_and_process_delim(struct event_format *event, char *next_token,
+  struct print_arg **print_arg)
+{
+   struct print_arg *field;
+   enum event_type type;
+   char *token;
+   int ret = 0;
+
+   field = alloc_arg();
+   if (!field) {
+   do_warning_event(event, "%s: not enough memory!", __func__);
+   errno = ENOMEM;
+   return -1;
+   }
+
+   type = process_arg(event, field, &token);
+
+   if (test_type_token(type, token, EVENT_DELIM, next_token)) {
+   errno = EINVAL;
+   ret = -1;
+   free_arg(field);
+   goto out_free_token;
+   }
+
+   *print_arg = field;
+
+out_free_token:
+   free_token(token);
+
+   return ret;
+}
+
 static char *arg_eval (struct print_arg *arg);
 
 static unsigned long long
@@ -2486,49 +2518,20 @@ out_free:
 static enum event_type
 process_hex(struct event_format *event, struct print_arg *arg, char **tok)
 {
-   struct print_arg *field;
-   enum event_type type;
-   char *token = NULL;
-
memset(arg, 0, sizeof(*arg));
arg->type = PRINT_HEX;
 
-   field = alloc_arg();
-   if (!field) {
-   do_warning_event(event, "%s: not enough memory!", __func__);
-   goto out_free;
-   }
-
-   type = process_arg(event, field, &token);
-
-   if (test_type_token(type, token, EVENT_DELIM, ","))
-   goto out_free;
-
-   arg->hex.field = field;
-
-   free_token(token);
-
-   field = alloc_arg();
-   if (!field) {
-   do_warning_event(event, "%s: not enough memory!", __func__);
-   *tok = NULL;
-   return EVENT_ERROR;
-   }
-
-   type = process_arg(event, field, &token);
-
-   if (test_type_token(type, token, EVENT_DELIM, ")"))
-   goto out_free;
+   if (alloc_and_process_delim(event, ",", &arg->hex.field))
+   goto out;
 
-   arg->hex.size = field;
+   if (alloc_and_process_delim(event, ")", &arg->hex.size))
+   goto free_field;
 
-   free_token(token);
-   type = read_token_item(tok);
-   return type;
+   return read_token_item(tok);
 
- out_free:
-   free_arg(field);
-   free_token(token);
+free_field:
+   free_arg(arg->hex.field);
+out:
*tok = NULL;
return EVENT_ERROR;
 }
-- 
1.9.1


[RESEND PATCH v8 2/2] tools lib traceevent: Add support for __print_array()

2015-03-20 Thread Javi Merino

Since 6ea22486ba46 ("tracing: Add array printing helper") trace can
generate traces with variable element size arrays.  Add support to
parse them.

Cc: Arnaldo Carvalho de Melo 
Cc: Jiri Olsa 
Acked-by: Steven Rostedt 
Acked-by: Namhyung Kim 
Signed-off-by: Javi Merino 
---
 tools/lib/traceevent/event-parse.c | 93 ++
 tools/lib/traceevent/event-parse.h |  8 
 2 files changed, 101 insertions(+)

diff --git a/tools/lib/traceevent/event-parse.c b/tools/lib/traceevent/event-parse.c
index ac20601257de..838405ece41d 100644
--- a/tools/lib/traceevent/event-parse.c
+++ b/tools/lib/traceevent/event-parse.c
@@ -758,6 +758,11 @@ static void free_arg(struct print_arg *arg)
free_arg(arg->hex.field);
free_arg(arg->hex.size);
break;
+   case PRINT_INT_ARRAY:
+   free_arg(arg->int_array.field);
+   free_arg(arg->int_array.count);
+   free_arg(arg->int_array.el_size);
+   break;
case PRINT_TYPE:
free(arg->typecast.type);
free_arg(arg->typecast.item);
@@ -2537,6 +2542,32 @@ out:
 }
 
 static enum event_type
+process_int_array(struct event_format *event, struct print_arg *arg, char **tok)
+{
+   memset(arg, 0, sizeof(*arg));
+   arg->type = PRINT_INT_ARRAY;
+
+   if (alloc_and_process_delim(event, ",", &arg->int_array.field))
+   goto out;
+
+   if (alloc_and_process_delim(event, ",", &arg->int_array.count))
+   goto free_field;
+
+   if (alloc_and_process_delim(event, ")", &arg->int_array.el_size))
+   goto free_size;
+
+   return read_token_item(tok);
+
+free_size:
+   free_arg(arg->int_array.count);
+free_field:
+   free_arg(arg->int_array.field);
+out:
+   *tok = NULL;
+   return EVENT_ERROR;
+}
+
+static enum event_type
 process_dynamic_array(struct event_format *event, struct print_arg *arg, char **tok)
 {
struct format_field *field;
@@ -2831,6 +2862,10 @@ process_function(struct event_format *event, struct print_arg *arg,
free_token(token);
return process_hex(event, arg, tok);
}
+   if (strcmp(token, "__print_array") == 0) {
+   free_token(token);
+   return process_int_array(event, arg, tok);
+   }
if (strcmp(token, "__get_str") == 0) {
free_token(token);
return process_str(event, arg, tok);
@@ -3359,6 +3394,7 @@ eval_num_arg(void *data, int size, struct event_format *event, struct print_arg
break;
case PRINT_FLAGS:
case PRINT_SYMBOL:
+   case PRINT_INT_ARRAY:
case PRINT_HEX:
break;
case PRINT_TYPE:
@@ -3769,6 +3805,54 @@ static void print_str_arg(struct trace_seq *s, void *data, int size,
}
break;
 
+   case PRINT_INT_ARRAY: {
+   void *num;
+   int el_size;
+
+   if (arg->int_array.field->type == PRINT_DYNAMIC_ARRAY) {
+   unsigned long offset;
+   struct format_field *field =
+   arg->int_array.field->dynarray.field;
+   offset = pevent_read_number(pevent,
+   data + field->offset,
+   field->size);
+   num = data + (offset & 0xffff);
+   } else {
+   field = arg->int_array.field->field.field;
+   if (!field) {
+   str = arg->int_array.field->field.name;
+   field = pevent_find_any_field(event, str);
+   if (!field)
+   goto out_warning_field;
+   arg->int_array.field->field.field = field;
+   }
+   num = data + field->offset;
+   }
+   len = eval_num_arg(data, size, event, arg->int_array.count);
+   el_size = eval_num_arg(data, size, event,
+  arg->int_array.el_size);
+   for (i = 0; i < len; i++) {
+   if (i)
+   trace_seq_putc(s, ' ');
+
+   if (el_size == 1) {
+   trace_seq_printf(s, "%u", *(uint8_t *)num);
+   } else if (el_size == 2) {
+   trace_seq_printf(s, "%u", *(uint16_t *)num);
+   } else if (el_size == 4) {
+   trace_seq_printf(s, "%u", *(uint32_t *)num);
+   } els

Re: [PATCH 1/4] kernel.h: Implement DIV_ROUND_CLOSEST_ULL

2015-03-20 Thread Javi Merino

On Fri, Mar 20, 2015 at 06:19:26PM +0000, Emil Velikov wrote:
> On 20 March 2015 at 11:14, Javi Merino  wrote:
> > We have grown a number of different implementations of
> > DIV_ROUND_CLOSEST_ULL throughout the kernel.  Move the i915 one to
> > kernel.h so that it can be reused.
> >
> > Cc: Daniel Vetter 
> > Cc: Jani Nikula 
> > Cc: David Airlie 
> > Cc: Darrick J. Wong 
> > Cc: Guenter Roeck 
> > Cc: Andrew Morton 
> > Signed-off-by: Javi Merino 
> > ---
> >  drivers/gpu/drm/i915/intel_drv.h |  4 +---
> >  include/linux/kernel.h   | 11 +++
> >  2 files changed, 12 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
> > index eef79ccd0b7c..346e28fdd7dd 100644
> > --- a/drivers/gpu/drm/i915/intel_drv.h
> > +++ b/drivers/gpu/drm/i915/intel_drv.h
> > @@ -28,6 +28,7 @@
> >  #include 
> >  #include 
> >  #include 
> > +#include 
> Hi Javi,
> 
> Small suggestion - can we include the header only where needed ?
> i915/intel_panel.c seems to be the only user of DIV_ROUND_CLOSEST
> which will need an update.
> 
> Somewhat trivial pick but it will prevent ~40 unnecessary dives in kernel.h.

Sure, I'll fix it in the next version of the series.

Cheers,
Javi


Re: [GIT PULL 0/4] perf/urgent fixes

2015-04-24 Thread Javi Merino

On Fri, Apr 24, 2015 at 03:02:18AM +0100, Namhyung Kim wrote:
> Hi Arnaldo,
> 
> I've set up some docker containers for build test, and found a couple
> of failures..  It seems David's kmem build fix ("perf kmem: Fix
> compiles on RHEL6/OL6") which is in your perf/core branch also needs
> to be in perf/urgent.  Sorry about the kmem breakages..
> 
> And I also found this..
> 
> 
> From 581ae7f48c89377755391c3f95637a1d48eefc73 Mon Sep 17 00:00:00 2001
> From: Namhyung Kim 
> Date: Fri, 24 Apr 2015 10:45:16 +0900
> Subject: [PATCH] tools lib traceevent: Fix build failure on 32-bit arch
> 
> In my i386 build, it failed like this:
> 
> CC   event-parse.o
>   event-parse.c: In function 'print_str_arg':
>   event-parse.c:3868:5: warning: format '%lu' expects argument of type 'long unsigned int',
> but argument 3 has type 'uint64_t' [-Wformat]
> 
> Cc: Javi Merino 
> Signed-off-by: Namhyung Kim 
> ---
>  tools/lib/traceevent/event-parse.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/tools/lib/traceevent/event-parse.c b/tools/lib/traceevent/event-parse.c
> index 12a7e2a40c89..aa21bd55bd8a 100644
> --- a/tools/lib/traceevent/event-parse.c
> +++ b/tools/lib/traceevent/event-parse.c
> @@ -3865,7 +3865,7 @@ static void print_str_arg(struct trace_seq *s, void *data, int size,
>   } else if (el_size == 4) {
>   trace_seq_printf(s, "%u", *(uint32_t *)num);
>   } else if (el_size == 8) {
> -  trace_seq_printf(s, "%lu", *(uint64_t *)num);
> +  trace_seq_printf(s, "%"PRIu64, *(uint64_t *)num);

Didn't know about PRIu64 and friends.  FWIW,

Acked-by: Javi Merino 

While you are at it, you could also fix the previous "%u" to "%"PRIu32
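
For reference, a small userspace sketch of what the <inttypes.h> macros
buy you.  The helper names format_u64() and format_u32() are invented
for this example.  "%lu" only matches uint64_t where long is 64 bits
wide, whereas PRIu64 and PRIu32 expand to the right conversion
specifier on each architecture:

```c
#include <inttypes.h>
#include <stdio.h>

/* Hypothetical helper: format a u64 portably on 32-bit and 64-bit builds. */
static int format_u64(char *buf, size_t len, uint64_t val)
{
	/* PRIu64 is a string literal such as "lu" or "llu", pasted after "%". */
	return snprintf(buf, len, "%" PRIu64, val);
}

/* Hypothetical helper: the same idea for u32 values. */
static int format_u32(char *buf, size_t len, uint32_t val)
{
	return snprintf(buf, len, "%" PRIu32, val);
}
```

The macros rely on string literal concatenation, which is why they are
written outside the quotes in the format string.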

>   } else {
>   trace_seq_printf(s, "BAD SIZE:%d 0x%x",
>el_size, *(uint8_t *)num);
> -- 
> 2.3.4
> 
> 
> Thanks,
> Namhyung
> 
> 
> On Thu, Apr 23, 2015 at 07:03:06PM -0300, Arnaldo Carvalho de Melo wrote:
> > Hi Ingo,
> > 
> > Please consider pulling,
> > 
> > - Arnaldo
> > 
> > The following changes since commit 0140e6141e4f1d4b15fb469e6912b0e71b7d1cc2:
> > 
> >   perf/x86/intel/uncore: Move PCI IDs for IMC to uncore driver (2015-04-22 08:29:19 +0200)
> > 
> > are available in the git repository at:
> > 
> >   git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git tags/perf-urgent-for-mingo
> > 
> > for you to fetch changes up to de28c15daf60e9625bece22f13a091fac8d05f1d:
> > 
> >   tools lib api: Undefine _FORTIFY_SOURCE before setting it (2015-04-23 17:08:23 -0300)
> > 
> > 
> > perf/urgent fixes:
> > 
> > User visible:
> > 
> > - Enable events when doing system wide 'trace' and starting a
> >   workload, e.g:
> > 
> ># trace -a sleep 1
> > 
> >   now it matches the pattern in 'record' and will show events
> >   instead of sitting doing nothing while waiting for the started
> >   workload to finish (Arnaldo Carvalho de Melo)
> > 
> > - Disable and drain events when forked 'trace' workload ends
> >   making sure we notice the end of the workload instead of
> >   trying to keep up with the seemingly neverending flux of
> >   system wide events (Arnaldo Carvalho de Melo)
> > 
> > Infrastructure:
> > 
> > - Fix the build on 32-bit ARM by consistently using PRIu64 for printing u64
> >   values in 'perf kmem' (Will Deacon)
> > 
> > - Undefine _FORTIFY_SOURCE before setting it in tools/perf/api, fixing the
> >   build on Hardened Gentoo systems (Bobby Powers)
> > 
> > Signed-off-by: Arnaldo Carvalho de Melo 
> > 
> > 
> > Arnaldo Carvalho de Melo (2):
> >   perf trace: Enable events when doing system wide tracing and starting a workload
> >   perf trace: Disable events and drain events when forked workload ends
> > 
> > Bobby Powers (1):
> >   tools lib api: Undefine _FORTIFY_SOURCE before setting it
> > 
> > Will Deacon (1):
> >   perf kmem: Consistently use PRIu64 for printing u64 values
> > 
> >  tools/lib/api/Makefile |  2 +-
> >  tools/perf/builtin-kmem.c  |  4 ++--
> >  tools/perf/builtin-trace.c | 10 --
> >  3 files changed, 11 insertions(+), 5 deletions(-)


Re: [GIT PULL] Thermal-SoC management updates for v4.1-rc1

2015-04-27 Thread Javi Merino

On Wed, Apr 15, 2015 at 06:48:20AM +0100, Eduardo Valentin wrote:
> Hello Rui,
> 
> Please pull from
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/evalenti/linux-soc-thermal linus
> 
> to receive Thermal-SoC Management updates for v4.1-rc1 with top-most
> 
> 55920e072776533fd314fb3d9b69c866ed90b3df:
> 
>   thermal: exynos: Add the support for Exynos5433 TMU (2015-04-14 22:31:17 -0700)
> 
> on top of commit f8b3d8a5af7559a58613384cd23fc03a3c787acf:
> 
>   Merge tag 'usb-4.0-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb (2015-04-04 12:26:28 -0700)
> 
> Specifics:
> - Exynos thermal driver learns how to handle Exynos5433 TMU. Thanks to
>   Chanwoo C.;
> - Thermal Framework now supports QPNP PMIC temperature alarm as a new thermal
>   driver. Thanks to Ivan T. I.;
> - TI thermal driver now has a better implementation for EOCZ bit. Thanks to
>   Pavel M.;
> - Thermal Framework now has learned several new capabilities:
>   . use power estimates
>   . compute weights with relative integers instead of percentages
>   . allow governors to have private data in thermal zones
>   . export thermal zone parameters through sysfs
>   Thanks to the ARM thermal team (Javi M., Punit A., and KP).
> - Thermal Framework earns a new thermal governor: power allocator. First in
>   kernel closed loop PI(D) controller for thermal control. Thanks to ARM
>   thermal team.
> - OF thermal now allows thermal zones to have sustainable power HW
>   specification. Thanks to Punit.

4.1-rc1 is out and it looks like this fell through the cracks.  If
there isn't any objection, can any of you send the pull request to
Linus?

Thanks,
Javi


Re: [PATCH v2] tracing: make ftrace_print_array_seq compute buf_len

2015-04-29 Thread Javi Merino

On Wed, Apr 29, 2015 at 05:06:22PM +0100, Steven Rostedt wrote:
> On Wed, 29 Apr 2015 17:02:50 +0100
> Alex Bennée  wrote:
> 
> > 
> > Steven Rostedt  writes:
> > 
> > > On Wed, 29 Apr 2015 16:18:46 +0100
> > > Alex Bennée  wrote:
> > >
> > >> The only caller to this function (__print_array) was getting it wrong by
> > >> passing the array length instead of buffer length. As the element size
> > >> was already being passed for other reasons it seems reasonable to push
> > >> the calculation of buffer length into the function.
> > >> 
> > >> Signed-off-by: Alex Bennée 
> > >
> > > Thanks, I'll add a stable tag to this too, and get it out soon.
> > 
> > I take it you'll pick up Dave's reviewed-by?
> 
> Yep, I will.
> 
> > 
> > As for CC'ing stable I wouldn't worry too much as nothing in the kernel
> > uses __print_array yet (unless you count the example). But it is a
> > fairly trivial patch so if you as the maintainer are happy I'm happy ;-)
> 
> OK, if it's not used, then I'll just add it to this release.

Our maintainer missed the merge window (sigh) so the patches that were
going to use this will have to wait until Linux v4.2.  So there will be
users in the future, but there's no need for this to go to stable.
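
For anyone following along, the fix under discussion boils down to
deriving the buffer length in bytes from the element count and the
element size inside the function, instead of trusting the caller to
pass it.  A hypothetical userspace sketch of that idea (print_array()
and its output format are made up here, this is not the kernel's
ftrace_print_array_seq()):

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical sketch: format "count" elements of "el_size" bytes each
 * into "out", separated by spaces.  Returns the number of elements
 * written, or -1 on an unsupported element size or a full buffer.
 */
static int print_array(char *out, size_t out_len, const void *buf,
		       size_t count, size_t el_size)
{
	const unsigned char *ptr = buf;
	size_t buf_len = count * el_size;	/* length in bytes, not elements */
	size_t used = 0;
	int printed = 0;

	if (out_len == 0)
		return -1;
	out[0] = '\0';

	for (size_t off = 0; off < buf_len; off += el_size) {
		uint8_t v8;
		uint16_t v16;
		uint32_t v32;
		uint64_t val;
		int n;

		/* memcpy() avoids unaligned loads from the raw buffer. */
		switch (el_size) {
		case 1: memcpy(&v8, ptr + off, 1); val = v8; break;
		case 2: memcpy(&v16, ptr + off, 2); val = v16; break;
		case 4: memcpy(&v32, ptr + off, 4); val = v32; break;
		case 8: memcpy(&val, ptr + off, 8); break;
		default: return -1;
		}

		n = snprintf(out + used, out_len - used, "%s%llu",
			     printed ? " " : "", (unsigned long long)val);
		if (n < 0 || (size_t)n >= out_len - used)
			return -1;
		used += (size_t)n;
		printed++;
	}

	return printed;
}
```

If the loop bound were the element count rather than count * el_size,
an array of 2-byte or wider elements would be cut short, which is the
kind of mistake the patch removes from the caller.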

Cheers,
Javi

Re: [PATCH v5 3/3] tools lib traceevent: Add support for __print_array()

2015-06-04 Thread Javi Merino

Hi Steve,

On Fri, Feb 27, 2015 at 02:15:05PM +0000, Steven Rostedt wrote:
> On Fri, 27 Feb 2015 12:32:32 +0000
> Javi Merino  wrote:
> > On Wed, Jan 28, 2015 at 12:48:55PM +0000, Javi Merino wrote:
> > > Trace can now generate traces with variable element size arrays.  Add
> > > support to parse them.
> > > 
> > > Cc: Namhyung Kim 
> > > Cc: Arnaldo Carvalho de Melo 
> > > Cc: Steven Rostedt 
> > > Cc: Jiri Olsa 
> > > Signed-off-by: Javi Merino 
> > > ---
> > >  tools/lib/traceevent/event-parse.c | 93 ++
> > >  tools/lib/traceevent/event-parse.h |  8 
> > >  2 files changed, 101 insertions(+)
> > 
> > I've seen that patch 1 of this series is now in mainline.  What about
> > patches 2 and 3 (the updates to tools/lib/traceevent)?  Shall I resend
> > them?
> 
> Patches 2 and 3 are in tools/lib and need to go through Jiri and
> Arnaldo. Please repost them again. I can give them acks.
> 
> > 
> > These two patches should also be applied to trace-cmd.  Do you want me
> > to send patches for that to linux-kernel or will you take care of
> > applying them there?
> 
> No need, I can pull them from here. I've just been a bit busy to do so.

These two patches (b839e1e846ed ("tools lib traceevent: Add support
for __print_array()") and 929a6bb71aa5 ("tools lib traceevent: Factor
out allocating and processing args")) went in as of 4.1-rc1.  Any
estimate of when we can have a trace-cmd that has them?

Thanks,
Javi


Re: [RESEND PATCH] MAINTAINERS: update thermal CPU cooling section

2021-04-10 Thread Javi Merino

On Fri, Apr 02, 2021 at 12:53:08PM +0200, Daniel Lezcano wrote:
> On 02/04/2021 12:25, Lukasz Luba wrote:
> > Hi Viresh, Daniel
> > 
> > On 2/18/21 4:18 AM, Viresh Kumar wrote:
> >> On 17-02-21, 11:59, Lukasz Luba wrote:
> >>> Update maintainers responsible for CPU cooling on Arm side.
> >>>
> >>> Signed-off-by: Lukasz Luba 
> >>> ---
> >>> Hi Daniel,
> >>>
> >>> Please ignore the previous email and take this change with 'R'.
> >>> Javi will ack it later.
> >>>
> >>> Regards,
> >>> Lukasz
> >>>
> >>>   MAINTAINERS | 2 +-
> >>>   1 file changed, 1 insertion(+), 1 deletion(-)
> >>>
> >>> diff --git a/MAINTAINERS b/MAINTAINERS
> >>> index f32ebcff37d2..fe34f56acb0f 100644
> >>> --- a/MAINTAINERS
> >>> +++ b/MAINTAINERS
> >>> @@ -17774,7 +17774,7 @@ THERMAL/CPU_COOLING
> >>>   M:	Amit Daniel Kachhap 
> >>>   M:	Daniel Lezcano 
> >>>   M:	Viresh Kumar 
> >>> -M:	Javi Merino 
> >>> +R:	Lukasz Luba 
> >>>   L:	linux...@vger.kernel.org
> >>>   S:	Supported
> >>>   F:	Documentation/driver-api/thermal/cpu-cooling-api.rst
> >>
> >> Good that we have one more reviewer for this :)
> >>
> >> Acked-by: Viresh Kumar 
> >>
> > 
> > I believe it has got lost somewhere in people's mailboxes.
> > 
> > Thank you Viresh for the ACK.
> > 
> > Could you Daniel (or you Viresh) take this patch, please?
> 
> I was expecting Javi to ack it.

I did, but it looks like my replies never made it to the mailing
list.  Anyway, here it is:

Acked-by: Javi Merino 



Re: [PATCH] thermal: tell cooling devices when a trip_point changes

2014-07-31 Thread Javi Merino

On Thu, Jul 31, 2014 at 12:10:40AM +0100, Matt Longnecker wrote:
> Some hardware can react autonomously at a programmed temperature.
> For example, an SoC might implement a last ditch throttle or a
> hardware thermal shutdown. The driver for such a device can
> register itself as a cooling_device with the thermal framework.
> 
> With this change, the thermal framework notifies such a driver
> when userspace alters the relevant trip temperature so that
> the driver can reprogram its hardware

Why can't you just use the existing cooling device interface?  Cooling
devices can be bound to trip points.  Most thermal governors will
increase cooling for that cooling device when the trip point is hit.
The last ditch throttle or hardware thermal shutdown will then kick
in when the cooling state changes to 1.
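
To make the suggestion concrete, here is a deliberately simplified
userspace model of the idea.  None of these types or functions are the
kernel's thermal API; the names cooling_device, trip_point,
cdev_set_cur_state() and governor_throttle() are invented for the
sketch.  A cooling device with states 0 and 1 is bound to a trip point,
and a toy governor moves it to state 1 once the trip temperature is
reached, at which point the driver would arm its hardware throttle:

```c
/* Hypothetical model of a cooling device with a hardware action. */
struct cooling_device {
	unsigned long cur_state;
	unsigned long max_state;
	int hw_throttle_enabled;	/* stands in for programming the hardware */
};

/* Hypothetical model of a trip point with one bound cooling device. */
struct trip_point {
	int temp_mc;			/* trip temperature in millicelsius */
	struct cooling_device *cdev;
};

/* Driver side: react to a cooling state change requested by the governor. */
static void cdev_set_cur_state(struct cooling_device *cdev, unsigned long state)
{
	if (state > cdev->max_state)
		state = cdev->max_state;

	cdev->cur_state = state;
	cdev->hw_throttle_enabled = (state >= 1);
}

/* Toy governor: state 1 at or above the trip temperature, 0 below it. */
static void governor_throttle(struct trip_point *trip, int temp_mc)
{
	cdev_set_cur_state(trip->cdev, temp_mc >= trip->temp_mc ? 1 : 0);
}
```

In this model a change of the trip temperature needs no extra
notification: the next governor run simply compares against the new
value.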

If the existing governors are too complex for what you want, you can
have a look at the bang bang governor[0] which (I think) is bound to
be merged soon.

[0] http://article.gmane.org/gmane.linux.kernel/1753348

Cheers,
Javi


checkpatch: false positives when parsing trace includes

2014-04-29 Thread Javi Merino

Hi,

checkpatch complains about the spaces before the close parenthesis in
trace events:

ERROR: space prohibited before that close parenthesis ')'
#94: FILE: include/trace/events/thermal.h:14:
+   __field(unsigned int,  freq  )

However, in that directory, that's actually the norm, not the
exception:

$ git grep '__field(' include/trace/events/ | grep -P '[ \t]+\)' | wc -l
1284
$ git grep '__field(' include/trace/events/ | wc -l
1783
$

More than 70% of the __field() entries *have* spaces before the close
parenthesis.  Should checkpatch make an exception for this directory
and not flag it as an error?
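
The "more than 70%" figure comes straight from the two counts above.
As a worked check, using a small helper whose name percent_x100() is
made up for this example:

```c
/*
 * Hypothetical helper: share of "part" in "total", in hundredths of a
 * percent, rounded to closest using integer arithmetic only.
 */
static unsigned int percent_x100(unsigned int part, unsigned int total)
{
	if (total == 0)
		return 0;

	return (part * 10000u + total / 2) / total;
}
```

percent_x100(1284, 1783) evaluates to 7201, i.e. about 72.01% of the
__field() entries have whitespace before the close parenthesis.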

Cheers,
Javi


[PATCH] thermal: document struct thermal_zone_device and thermal_governor

2014-05-16 Thread Javi Merino

Document the fields of struct thermal_zone_device and struct
thermal_governor and their use by the thermal framework code.

Cc: Zhang Rui 
Cc: Eduardo Valentin 
Signed-off-by: Javi Merino 

---

Hi linux-pm,

I have some patches that add new fields to these structures but I
don't have a good place to describe those fields as these structs are
mostly undocumented so I thought I'd document them.

I'm unsure about some of the descriptions, especially for passive and
forced_passive, so please review them.

 include/linux/thermal.h |   44 ++--
 1 file changed, 42 insertions(+), 2 deletions(-)

diff --git a/include/linux/thermal.h b/include/linux/thermal.h
index f7e11c7ea7d9..af928c667dba 100644
--- a/include/linux/thermal.h
+++ b/include/linux/thermal.h
@@ -158,6 +158,40 @@ struct thermal_attr {
char name[THERMAL_NAME_LENGTH];
 };
 
+/**
+ * struct thermal_zone_device - structure for a thermal zone
+ * @id:		unique id number for each thermal zone
+ * @type:	the thermal zone device type
+ * @device:	struct device for this thermal zone
+ * @trip_temp_attrs:	attributes for trip points for sysfs: trip temperature
+ * @trip_type_attrs:	attributes for trip points for sysfs: trip type
+ * @trip_hyst_attrs:	attributes for trip points for sysfs: trip hysteresis
+ * @devdata:	private pointer for device private data
+ * @trips:	number of trip points the thermal zone supports
+ * @passive_delay:	number of milliseconds to wait between polls when
+ *			performing passive cooling.  Only used by the step-wise
+ *			governor
+ * @polling_delay:	number of milliseconds to wait between polls when
+ *			checking whether trip points have been crossed (0 for
+ *			interrupt driven systems)
+ * @temperature:	current temperature.  This is only for core code,
+ *			drivers should use thermal_zone_get_temp() to get the
+ *			current temperature
+ * @last_temperature:	previous temperature read
+ * @emul_temperature:	emulated temperature when using CONFIG_THERMAL_EMULATION
+ * @passive:		step-wise specific parameter.  1 if you've crossed a passive
+ *			trip point, 0 otherwise
+ * @forced_passive:	step-wise specific parameter.  If > 0, temperature at
+ *			which to switch on all cpufreq cooling devices.
+ * @ops:	operations this thermal_zone_device supports
+ * @tzp:	thermal zone parameters
+ * @governor:	pointer to the governor for this thermal zone
+ * @thermal_instances:	list of struct thermal_instance of this thermal zone
+ * @idr:	struct idr to generate unique id for this zone's cooling devices
+ * @lock:	lock to protect thermal_instances list
+ * @node:	node in thermal_tz_list (in thermal_core.c)
+ * @poll_queue:	delayed work for polling
+ */
 struct thermal_zone_device {
int id;
char type[THERMAL_NAME_LENGTH];
@@ -179,12 +213,18 @@ struct thermal_zone_device {
struct thermal_governor *governor;
struct list_head thermal_instances;
struct idr idr;
-   struct mutex lock; /* protect thermal_instances list */
+   struct mutex lock;
struct list_head node;
struct delayed_work poll_queue;
 };
 
-/* Structure that holds thermal governor information */
+/**
+ * struct thermal_governor - structure that holds thermal governor information
+ * @name:  name of the governor
+ * @throttle:  callback called for every trip point even if temperature is
+ * below the trip point temperature
+ * @governor_list: node in thermal_governor_list (in thermal_core.c)
+ */
 struct thermal_governor {
char name[THERMAL_NAME_LENGTH];
int (*throttle)(struct thermal_zone_device *tz, int trip);
-- 
1.7.9.5



Re: [RFC PATCH 2/5] thermal: cpu_cooling: Add notifications support for the clients

2014-05-16 Thread Javi Merino

Hi Amit,

On Thu, May 08, 2014 at 03:37:57PM +0100, Amit Daniel Kachhap wrote:
> This patch adds notification support for those clients of cpu_cooling
> APIs which may want to do something interesting after receiving these
> cpu_cooling events. The notifier structure passed is of both Set/Get type.
> The notification events can be of the following types:
> 1. CPU_COOLING_SET_STATE_PRE
> 2. CPU_COOLING_SET_STATE_POST
> 3. CPU_COOLING_GET_CUR_STATE
> 4. CPU_COOLING_GET_MAX_STATE
> 
> The advantage of these notifications is that they differentiate between
> the P states in the cpufreq table and the cooling states. The clients of
> these events may group a few P states into one cooling state. Also, some
> more cooling states can be enabled when the maximum P state is reached.
> Post notifications can be used for those cases.
> 
> Signed-off-by: Amit Daniel Kachhap 
> ---
>  drivers/thermal/cpu_cooling.c |   99 +++-
>  include/linux/cpu_cooling.h   |   55 +++
>  2 files changed, 151 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/thermal/cpu_cooling.c b/drivers/thermal/cpu_cooling.c
> index 21f44d4..e2aeb36 100644
> --- a/drivers/thermal/cpu_cooling.c
> +++ b/drivers/thermal/cpu_cooling.c
> @@ -50,6 +50,7 @@ struct cpufreq_cooling_device {
>   unsigned int cpufreq_state;
>   unsigned int cpufreq_val;
>   struct cpumask allowed_cpus;
> + struct cpufreq_cooling_status request_status;
>   struct list_head node;
>  };
>  static DEFINE_IDR(cpufreq_idr);
> @@ -59,6 +60,8 @@ static DEFINE_MUTEX(cooling_cpufreq_lock);
>  #define NOTIFY_INVALID NULL
>  static struct cpufreq_cooling_device *notify_device;
>  
> +/* Notifier list to validate/update the cpufreq cooling states */
> +static BLOCKING_NOTIFIER_HEAD(cpufreq_cooling_state_notifier_list);
>  /* A list to hold all the cpufreq cooling devices registered */
>  static LIST_HEAD(cpufreq_cooling_list);
>  
> @@ -266,6 +269,21 @@ static unsigned int get_cpu_frequency(unsigned int cpu, unsigned long level)
>   return freq;
>  }
>  
> +static int
> +cpufreq_cooling_notify_states(struct cpufreq_cooling_status *request,
> + enum cpu_cooling_state_ops op)
> +{
> + /* Invoke the notifiers which have registered for this state change */
> + if (op == CPU_COOLING_SET_STATE_PRE ||
> + op == CPU_COOLING_SET_STATE_POST ||
> + op == CPU_COOLING_GET_MAX_STATE ||
> + op == CPU_COOLING_GET_CUR_STATE) {
> + blocking_notifier_call_chain(
> + &cpufreq_cooling_state_notifier_list, op, request);
> + }
> + return 0;
> +}
> +
>  /**
>   * cpufreq_apply_cooling - function to apply frequency clipping.
>   * @cpufreq_device: cpufreq_cooling_device pointer containing frequency
> @@ -285,9 +303,18 @@ static int cpufreq_apply_cooling(struct cpufreq_cooling_device *cpufreq_device,
>   struct cpumask *mask = &cpufreq_device->allowed_cpus;
>   unsigned int cpu = cpumask_any(mask);
>  
> + cpufreq_device->request_status.cur_state =
> + cpufreq_device->cpufreq_state;
> + cpufreq_device->request_status.new_state = cooling_state;
> +
> + cpufreq_cooling_notify_states(&cpufreq_device->request_status,
> + CPU_COOLING_SET_STATE_PRE);
> +
> + cooling_state = cpufreq_device->request_status.new_state;
>  
>   /* Check if the old cooling action is same as new cooling action */
> - if (cpufreq_device->cpufreq_state == cooling_state)
> + if (cpufreq_device->cpufreq_state ==
> + cpufreq_device->request_status.new_state)
>   return 0;
>  
>   clip_freq = get_cpu_frequency(cpu, cooling_state);
> @@ -304,7 +331,8 @@ static int cpufreq_apply_cooling(struct cpufreq_cooling_device *cpufreq_device,
>   }
>  
>   notify_device = NOTIFY_INVALID;
> -
> + cpufreq_cooling_notify_states(&cpufreq_device->request_status,
> + CPU_COOLING_SET_STATE_POST);
>   return 0;
>  }
>  
> @@ -383,6 +411,11 @@ static int cpufreq_get_max_state(struct thermal_cooling_device *cdev,
>   if (count > 0)
>   *state = count;
>  
> + cpufreq_device->request_status.max_state = count;
> + cpufreq_cooling_notify_states(&cpufreq_device->request_status,
> + CPU_COOLING_GET_MAX_STATE);
> + *state = cpufreq_device->request_status.max_state;
> +

I think this should all be inside the "if (count > 0)".  If not, then
remove the "if (count > 0)" assignment: *state is now overwritten
unconditionally right after it, so it is dead code.
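
A minimal user-space sketch of the first option, to show what I mean.
The types are simplified stand-ins for the ones in the patch: struct
cooling_status, notify_fn and cap_to_two() are hypothetical, and the
0/-1 return convention is an assumption made for this example, not what
cpufreq_get_max_state() returns.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct cpufreq_cooling_status. */
struct cooling_status {
	unsigned long max_state;
};

/* Hypothetical stand-in for the notifier chain: one optional callback. */
typedef void (*notify_fn)(struct cooling_status *status);

/* Hypothetical client that limits the number of cooling states to 2. */
static void cap_to_two(struct cooling_status *status)
{
	if (status->max_state > 2)
		status->max_state = 2;
}

/*
 * Tail of a get_max_state()-like function with the notification moved
 * inside the "count > 0" branch, so *state is only written when there
 * is a valid count to report.
 */
static int get_max_state(unsigned long count, struct cooling_status *status,
			 notify_fn notify, unsigned long *state)
{
	assert(status && state);

	if (count > 0) {
		status->max_state = count;
		if (notify)
			notify(status);	/* a client may adjust max_state */
		*state = status->max_state;
		return 0;
	}

	return -1;	/* nothing valid to report, *state left untouched */
}
```

With this layout a zero count never reaches the notifiers and never
clobbers *state.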

Cheers,
Javi


[RFC PATCH v2 3/7] thermal: let governors have private data for each thermal zone

2014-05-20 Thread Javi Merino

A governor may need to store its current state between calls to
throttle().  That state depends on the thermal zone, so store it as
private data in struct thermal_zone_device.

Governors may now provide two new ops: bind_to_tz() and unbind_from_tz().
When provided, these functions let a governor do initialization when it
is bound to a thermal zone and teardown when it is unbound from it,
possibly storing that state in the governor_data field of struct
thermal_zone_device.

Cc: Zhang Rui 
Cc: Eduardo Valentin 
Signed-off-by: Javi Merino 
---
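
For illustration only (not part of the patch): a user-space sketch of
how a governor could use the two ops to keep per-zone state.  The
struct names are simplified stand-ins for struct thermal_governor and
struct thermal_zone_device, struct demo_state and the demo_* functions
are hypothetical, and set_governor() only mirrors the flow of
thermal_set_governor() below without the rebind of the previous
governor on failure.

```c
#include <assert.h>
#include <stdlib.h>

struct thermal_zone;	/* stand-in for struct thermal_zone_device */

/* Stand-in for struct thermal_governor with the two new ops. */
struct governor {
	const char *name;
	int (*bind_to_tz)(struct thermal_zone *tz);
	void (*unbind_from_tz)(struct thermal_zone *tz);
};

struct thermal_zone {
	struct governor *governor;
	void *governor_data;	/* per-zone state owned by the governor */
};

/* Hypothetical state a governor keeps between throttle() calls. */
struct demo_state {
	int last_trip;
};

static int demo_bind(struct thermal_zone *tz)
{
	struct demo_state *st = calloc(1, sizeof(*st));

	if (!st)
		return -1;	/* kernel code would return -ENOMEM */
	tz->governor_data = st;
	return 0;
}

static void demo_unbind(struct thermal_zone *tz)
{
	free(tz->governor_data);
	tz->governor_data = NULL;
}

static struct governor demo_governor = {
	.name = "demo",
	.bind_to_tz = demo_bind,
	.unbind_from_tz = demo_unbind,
};

/* Unbind the old governor, bind the new one, then switch over. */
static int set_governor(struct thermal_zone *tz, struct governor *new_gov)
{
	int ret = 0;

	assert(tz);
	if (tz->governor && tz->governor->unbind_from_tz)
		tz->governor->unbind_from_tz(tz);

	if (new_gov && new_gov->bind_to_tz) {
		ret = new_gov->bind_to_tz(tz);
		if (ret)
			return ret;
	}

	tz->governor = new_gov;
	return ret;
}
```

Setting the governor to NULL goes through the same path, so the
private data is released whenever the governor is unbound.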
 drivers/thermal/thermal_core.c |   83 
 include/linux/thermal.h|9 +
 2 files changed, 84 insertions(+), 8 deletions(-)

diff --git a/drivers/thermal/thermal_core.c b/drivers/thermal/thermal_core.c
index 71b0ec0c370d..1b13d8e0cfd1 100644
--- a/drivers/thermal/thermal_core.c
+++ b/drivers/thermal/thermal_core.c
@@ -72,6 +72,58 @@ static struct thermal_governor *__find_governor(const char *name)
return NULL;
 }
 
+/**
+ * bind_previous_governor - bind the previous governor of the thermal zone
+ * @tz:                a valid pointer to a struct thermal_zone_device
+ * @failed_gov_name:   the name of the governor that failed to bind
+ *
+ * Rebind the previous governor of the thermal zone after a new
+ * governor has failed to be bound.
+ */
+static void bind_previous_governor(struct thermal_zone_device *tz,
+   const char *failed_gov_name)
+{
+   if (tz->governor && tz->governor->bind_to_tz) {
+   if (tz->governor->bind_to_tz(tz)) {
+   dev_warn(&tz->device,
+   "governor %s failed to bind and the previous one (%s) failed to register again, thermal zone %s has no governor\n",
+   failed_gov_name, tz->governor->name, tz->type);
+   tz->governor = NULL;
+   }
+   }
+}
+
+/**
+ * thermal_set_governor() - Switch to another governor
+ * @tz:        a valid pointer to a struct thermal_zone_device
+ * @new_gov:   pointer to the new governor
+ *
+ * Change the governor of thermal zone @tz.
+ *
+ * Returns 0 on success, an error if the new governor's bind_to_tz() failed.
+ */
+static int thermal_set_governor(struct thermal_zone_device *tz,
+   struct thermal_governor *new_gov)
+{
+   int ret = 0;
+
+   if (tz->governor && tz->governor->unbind_from_tz)
+   tz->governor->unbind_from_tz(tz);
+
+   if (new_gov && new_gov->bind_to_tz) {
+   ret = new_gov->bind_to_tz(tz);
+   if (ret) {
+   bind_previous_governor(tz, new_gov->name);
+
+   return ret;
+   }
+   }
+
+   tz->governor = new_gov;
+
+   return ret;
+}
+
 int thermal_register_governor(struct thermal_governor *governor)
 {
int err;
@@ -104,8 +156,15 @@ int thermal_register_governor(struct thermal_governor *governor)
 
name = pos->tzp->governor_name;
 
-   if (!strnicmp(name, governor->name, THERMAL_NAME_LENGTH))
-   pos->governor = governor;
+   if (!strnicmp(name, governor->name, THERMAL_NAME_LENGTH)) {
+   int ret;
+
+   ret = thermal_set_governor(pos, governor);
+   if (ret)
+   dev_warn(&pos->device,
+   "Failed to set governor %s for thermal zone %s: %d\n",
+   governor->name, pos->type, ret);
+   }
}
 
mutex_unlock(&thermal_list_lock);
@@ -131,7 +190,7 @@ void thermal_unregister_governor(struct thermal_governor *governor)
list_for_each_entry(pos, &thermal_tz_list, node) {
if (!strnicmp(pos->governor->name, governor->name,
THERMAL_NAME_LENGTH))
-   pos->governor = NULL;
+   thermal_set_governor(pos, NULL);
}
 
mutex_unlock(&thermal_list_lock);
@@ -756,8 +815,9 @@ policy_store(struct device *dev, struct device_attribute *attr,
if (!gov)
goto exit;
 
-   tz->governor = gov;
-   ret = count;
+   ret = thermal_set_governor(tz, gov);
+   if (!ret)
+   ret = count;
 
 exit:
mutex_unlock(&thermal_governor_lock);
@@ -1452,6 +1512,7 @@ struct thermal_zone_device *thermal_zone_device_register(const char *type,
int result;
int count;
int passive = 0;
+   struct thermal_governor *governor;
 
if (type && strlen(type) >= THERMAL_NAME_LENGTH)
return ERR_PTR(-EINVAL);
@@ -1542,9 +1603,15 @@ struct thermal_zo

[RFC PATCH v2 1/7] tracing: Add __bitmask() macro to trace events to cpumasks and other bitmasks

2014-05-20 Thread Javi Merino

From: "Steven Rostedt (Red Hat)" 

Being able to show a cpumask of events can be useful as some events
may affect only some CPUs. There is no standard way to record the
cpumask, and converting it to a string is rather expensive during
the trace, as traces happen in hot paths. It would be better to record
the raw event mask and be able to parse it at print time.

The following macros were added for use with the TRACE_EVENT() macro:

  __bitmask()
  __assign_bitmask()
  __get_bitmask()

To test this, I added this to the sched_migrate_task event, which
looked like this:

TRACE_EVENT(sched_migrate_task,

TP_PROTO(struct task_struct *p, int dest_cpu, const struct cpumask *cpus),

TP_ARGS(p, dest_cpu, cpus),

TP_STRUCT__entry(
__array(char,   comm,   TASK_COMM_LEN   )
__field(pid_t,  pid )
__field(int,prio)
__field(int,orig_cpu)
__field(int,dest_cpu)
__bitmask(  cpumask, num_possible_cpus())
),

TP_fast_assign(
memcpy(__entry->comm, p->comm, TASK_COMM_LEN);
__entry->pid= p->pid;
__entry->prio   = p->prio;
__entry->orig_cpu   = task_cpu(p);
__entry->dest_cpu   = dest_cpu;
__assign_bitmask(cpumask, cpumask_bits(cpus), num_possible_cpus());
),

TP_printk("comm=%s pid=%d prio=%d orig_cpu=%d dest_cpu=%d cpumask=%s",
  __entry->comm, __entry->pid, __entry->prio,
  __entry->orig_cpu, __entry->dest_cpu,
  __get_bitmask(cpumask))
);

With the output of:

ksmtuned-3613  [003] d..2   485.220508: sched_migrate_task: comm=ksmtuned pid=3615 prio=120 orig_cpu=3 dest_cpu=2 cpumask=,000f
 migration/1-13[001] d..5   485.221202: sched_migrate_task: comm=ksmtuned pid=3614 prio=120 orig_cpu=1 dest_cpu=0 cpumask=,000f
 awk-3615  [002] d.H5   485.221747: sched_migrate_task: comm=rcu_preempt pid=7 prio=120 orig_cpu=0 dest_cpu=1 cpumask=,00ff
 migration/2-18[002] d..5   485.222062: sched_migrate_task: comm=ksmtuned pid=3615 prio=120 orig_cpu=2 dest_cpu=3 cpumask=,000f

Link: http://lkml.kernel.org/r/1399377998-14870-6-git-send-email-javi.mer...@arm.com
Link: http://lkml.kernel.org/r/20140506132238.22e13...@gandalf.local.home

Suggested-by: Javi Merino 
Tested-by: Javi Merino 
Signed-off-by: Steven Rostedt 
---
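
For illustration only (not part of the patch): a user-space sketch of
the idea of storing the raw mask words in the hot path and turning them
into text only when the event is printed.  format_bitmask() and
chunk32() are hypothetical helpers, and the fixed "8 hex digits per
32-bit chunk" layout is a simplification chosen for this example, not
necessarily the exact format the kernel emits.

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>
#include <string.h>

#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Return the 32-bit chunk of the mask that starts at bit "start". */
static unsigned int chunk32(const unsigned long *mask, unsigned int start)
{
	unsigned long word = mask[start / BITS_PER_LONG];

	return (unsigned int)((word >> (start % BITS_PER_LONG)) & 0xffffffffUL);
}

/*
 * Format the low "nbits" bits of mask as comma-separated 32-bit hex
 * chunks, most significant chunk first.  Returns the number of
 * characters written (excluding the NUL), or -1 if buf is too small.
 */
static int format_bitmask(char *buf, size_t len, const unsigned long *mask,
			  unsigned int nbits)
{
	unsigned int nchunks = (nbits + 31) / 32;
	size_t used = 0;

	assert(buf && mask && nbits > 0);

	for (unsigned int i = nchunks; i-- > 0; ) {
		unsigned int valid = nbits - i * 32;
		unsigned int v = chunk32(mask, i * 32);
		int n;

		/* Drop bits above nbits in the most significant chunk. */
		if (valid < 32)
			v &= (1u << valid) - 1u;

		n = snprintf(buf + used, len - used, "%s%08x",
			     used ? "," : "", v);
		if (n < 0 || (size_t)n >= len - used)
			return -1;
		used += (size_t)n;
	}

	return (int)used;
}
```

The recording side only needs to copy the mask words; all of the
string handling above happens on the reader's side.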
 include/linux/ftrace_event.h |3 +++
 include/linux/trace_seq.h|   10 
 include/trace/ftrace.h   |   57 +-
 kernel/trace/trace_output.c  |   41 ++
 4 files changed, 110 insertions(+), 1 deletion(-)

diff --git a/include/linux/ftrace_event.h b/include/linux/ftrace_event.h
index d16da3e53bc7..cff3106ffe2c 100644
--- a/include/linux/ftrace_event.h
+++ b/include/linux/ftrace_event.h
@@ -38,6 +38,9 @@ const char *ftrace_print_symbols_seq_u64(struct trace_seq *p,
 *symbol_array);
 #endif
 
+const char *ftrace_print_bitmask_seq(struct trace_seq *p, void *bitmask_ptr,
+unsigned int bitmask_size);
+
 const char *ftrace_print_hex_seq(struct trace_seq *p,
 const unsigned char *buf, int len);
 
diff --git a/include/linux/trace_seq.h b/include/linux/trace_seq.h
index a32d86ec8bf2..136116924d8d 100644
--- a/include/linux/trace_seq.h
+++ b/include/linux/trace_seq.h
@@ -46,6 +46,9 @@ extern int trace_seq_putmem_hex(struct trace_seq *s, const void *mem,
 extern void *trace_seq_reserve(struct trace_seq *s, size_t len);
 extern int trace_seq_path(struct trace_seq *s, const struct path *path);
 
+extern int trace_seq_bitmask(struct trace_seq *s, const unsigned long *maskp,
+int nmaskbits);
+
 #else /* CONFIG_TRACING */
 static inline int trace_seq_printf(struct trace_seq *s, const char *fmt, ...)
 {
@@ -57,6 +60,13 @@ trace_seq_bprintf(struct trace_seq *s, const char *fmt, const u32 *binary)
return 0;
 }
 
+static inline int
+trace_seq_bitmask(struct trace_seq *s, const unsigned long *maskp,
+ int nmaskbits)
+{
+   return 0;
+}
+
 static inline int trace_print_seq(struct seq_file *m, struct trace_seq *s)
 {
return 0;
diff --git a/include/trace/ftrace.h b/include/trace/ftrace.h
index 0a1a4f7caf09..9b7a989dcbcc 100644
--- a/include/trace/ftrace.h
+++ b/include/trace/ftrace.h
@@ -53,6 +53,9 @@
 #undef __string
 #define __string(item, src) __dynamic_array(char, item, -1)
 
+#undef __bitmask
+#define __bitmask(item, nr_bits) __dynamic_array(char, item, -1)
+
 #undef TP_STRUCT__entry

[RFC PATCH v2 5/7] thermal: add a basic cpu power actor

2014-05-20 Thread Javi Merino

Introduce a power actor for cpus.  It has a basic power model to get
the current power utilization and uses cpufreq cooling devices to set
the desired power.  It uses the current frequency (as reported by
cpufreq) as well as load and OPPs for the power calculations.  The
cpus must have registered their OPPs in the OPP library.

Cc: Zhang Rui 
Cc: Eduardo Valentin 
Signed-off-by: Punit Agrawal 
Signed-off-by: Javi Merino 
---
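
For illustration only (not part of the patch): a user-space sketch of
the simplified model documented in power_actor.txt below,
Pdyn = Kd * Voltage^2 * Frequency * Utilisation.  The function name and
the units are assumptions chosen for readability (kd in mW/MHz/V^2,
voltage in volts, frequency in MHz, utilisation between 0 and 1), and it
uses floating point, which kernel code would avoid.

```c
#include <assert.h>

/*
 * Dynamic power estimate in mW under the assumed units:
 * kd in mW/MHz/V^2, volts in V, freq_mhz in MHz, utilisation in [0, 1].
 */
static double cpu_dynamic_power_mw(double kd, double volts, double freq_mhz,
				   double utilisation)
{
	assert(kd >= 0.0 && volts >= 0.0 && freq_mhz >= 0.0);
	assert(utilisation >= 0.0 && utilisation <= 1.0);

	return kd * volts * volts * freq_mhz * utilisation;
}
```

Because the voltage term is squared, dropping to a lower OPP (lower
frequency and lower voltage) reduces the estimate faster than reducing
utilisation alone.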
 Documentation/thermal/power_actor.txt |   46 
 drivers/thermal/Kconfig   |5 +
 drivers/thermal/power_actor/Kconfig   |9 +
 drivers/thermal/power_actor/Makefile  |2 +
 drivers/thermal/power_actor/cpu_actor.c   |  419 +
 drivers/thermal/power_actor/power_actor.h |   23 ++
 6 files changed, 504 insertions(+)
 create mode 100644 drivers/thermal/power_actor/Kconfig
 create mode 100644 drivers/thermal/power_actor/cpu_actor.c

diff --git a/Documentation/thermal/power_actor.txt b/Documentation/thermal/power_actor.txt
index a0f06e091907..d74909376610 100644
--- a/Documentation/thermal/power_actor.txt
+++ b/Documentation/thermal/power_actor.txt
@@ -27,3 +27,49 @@ Callbacks
 milliwatts.
 
 Returns 0 on success, -E* on error.
+
+CPU Power Actor API
+===================
+A simple power model for CPUs.  The current power is calculated as
+dynamic power.  The dynamic power consumption of a processor depends
+on many factors.  For a given processor implementation the primary
+factors are:
+
+- The time the processor spends running, consuming dynamic power, as
+  compared to the time in idle states where dynamic consumption is
+  negligible.  Herein we refer to this as 'utilisation'.
+- The voltage and frequency levels as a result of DVFS.  The DVFS
+  level is a dominant factor governing power consumption.
+- In running time the 'execution' behaviour (instruction types, memory
+  access patterns and so forth) causes, in most cases, a second order
+  variation.  In pathological cases this variation can be significant,
+  but typically it has a much smaller impact than the factors above.
+
+A high level dynamic power consumption model may then be represented as:
+
+Pdyn = f(run) * Voltage^2 * Frequency * Utilisation
+
+f(run) here represents the described execution behaviour and its
+result has units of Watts/Hz/Volt^2 (this is often expressed in
+mW/MHz/uVolt^2).
+
+The detailed behaviour for f(run) could be modelled on-line.  However,
+in practice, such an on-line model has dependencies on a number of
+implementation specific processor support and characterisation
+factors.  Therefore, in the initial implementation that contribution is
+represented as a constant coefficient.  This is a simplification
+consistent with the relative contribution to overall power variation.
+
+In this simplified representation our model becomes:
+
+Pdyn = Kd * Voltage^2 * Frequency * Utilisation
+
+Where Kd (capacitance) represents an indicative running time dynamic
+power coefficient in fundamental units of mW/MHz/uVolt^2.
+
+This power model requires that the operating-points of the CPUs are
+registered using the kernel's opp library and the
+`cpufreq_frequency_table` is assigned to the `struct device` of the
+cpu.  If you are using the `cpufreq-cpu0.c` driver then the
+`cpufreq_frequency_table` should already be assigned to the cpu
+device.
diff --git a/drivers/thermal/Kconfig b/drivers/thermal/Kconfig
index 47e2f15537ca..1818c4fa60b8 100644
--- a/drivers/thermal/Kconfig
+++ b/drivers/thermal/Kconfig
@@ -92,6 +92,11 @@ config THERMAL_GOV_USER_SPACE
 config THERMAL_POWER_ACTOR
bool
 
+menu "Power actors"
+depends on THERMAL_POWER_ACTOR
+source "drivers/thermal/power_actor/Kconfig"
+endmenu
+
 config CPU_THERMAL
bool "generic cpu cooling support"
depends on CPU_FREQ
diff --git a/drivers/thermal/power_actor/Kconfig b/drivers/thermal/power_actor/Kconfig
new file mode 100644
index ..fa542ca99cdb
--- /dev/null
+++ b/drivers/thermal/power_actor/Kconfig
@@ -0,0 +1,9 @@
+#
+# Thermal power actor configuration
+#
+
+config THERMAL_POWER_ACTOR_CPU
+   bool
+   prompt "Simple power model for a CPU"
+   help
+ A simple CPU power model
diff --git a/drivers/thermal/power_actor/Makefile b/drivers/thermal/power_actor/Makefile
index 46478f4928be..6f04b92997e6 100644
--- a/drivers/thermal/power_actor/Makefile
+++ b/drivers/thermal/power_actor/Makefile
@@ -3,3 +3,5 @@
 #
 
 obj-y += power_actor.o
+
+obj-$(CONFIG_THERMAL_POWER_ACTOR_CPU) += cpu_actor.o
diff --git a/drivers/thermal/power_actor/cpu_actor.c b/drivers/thermal/power_actor/cpu_actor.c
new file mode 100644
index ..0d76d52609fa
--- /dev/null
+++ b/drivers/thermal/power_actor/cpu_actor.c
@@ -0,0 +1,419 @@
+/*
+ * A basic cpu actor
+ *
+ * Copyright (C) 2014 ARM Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU Gener

[RFC PATCH v2 2/7] thermal: document struct thermal_zone_device and thermal_governor

2014-05-20 Thread Javi Merino

Document struct thermal_zone_device and struct thermal_governor fields
and their use by the thermal framework code.

Cc: Zhang Rui 
Cc: Eduardo Valentin 
Signed-off-by: Javi Merino 
---

Hi linux-pm,

This was sent as a separate patch to linux-pm and can be merged
independently, as it documents the current thermal framework.

 include/linux/thermal.h |   44 ++--
 1 file changed, 42 insertions(+), 2 deletions(-)

diff --git a/include/linux/thermal.h b/include/linux/thermal.h
index f7e11c7ea7d9..9b7cb804e03f 100644
--- a/include/linux/thermal.h
+++ b/include/linux/thermal.h
@@ -158,6 +158,40 @@ struct thermal_attr {
char name[THERMAL_NAME_LENGTH];
 };
 
+/**
+ * struct thermal_zone_device - structure for a thermal zone
+ * @id:        unique id number for each thermal zone
+ * @type:      the thermal zone device type
+ * @device:    struct device for this thermal zone
+ * @trip_temp_attrs:   attributes for trip points for sysfs: trip temperature
+ * @trip_type_attrs:   attributes for trip points for sysfs: trip type
+ * @trip_hyst_attrs:   attributes for trip points for sysfs: trip hysteresis
+ * @devdata:   private pointer for device private data
+ * @trips: number of trip points the thermal zone supports
+ * @passive_delay: number of milliseconds to wait between polls when
+ * performing passive cooling.  Only used by the step-wise
+ * governor
+ * @polling_delay: number of milliseconds to wait between polls when
+ * checking whether trip points have been crossed (0 for
+ * interrupt driven systems)
+ * @temperature:   current temperature.  This is only for core code,
+ * drivers should use thermal_zone_get_temp() to get the
+ * current temperature
+ * @last_temperature:  previous temperature read
+ * @emul_temperature:  emulated temperature when using CONFIG_THERMAL_EMULATION
+ * @passive:   step-wise specific parameter.  1 if you've crossed a passive
+ * trip point, 0 otherwise
+ * @forced_passive:    step-wise specific parameter.  If > 0, temperature at
+ * which to switch on all cpufreq cooling devices.
+ * @ops:   operations this thermal_zone_device supports
+ * @tzp:   thermal zone parameters
+ * @governor:  pointer to the governor for this thermal zone
+ * @thermal_instances: list of struct thermal_instance of this thermal zone
+ * @idr:   struct idr to generate unique id for this zone's cooling devices
+ * @lock:  lock to protect thermal_instances list
+ * @node:  node in thermal_tz_list (in thermal_core.c)
+ * @poll_queue:        delayed work for polling
+ */
 struct thermal_zone_device {
int id;
char type[THERMAL_NAME_LENGTH];
@@ -179,12 +213,18 @@ struct thermal_zone_device {
struct thermal_governor *governor;
struct list_head thermal_instances;
struct idr idr;
-   struct mutex lock; /* protect thermal_instances list */
+   struct mutex lock;
struct list_head node;
struct delayed_work poll_queue;
 };
 
-/* Structure that holds thermal governor information */
+/**
+ * struct thermal_governor - structure that holds thermal governor information
+ * @name:  name of the governor
+ * @throttle:  callback called for every trip point even if temperature is
+ * below the trip point temperature
+ * @governor_list: node in thermal_governor_list (in thermal_core.c)
+ */
 struct thermal_governor {
char name[THERMAL_NAME_LENGTH];
int (*throttle)(struct thermal_zone_device *tz, int trip);
-- 
1.7.9.5

