Sorry for being late Andreas :(

On 14-09-16, 16:56, Andreas Herrmann wrote:

First of all, thanks for your hardwork in getting these numbers out.
Really appreciate it.

> Below is some trace data. I hope it is of some help.
> 
> (A) - sampling 10s period when system is idle
> (B) - sampling 10s period when system partially loaded (kernel
>       compilation using 2 jobs)
> 
> (1) 4.8-rc5
> (2) 4.8-rc5 with my patch (reintro of deadband effect within
>     pcc-cpufreq)
> (3) 4.8-rc5 with reversal of 6393d6a102 (cpufreq: ondemand: Eliminate
>     the deadband effect)
> 
> Let me know whether you are looking for other trace data wrt this
> issue.
> 
> 
> Thanks,
> 
> Andreas
> 
> ---
> 
> (A)-(1)
> 
>  # Total Lost Samples: 0
>  # Samples: 41  of event 'power:cpu_frequency'
>  # Event count (approx.): 41
>  # Overhead  Command       Shared Object     Symbol                       
>  # ........  ............  ................  .............................
>      39.02%  kworker/14:1  [kernel.vmlinux]  [k] cpufreq_notify_transition
>      29.27%  kworker/0:0   [kernel.vmlinux]  [k] cpufreq_notify_transition
>      19.51%  kworker/10:1  [kernel.vmlinux]  [k] cpufreq_notify_transition
>       7.32%  kworker/5:2   [kernel.vmlinux]  [k] cpufreq_notify_transition
>       2.44%  kworker/23:2  [kernel.vmlinux]  [k] cpufreq_notify_transition
>       2.44%  kworker/40:1  [kernel.vmlinux]  [k] cpufreq_notify_transition
> 
> (A)-(2)
> 
>  # Total Lost Samples: 0
>  # Samples: 6  of event 'power:cpu_frequency'
>  # Event count (approx.): 6
>  # Overhead  Command       Shared Object     Symbol                       
>  # ........  ............  ................  .............................
>      33.33%  kworker/1:2   [kernel.vmlinux]  [k] cpufreq_notify_transition
>      16.67%  kworker/16:1  [kernel.vmlinux]  [k] cpufreq_notify_transition
>      16.67%  kworker/22:1  [kernel.vmlinux]  [k] cpufreq_notify_transition
>      16.67%  kworker/26:1  [kernel.vmlinux]  [k] cpufreq_notify_transition
>      16.67%  kworker/33:1  [kernel.vmlinux]  [k] cpufreq_notify_transition
> 
> (A)-(3)
> 
>  # Total Lost Samples: 0
>  # Samples: 7  of event 'power:cpu_frequency'
>  # Event count (approx.): 7
>  # Overhead  Command       Shared Object     Symbol                       
>  # ........  ............  ................  .............................
>      28.57%  kworker/58:1  [kernel.vmlinux]  [k] cpufreq_notify_transition
>      14.29%  kworker/19:2  [kernel.vmlinux]  [k] cpufreq_notify_transition
>      14.29%  kworker/20:2  [kernel.vmlinux]  [k] cpufreq_notify_transition
>      14.29%  kworker/22:2  [kernel.vmlinux]  [k] cpufreq_notify_transition
>      14.29%  kworker/23:1  [kernel.vmlinux]  [k] cpufreq_notify_transition
>      14.29%  kworker/35:1  [kernel.vmlinux]  [k] cpufreq_notify_transition
> 
> ---
> 
> (B)-(1)
> 
>  # Total Lost Samples: 0
>  # Samples: 2K of event 'power:cpu_frequency'
>  # Event count (approx.): 2382
>  # Overhead  Command       Shared Object     Symbol                       
>  # ........  ............  ................  .............................
>       5.75%  kworker/0:0   [kernel.vmlinux]  [k] cpufreq_notify_transition
>       4.16%  kworker/12:2  [kernel.vmlinux]  [k] cpufreq_notify_transition
>       3.11%  kworker/17:1  [kernel.vmlinux]  [k] cpufreq_notify_transition
>       2.94%  kworker/2:1   [kernel.vmlinux]  [k] cpufreq_notify_transition
>       2.73%  kworker/19:1  [kernel.vmlinux]  [k] cpufreq_notify_transition
>       ...
> 
> (B)-(2)
> 
>  # Total Lost Samples: 0
>  # Samples: 320  of event 'power:cpu_frequency'
>  # Event count (approx.): 320
>  # Overhead  Command       Shared Object     Symbol                       
>  # ........  ............  ................  .............................
>       4.69%  kworker/56:1  [kernel.vmlinux]  [k] cpufreq_notify_transition
>       4.06%  kworker/12:1  [kernel.vmlinux]  [k] cpufreq_notify_transition
>       4.06%  kworker/28:1  [kernel.vmlinux]  [k] cpufreq_notify_transition
>       4.06%  kworker/6:2   [kernel.vmlinux]  [k] cpufreq_notify_transition
>       3.75%  kworker/32:2  [kernel.vmlinux]  [k] cpufreq_notify_transition
>       ...
> 
> (B)-(3)
> 
>  # Total Lost Samples: 0
>  # Samples: 333  of event 'power:cpu_frequency'
>  # Event count (approx.): 333
>  # Overhead  Command       Shared Object     Symbol                       
>  # ........  ............  ................  .............................
>       4.80%  kworker/51:1  [kernel.vmlinux]  [k] cpufreq_notify_transition
>       4.50%  kworker/39:1  [kernel.vmlinux]  [k] cpufreq_notify_transition
>       4.20%  kworker/47:1  [kernel.vmlinux]  [k] cpufreq_notify_transition
>       3.90%  kworker/59:1  [kernel.vmlinux]  [k] cpufreq_notify_transition
>       3.90%  kworker/7:2   [kernel.vmlinux]  [k] cpufreq_notify_transition
>       ...

I am worried by all of these.. So, its not the DVFS transition which
took time, but cpufreq_notify_transition(). And probably one of the
drivers in your setup is screwing up here, which has registered with
cpufreq_register_notifier().

Can you please take a look at that ? Just check which all routines are
getting called as part of srcu_notifier_call_chain().

Also a simple (hacky) solution to fix the problem you have got is to
divide the range of frequency in steps (which you have already done
AFAICT in one of your patches) and not mention the frequencies that
were part of the deadband earlier. That will keep the CPU at either
low freq or some higher freqs.

But I am quite sure that we have an abuser here and we better find it
now.

-- 
viresh

Reply via email to