On Sat, Apr 2, 2016 at 1:36 AM, Doug Smythies <dsmyth...@telus.net> wrote:
> On 2016.04.01 12:54 Rafael J. Wysocki wrote:
>>On Fri, Apr 1, 2016 at 8:31 PM, Doug Smythies <dsmyth...@telus.net> wrote:
>>> On 2016.04.01 10:45 Srinivas Pandruvada wrote:
>>>> On Fri, 2016-04-01 at 16:06 +0200, Jörg Otte wrote:
>>>>>
>>>>> Done. Attached the tracer.
>>>>> For me it looks like the previous one of the failing case.
>>>>
>>>> The traces show that idle task is constantly running without sleep.
>>>
>>> No, they (at least the first one, I didn't look at the next one yet)
>>> show that CPUs 2 and 3 are spending around 99% of their time not in state
>>> C0.
>
>> How do you figure that out if I may ask?  It is not so obvious to me
>> to be honest.
>
> The trace was not in the form for the post processing tools, so I had
> to manually import the trace into a spreadsheet and manually add new columns
> calculated from the others.
>
> Load = mperf / tsc * 100 % = C0 time.
> Duration (ms) = tsc / 2.5e9 * 1000
> Note: I do not recall seeing an exact tsc for Jörg's computer, so I used
> the 2.5 GHz from the device spec from some earlier e-mail.
>
> Example (formatting will likely not send O.K.):
>
>           CPU#   time         core_busy  scaled  from  to  mperf    aperf    tsc       freq     load    duration (ms)
> <idle>-0  [002]  465.879451:  100        96      26    26  1826656  1826710  25062693  2500073  7.288%  10.025
> <idle>-0  [003]  465.879484:  99         96      26    26  305796   305781   25147993  2499877  1.216%  10.059
> <idle>-0  [000]  465.885794:  100        96      26    26  975908   975951   32434672  2500110  3.009%  12.974
> <idle>-0  [001]  465.886898:  100        250     10    31  327356   327364   26673840  2500061  1.227%  10.670
> <idle>-0  [002]  465.889527:  100        96      26    26  205336   205365   25133396  2500353  0.817%  10.053
> <idle>-0  [003]  465.889555:  99         95      26    26  62544    62341    25117916  2491885  0.249%  10.047
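
For reference, the two derived columns (load and duration) can be
reproduced with a short script. This is only a sketch: the function
names are made up for illustration, and the 2.5 GHz TSC rate is the
same assumption Doug states above, not a measured value.

```python
# Sketch only: TSC_HZ is the assumed 2.5 GHz from the device spec,
# not a measured TSC rate for the machine in question.
TSC_HZ = 2.5e9


def load_percent(mperf, tsc):
    """Share of the sample interval spent in C0, in percent."""
    return mperf / tsc * 100.0


def duration_ms(tsc, tsc_hz=TSC_HZ):
    """Length of the sample interval in milliseconds."""
    return tsc / tsc_hz * 1000.0


# (cpu, mperf, tsc) taken from the first two rows of the quoted trace.
samples = [(2, 1826656, 25062693), (3, 305796, 25147993)]
for cpu, mperf, tsc in samples:
    print(f"CPU {cpu}: load {load_percent(mperf, tsc):.3f}%, "
          f"duration {duration_ms(tsc):.3f} ms")
```

If the real TSC rate differs from 2.5 GHz, only the duration column
changes; the load column is a pure ratio of the two counters.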

OK

It could be C1 with relatively short periods spent in it.

>> That the sample rate is ending up at ~10 milliseconds indicates some
>> high frequency (>= 100Hz) events on those CPUs. Those events, apparently,
>> take very little CPU time to complete, hence a load of about 1% on average.
>>
>> By the way, I can recreate the high sample rate with virtually no load
>> on my system easily, but so far have been unable to get the high CPU
>> frequencies observed by Jörg. I can get my system to about a target pstate of
>> 20 where it should have remained at 16, but that is about it.
>>
>>> The driver is processing samples for idle task for every 10ms and
>>> aperf/mperf are showing that we are always in turbo mode for idle task.
>>
>> That column pretty much always says "idle" (or swapper for my way of doing
>> things). I have not found it to be very useful as an indicator, and considerably
>> more so since the utilization changes.
>>
>>>
>>> Need to find out why idle task is not sleeping.
>>
>> I contend that is it.
>
> Why?
>
> Unless I misunderstood, because the trace data indicates that those CPUs
> are going into some deeper C state than C0 for most of their time.

But how long do they stay in those states every time?

Average residencies need to be well below 10 ms for the trace to be
produced every 10 ms, so the question seems to be what kicks the CPUs
out of idle states so often.  On a completely idle system, that's
highly suspicious.
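
To put rough numbers on that, here is a sketch. It assumes each trace
sample is triggered by at least one wakeup on that CPU, so the non-C0
time within one sample interval is an upper bound on any single idle
residency in it. The helper names are hypothetical, and the inputs are
the duration and load quoted above for CPU 2 at 465.889527.

```python
def idle_time_ms(duration_ms, load_percent):
    """Non-C0 time within one sample interval, in milliseconds."""
    return duration_ms * (1.0 - load_percent / 100.0)


def min_wakeup_rate_hz(duration_ms):
    """Lower bound on the wakeup rate, assuming >= 1 wakeup per sample."""
    return 1000.0 / duration_ms


# Duration (ms) and load (%) for one sample from the quoted trace.
d, load = 10.053, 0.817
print(f"idle time per sample: {idle_time_ms(d, load):.3f} ms")
print(f"wakeup rate: at least {min_wakeup_rate_hz(d):.1f} Hz")
```

This only bounds the residency from above. If several wakeups fall
within one sample interval, the actual residencies are shorter still.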
