2016-04-01 14:40 GMT+02:00 Rafael J. Wysocki <r...@rjwysocki.net>: > On Friday, April 01, 2016 11:20:42 AM Jörg Otte wrote: >> 2016-03-31 17:43 GMT+02:00 Rafael J. Wysocki <r...@rjwysocki.net>: >> > On Thursday, March 31, 2016 05:25:18 PM Jörg Otte wrote: >> >> 2016-03-31 13:42 GMT+02:00 Rafael J. Wysocki <r...@rjwysocki.net>: >> >> > On Thursday, March 31, 2016 11:05:56 AM Jörg Otte wrote: >> >> > >> >> > [cut] >> >> > >> >> >> > >> >> >> >> >> >> Yes, works for me. >> >> >> >> >> >> CPUID(7): No-SGX >> >> >> CPU Avg_MHz Busy% Bzy_MHz TSC_MHz >> >> >> - 11 0.66 1682 2494 >> >> >> 0 11 0.60 1856 2494 >> >> >> 1 6 0.34 1898 2494 >> >> >> 2 13 0.82 1628 2494 >> >> >> 3 13 0.87 1528 2494 >> >> >> CPU Avg_MHz Busy% Bzy_MHz TSC_MHz >> >> >> - 6 0.58 963 2494 >> >> >> 0 8 0.83 957 2494 >> >> >> 1 1 0.08 984 2494 >> >> >> 2 10 1.04 975 2494 >> >> >> 3 3 0.35 934 2494 >> >> >> >> >> > >> > >> > [cut] >> > >> >> > >> >> >> >> No, this patch doesn't help. >> > >> > Well, more work to do then. >> > >> > I've just noticed a bug in this patch, which is not relevant for the >> > results, >> > but below is a new version. >> > >> >> CPUID(7): No-SGX >> >> CPU Avg_MHz Busy% Bzy_MHz TSC_MHz >> >> - 8 0.32 2507 2495 >> >> 0 13 0.53 2505 2495 >> >> 1 3 0.11 2523 2495 >> >> 2 1 0.06 2555 2495 >> >> 3 15 0.59 2500 2495 >> >> CPU Avg_MHz Busy% Bzy_MHz TSC_MHz >> >> - 8 0.33 2486 2495 >> >> 0 12 0.50 2482 2495 >> >> 1 5 0.22 2489 2495 >> >> 2 1 0.04 2492 2495 >> >> 3 15 0.59 2487 2495 >> > > > [cut] > >> >> here they are. >> > > Thanks! > > First of all, the sampling mechanics works as expected in the failing case, > which is the most important thing I wanted to know. However, there are > anomalies > in the failing case trace. The core_busy column is clearly suspicious and it > looks like CPUs 2 and 3 never really go idle. I guess we'll need to find out > why they don't go idle to get to the bottom of this, but it firmly falls into > the weird stuff territory already. > > In the meantime, below is one more patch to test, on top of the previous one > (that is, https://patchwork.kernel.org/patch/8714401/). > > Again, this is a change I'd like to make regardless, so it would be good to > know if anything more has to be done before we go further. > > --- > From: Rafael J. Wysocki <rafael.j.wyso...@intel.com> > Subject: [PATCH] intel_pstate: Avoid extra invocation of intel_pstate_sample() > > The initialization of intel_pstate for a given CPU involves populating > the fields of its struct cpudata that represent the previous sample, > but currently that is done in a problematic way. > > Namely, intel_pstate_init_cpu() makes an extra call to > intel_pstate_sample() so it reads the current register values that > will be used to populate the "previous sample" record during the > next invocation of intel_pstate_sample(). However, after commit > a4675fbc4a7a (cpufreq: intel_pstate: Replace timers with utilization > update callbacks) that doesn't work for last_sample_time, because > the time value is passed to intel_pstate_sample() as an argument now. > Passing 0 to it from intel_pstate_init_cpu() is problematic, because > that causes cpu->last_sample_time == 0 to be visible in > get_target_pstate_use_performance() (and hence the extra > cpu->last_sample_time > 0 check in there) and effectively allows > the first invocation of intel_pstate_sample() from > intel_pstate_update_util() to happen immediately after the > initialization which may lead to a significant "turn on" > effect in the governor algorithm. > > To mitigate that issue, rework the initialization to avoid the > extra intel_pstate_sample() call from intel_pstate_init_cpu(). > Instead, make intel_pstate_sample() return false if it has been > called with cpu->sample.time equal to zero, which will make > intel_pstate_update_util() skip the sample in that case, and > reset cpu->sample.time from intel_pstate_set_update_util_hook() > to make the algorithm start properly every time the hook is set. > > Signed-off-by: Rafael J. Wysocki <rafael.j.wyso...@intel.com> > --- > drivers/cpufreq/intel_pstate.c | 21 +++++++++++++++------ > 1 file changed, 15 insertions(+), 6 deletions(-) > > Index: linux-pm/drivers/cpufreq/intel_pstate.c > =================================================================== > --- linux-pm.orig/drivers/cpufreq/intel_pstate.c > +++ linux-pm/drivers/cpufreq/intel_pstate.c > @@ -910,7 +910,14 @@ static inline bool intel_pstate_sample(s > cpu->prev_aperf = aperf; > cpu->prev_mperf = mperf; > cpu->prev_tsc = tsc; > - return true; > + /* > + * First time this function is invoked in a given cycle, all of the > + * previous sample data fields are equal to zero or stale and they > must > + * be populated with meaningful numbers for things to work, so assume > + * that sample.time will always be reset before setting the > utilization > + * update hook and make the caller skip the sample then. > + */ > + return !!cpu->last_sample_time; > } > > static inline int32_t get_avg_frequency(struct cpudata *cpu) > @@ -984,8 +991,7 @@ static inline int32_t get_target_pstate_ > * enough period of time to adjust our busyness. > */ > duration_ns = cpu->sample.time - cpu->last_sample_time; > - if ((s64)duration_ns > pid_params.sample_rate_ns * 3 > - && cpu->last_sample_time > 0) { > + if ((s64)duration_ns > pid_params.sample_rate_ns * 3) { > sample_ratio = div_fp(int_tofp(pid_params.sample_rate_ns), > int_tofp(duration_ns)); > core_busy = mul_fp(core_busy, sample_ratio); > @@ -1100,7 +1106,6 @@ static int intel_pstate_init_cpu(unsigne > intel_pstate_get_cpu_pstates(cpu); > > intel_pstate_busy_pid_reset(cpu); > - intel_pstate_sample(cpu, 0); > > cpu->update_util.func = intel_pstate_update_util; > > @@ -1121,9 +1126,13 @@ static unsigned int intel_pstate_get(uns > return get_avg_frequency(cpu); > } > > -static void intel_pstate_set_update_util_hook(unsigned int cpu) > +static void intel_pstate_set_update_util_hook(unsigned int cpu_num) > { > - cpufreq_set_update_util_data(cpu, &all_cpu_data[cpu]->update_util); > + struct cpudata *cpu = all_cpu_data[cpu_num]; > + > + /* Prevent intel_pstate_update_util() from using stale data. */ > + cpu->sample.time = 0; > + cpufreq_set_update_util_data(cpu_num, &cpu->update_util); > } > > static void intel_pstate_clear_update_util_hook(unsigned int cpu) >
Done. Attached the tracer. For me it looks like the previous one of the failing case. Thanks, Jörg
intel_pstate_tracer_3.xz
Description: application/xz