On 07-12-20, 17:35, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki <rafael.j.wyso...@intel.com>
>
> First off, some cpufreq drivers (eg. intel_pstate) can pass hints
> beyond the current target frequency to the hardware and there are no
> provisions for doing that in the cpufreq framework.  In particular,
> today the driver has to assume that it should not allow the frequency
> to fall below the one requested by the governor (or the required
> capacity may not be provided) which may not be the case and which may
> lead to excessive energy usage in some scenarios.
>
> Second, the hints passed by these drivers to the hardware need not be
> in terms of the frequency, so representing the utilization numbers
> coming from the scheduler as frequency before passing them to those
> drivers is not really useful.
>
> Address the two points above by adding a special-purpose replacement
> for the ->fast_switch callback, called ->adjust_perf, allowing the
> governor to pass abstract performance level (rather than frequency)
> values for the minimum (required) and target (desired) performance
> along with the CPU capacity to compare them to.
>
> Also update the schedutil governor to use the new callback instead
> of ->fast_switch if present.
>
> Signed-off-by: Rafael J. Wysocki <rafael.j.wyso...@intel.com>
> ---
>
> Changes with respect to the RFC:
>  - Don't pass "busy" to ->adjust_perf().
>  - Use a special 'update_util' hook for the ->adjust_perf() case in
>    schedutil (this still requires an additional branch because of the
>    shared common code between this case and the "frequency" one, but
>    IMV this version is cleaner nevertheless).
>
> ---
>  drivers/cpufreq/cpufreq.c        |   40 ++++++++++++++++++++++++++++++++
>  include/linux/cpufreq.h          |   14 +++++++++++
>  include/linux/sched/cpufreq.h    |    5 ++++
>  kernel/sched/cpufreq_schedutil.c |   48 +++++++++++++++++++++++++++++++--------
>  4 files changed, 98 insertions(+), 9 deletions(-)
>
> Index: linux-pm/include/linux/cpufreq.h
> ===================================================================
> --- linux-pm.orig/include/linux/cpufreq.h
> +++ linux-pm/include/linux/cpufreq.h
> @@ -320,6 +320,15 @@ struct cpufreq_driver {
>                                          unsigned int index);
>          unsigned int    (*fast_switch)(struct cpufreq_policy *policy,
>                                         unsigned int target_freq);
> +        /*
> +         * ->fast_switch() replacement for drivers that use an internal
> +         * representation of performance levels and can pass hints other than
> +         * the target performance level to the hardware.
> +         */
> +        void            (*adjust_perf)(unsigned int cpu,
> +                                       unsigned long min_perf,
> +                                       unsigned long target_perf,
> +                                       unsigned long capacity);

With this callback in place, do we still need to keep the other stuff we
introduced recently, like CPUFREQ_NEED_UPDATE_LIMITS?

--
viresh
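
For readers following the thread, here is a rough userspace sketch of the dispatch the quoted changelog describes: a governor that prefers ->adjust_perf when the driver provides it and otherwise falls back to a frequency request. This is not the patch code. The `mock_*` names, the simplified `fast_switch` signature (a CPU number instead of a `struct cpufreq_policy *`), the clamping inside the mock driver, and the plain linear utilization-to-frequency scaling are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace mock of the two driver callbacks; not kernel code. */
struct mock_driver {
	/* Simplified: the real ->fast_switch takes a struct cpufreq_policy *. */
	unsigned int (*fast_switch)(unsigned int cpu, unsigned int target_freq);
	/* Same parameter list as the ->adjust_perf callback in the quoted hunk. */
	void (*adjust_perf)(unsigned int cpu, unsigned long min_perf,
			    unsigned long target_perf, unsigned long capacity);
};

/* Records the last request so the effect of each path can be inspected. */
struct mock_hw_state {
	unsigned long min_perf;
	unsigned long target_perf;
	unsigned long capacity;
	unsigned int freq;
};

static struct mock_hw_state hw;

static unsigned int mock_fast_switch(unsigned int cpu, unsigned int target_freq)
{
	(void)cpu;
	hw.freq = target_freq;
	return target_freq;
}

static void mock_adjust_perf(unsigned int cpu, unsigned long min_perf,
			     unsigned long target_perf, unsigned long capacity)
{
	(void)cpu;
	/* Assumed sanity clamps: min_perf <= target_perf <= capacity. */
	if (target_perf > capacity)
		target_perf = capacity;
	if (min_perf > target_perf)
		min_perf = target_perf;
	hw.min_perf = min_perf;
	hw.target_perf = target_perf;
	hw.capacity = capacity;
}

/*
 * Returns 1 if ->adjust_perf was used, 0 if the frequency path was used,
 * and -1 if no request could be made.
 */
static int mock_governor_update(const struct mock_driver *drv, unsigned int cpu,
				unsigned long min_perf,
				unsigned long target_perf,
				unsigned long capacity, unsigned int max_freq)
{
	if (drv->adjust_perf) {
		/* Abstract performance levels go to the driver unchanged. */
		drv->adjust_perf(cpu, min_perf, target_perf, capacity);
		return 1;
	}
	if (!drv->fast_switch || capacity == 0)
		return -1;
	/* Frequency path: assumed linear scaling of target_perf / capacity. */
	drv->fast_switch(cpu, (unsigned int)((unsigned long long)max_freq *
					     target_perf / capacity));
	return 0;
}
```

In this sketch only the ->adjust_perf path carries min_perf through to the driver; the frequency path has nowhere to put it, which is the first limitation the changelog points out.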