On Thu, Jun 25, 2020 at 17:43:51 +0200, Qais Yousef <qais.you...@arm.com> wrote...
> struct uclamp_rq was zeroed out entirely in assumption that in the first
> call to uclamp_rq_inc() they'd be initialized correctly in accordance to
> default settings.

Perhaps I was not clear in my previous comment:

   https://lore.kernel.org/lkml/87sgekorfq.derkl...@matbug.net/

when I did say:

   Doesn't this mean the problem is more likely with
   uclamp_rq_util_with(), which should be guarded?

I did not mean that we have to guard the calls to that function, but
instead that we should just make that function aware of whether uclamp
has been opted in or not.

> But when next patch introduces a static key to skip
> uclamp_rq_{inc,dec}() until userspace opts in to use uclamp, schedutil
> will fail to perform any frequency changes because the
> rq->uclamp[UCLAMP_MAX].value is zeroed at init and stays as such. Which
> means all rqs are capped to 0 by default.

The initialization you want to do here is needed because, with the
current approach, you keep calling the same uclamp_rq_util_with() and
keep doing min/max aggregations even when uclamp is not opted in.
But this also means that we have min/max aggregation _when not really
required_.

> Fix it by making sure we do proper initialization at init without
> relying on uclamp_rq_inc() doing it later.

My proposal was as simple as:

---8<---
  static __always_inline
  unsigned long uclamp_rq_util_with(struct rq *rq, unsigned long util,
  				  struct task_struct *p)
  {
  	unsigned long min_util = READ_ONCE(rq->uclamp[UCLAMP_MIN].value);
  	unsigned long max_util = READ_ONCE(rq->uclamp[UCLAMP_MAX].value);

+ 	if (!static_branch_unlikely(&sched_uclamp_used))
+ 		return rt_task(p) ? uclamp_none(UCLAMP_MAX) : util;

  	if (p) {
  		min_util = max(min_util, uclamp_eff_value(p, UCLAMP_MIN));
  		max_util = max(max_util, uclamp_eff_value(p, UCLAMP_MAX));
  	}

  	/*
  	 * Since CPU's {min,max}_util clamps are MAX aggregated considering
  	 * RUNNABLE tasks with _different_ clamps, we can end up with an
  	 * inversion. Fix it now when the clamps are applied.
  	 */
  	if (unlikely(min_util >= max_util))
  		return min_util;

  	return clamp(util, min_util, max_util);
  }
---8<---

Such a small change is more self-contained IMHO and does not remove an
existing optimization like this lazy RQ initialization at first usage.
Moreover, it can be folded into the following patch, with all the other
static key shortcuts.
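
To make the effect of that early return concrete, here is a minimal
userspace sketch of the same logic. It is only a model, not kernel code:
a plain bool (model_uclamp_used) stands in for the sched_uclamp_used
static key, struct model_rq and struct model_task are hypothetical
stand-ins for struct rq and struct task_struct, and the model also
tolerates a NULL task in the early return, which is an assumption of
this sketch rather than part of the proposal.

```c
#include <stdbool.h>
#include <stddef.h>

#define SCHED_CAPACITY_SCALE 1024UL

enum { UCLAMP_MIN, UCLAMP_MAX, UCLAMP_CNT };

/* Hypothetical stand-in for struct task_struct: only what the model needs. */
struct model_task {
	bool is_rt;                     /* models rt_task(p) */
	unsigned long eff[UCLAMP_CNT];  /* models uclamp_eff_value(p, id) */
};

/* Hypothetical stand-in for struct rq: the per-rq aggregated clamp values. */
struct model_rq {
	unsigned long uclamp[UCLAMP_CNT];
};

/* Plain flag standing in for the sched_uclamp_used static key. */
static bool model_uclamp_used;

static unsigned long max_ul(unsigned long a, unsigned long b)
{
	return a > b ? a : b;
}

static unsigned long model_rq_util_with(const struct model_rq *rq,
					unsigned long util,
					const struct model_task *p)
{
	unsigned long min_util = rq->uclamp[UCLAMP_MIN];
	unsigned long max_util = rq->uclamp[UCLAMP_MAX];

	/*
	 * uclamp not opted in: skip the min/max aggregation entirely, so a
	 * zeroed rq no longer caps anything. RT tasks get the maximum
	 * capacity, which is what uclamp_none(UCLAMP_MAX) evaluates to.
	 */
	if (!model_uclamp_used)
		return (p && p->is_rt) ? SCHED_CAPACITY_SCALE : util;

	if (p) {
		min_util = max_ul(min_util, p->eff[UCLAMP_MIN]);
		max_util = max_ul(max_util, p->eff[UCLAMP_MAX]);
	}

	/* Inverted clamps: the minimum wins, as in the original function. */
	if (min_util >= max_util)
		return min_util;

	if (util < min_util)
		return min_util;
	if (util > max_util)
		return max_util;
	return util;
}
```

With a zeroed model_rq and the flag cleared, a utilization of 300 is
returned unchanged. With the flag set and the same zeroed rq, the result
is 0, which is the "all rqs are capped to 0" behaviour described in the
commit message.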