Hi Tejun,

On Fri, Dec 28, 2018 10:03 AM, Tejun Heo wrote:
>
> On Thu, Dec 27, 2018 at 05:53:52PM -0800, Tejun Heo wrote:
> > Vincent knows that part way better than me but I think the safest way
> > would be doing the optimization removal iff tmp_alone_branch is
> > already pointing to leaf_cfs_rq_list. IIUC, it's pointing to
> > something else only while a branch is being built and deferring
> > optimization removal by an avg update cycle isn't gonna make any
> > difference anyway.
>
> So, something like the following. Xie, can you see whether the
> following patch resolves the problem?
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index d1907506318a..88b9118b5191 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -7698,7 +7698,8 @@ static void update_blocked_averages(int cpu)
>  		 * There can be a lot of idle CPU cgroups. Don't let fully
>  		 * decayed cfs_rqs linger on the list.
>  		 */
> -		if (cfs_rq_is_decayed(cfs_rq))
> +		if (cfs_rq_is_decayed(cfs_rq) &&
> +		    rq->tmp_alone_branch == &rq->leaf_cfs_rq_list)
>  			list_del_leaf_cfs_rq(cfs_rq);
>  
>  		/* Don't need periodic decay once load/util_avg are null */

Tested-by: Zhipeng Xie <xiezhipe...@huawei.com>

This patch fixes things for me; we haven't seen a crash yet.

-- 
Thanks,
Zhipeng Xie