On Thursday, September 7, 2017 11:26:16 AM CEST Peter Zijlstra wrote:
> On Thu, Sep 07, 2017 at 11:13:38AM +0200, Peter Zijlstra wrote:
> > Subject: sched/cpuset/pm: Fix cpuset vs suspend-resume
> >
> > Cpusets vs suspend-resume is _completely_ broken. And it got noticed
> > because it now resulted in non-cpuset usage breaking too.
> >
> > On suspend cpuset_cpu_inactive() doesn't call into
> > cpuset_update_active_cpus() because it doesn't want to move tasks about,
> > there is no need, all tasks are frozen and won't run again until after
> > we've resumed everything.
> >
> > But this means that when we finally do call into
> > cpuset_update_active_cpus() after resuming the last frozen cpu in
> > cpuset_cpu_active(), the top_cpuset will not have any difference with
> > the cpu_active_mask and thus it will not in fact do _anything_.
> >
> > So the cpuset configuration will not be restored. This was largely
> > hidden because we would unconditionally create identity domains and
> > mobile users would not in fact use cpusets much. And servers that do use
> > cpusets tend to not suspend-resume much.
> >
> > An additional problem is that we'd not in fact wait for the cpuset work to
> > finish before resuming the tasks, allowing spurious migrations outside
> > of the specified domains.
> >
> > Fix the rebuild by introducing cpuset_force_rebuild() and fix the
> > ordering with cpuset_wait_for_hotplug().
> >
> > Cc: t...@kernel.org
> > Cc: r...@rjwysocki.net
> > Cc: efa...@gmx.de
> > Reported-by: Andy Lutomirski <l...@kernel.org>
> > Signed-off-by: Peter Zijlstra (Intel) <pet...@infradead.org>
>
> TJ, I _think_ it was commit:
>
>   deb7aa308ea2 ("cpuset: reorganize CPU / memory hotplug handling")
>
> That wrecked things, but there have been so many changes in this area it is
> really hard to tell. Note how before that commit it would
> unconditionally rebuild the domains, and you 'optimized' that ;-)
>
> That commit also introduced the work to do the async rebuild and failed
> to do that flush on resume.
>
> In any case, I think we should put a fixes tag on this commit such that
> it gets picked up into stable kernels. Not sure anybody will try and
> backport it into 4 year old kernels, but who knows.

Many thanks for fixing this!
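
For anyone following the thread, the failure mode described in the quoted changelog can be modeled in plain userspace C. This is only a toy sketch and not the kernel code: the struct, the toy_* function names, and the use of unsigned bitmasks in place of cpumasks are all hypothetical. It models just the "no difference, so do nothing" shortcut and a force flag in the spirit of cpuset_force_rebuild(); the ordering fix done by cpuset_wait_for_hotplug() is not modeled.

```c
#include <stdbool.h>

/* Toy model only: bitmasks stand in for cpumasks, all names are hypothetical. */
struct toy_cpuset {
    unsigned active_mask;  /* CPUs currently active */
    unsigned top_mask;     /* mask the top cpuset last acted on */
    bool force_rebuild;    /* set on resume to defeat the "no change" shortcut */
    int rebuilds;          /* number of domain rebuilds performed */
};

/* Suspend path: the CPU goes inactive, cpuset state is deliberately left alone. */
static void toy_suspend_cpu(struct toy_cpuset *cs, int cpu)
{
    cs->active_mask &= ~(1u << cpu);
}

/* Resume path: the CPU becomes active again. */
static void toy_resume_cpu(struct toy_cpuset *cs, int cpu)
{
    cs->active_mask |= 1u << cpu;
}

/* Hotplug work: rebuild only if the masks differ or a rebuild was forced. */
static bool toy_hotplug_work(struct toy_cpuset *cs)
{
    bool changed = cs->top_mask != cs->active_mask;

    if (!changed && !cs->force_rebuild)
        return false;          /* the shortcut that skips the rebuild */

    cs->top_mask = cs->active_mask;
    cs->force_rebuild = false;
    cs->rebuilds++;
    return true;
}

/* One suspend/resume cycle on a 4-CPU system; returns the rebuild count. */
static int toy_cycle(bool use_force)
{
    struct toy_cpuset cs = { 0xFu, 0xFu, false, 0 };
    int cpu;

    for (cpu = 1; cpu < 4; cpu++)   /* take the non-boot CPUs down */
        toy_suspend_cpu(&cs, cpu);
    for (cpu = 1; cpu < 4; cpu++)   /* bring them back up */
        toy_resume_cpu(&cs, cpu);

    if (use_force)
        cs.force_rebuild = true;    /* analogous in spirit to the force flag */

    toy_hotplug_work(&cs);          /* runs once, after the last CPU is back */
    return cs.rebuilds;
}
```

In this model toy_cycle(false) performs no rebuild, because by the time the work runs the two masks are identical again, while toy_cycle(true) performs exactly one.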