On Wed, 15 Nov 2017 14:25:29 -0500
joe.ko...@concurrent-rt.com wrote:

> 4.4.86-rt99's patch
>
>   0037-Intrduce-migrate_disable-cpu_light.patch
>
> introduces a place where a task's cpus_allowed mask is
> updated without a corresponding update to nr_cpus_allowed.
>
> This path is executed when task affinity is changed while
> migrate_disabled() is true. As there is no code present
> to set nr_cpus_allowed when the migrate_disable state is
> dropped, the scheduler from that point on may make incorrect
> scheduling decisions for this task.
>
> My testing consists of temporarily adding a
>
>   if (tsk_nr_cpus_allowed(p) != cpumask_weight(tsk_cpus_allowed(p)))
>       printk_ratelimited(...)

Have you tested v4.9-rt or 4.13-rt if it has the same bug?

If it is a bug in 4.13-rt then it needs to go there first, and then
backported to the stable releases (which I'm actually working on now).

-- Steve

>
> stmt to schedule() and running a simple affinity rotation
> program I wrote, one that rotates the threads of stress(1).
> While rotating, I got the expected kernel error messages.
> With this patch applied the messages disappeared.
>
> Signed-off-by: Joe Korty <joe.ko...@concurrent-rt.com>
>
> Index: b/kernel/sched/core.c
> ===================================================================
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -1220,6 +1220,7 @@ void do_set_cpus_allowed(struct task_str
>  	lockdep_assert_held(&p->pi_lock);
>
>  	if (__migrate_disabled(p)) {
> +		p->nr_cpus_allowed = cpumask_weight(new_mask);
>  		cpumask_copy(&p->cpus_allowed, new_mask);
>  		return;
>  	}
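
For readers following the thread, the invariant at stake (nr_cpus_allowed
must equal the weight of cpus_allowed) can be illustrated with a small
user-space model. This is only a sketch, not kernel code: struct toy_task,
toy_weight(), toy_set_cpus_allowed() and toy_is_consistent() are hypothetical
names invented here, a 64-bit integer stands in for the cpumask, and the
with_fix flag merely switches between the behaviour before and after the
one-line patch quoted above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-in for the affinity-related fields of a task (hypothetical). */
struct toy_task {
	uint64_t cpus_allowed;    /* one bit per allowed CPU */
	int      nr_cpus_allowed; /* cached weight of cpus_allowed */
	bool     migrate_disabled;
};

/* Count the set bits in the mask (plays the role of cpumask_weight()). */
static int toy_weight(uint64_t mask)
{
	int n = 0;

	while (mask) {
		mask &= mask - 1; /* clear the lowest set bit */
		n++;
	}
	return n;
}

/*
 * Model of the affinity update. When migration is disabled, only the
 * mask is copied unless with_fix is set, in which case the cached
 * count is refreshed as well, mirroring the proposed patch.
 */
static void toy_set_cpus_allowed(struct toy_task *p, uint64_t new_mask,
				 bool with_fix)
{
	if (p->migrate_disabled) {
		if (with_fix)
			p->nr_cpus_allowed = toy_weight(new_mask);
		p->cpus_allowed = new_mask;
		return;
	}

	p->cpus_allowed = new_mask;
	p->nr_cpus_allowed = toy_weight(new_mask);
}

/* The check the temporary debug statement in schedule() is looking for. */
static bool toy_is_consistent(const struct toy_task *p)
{
	return p->nr_cpus_allowed == toy_weight(p->cpus_allowed);
}
```

In this model, changing the mask of a migrate-disabled task without the fix
leaves toy_is_consistent() false, and nothing later repairs the cached count;
with the fix, the count and the mask stay in step.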