On Thu, Sep 24, 2020 at 08:59:33PM +0100, Valentin Schneider wrote: > > @@ -2025,19 +2138,8 @@ static int __set_cpus_allowed_ptr(struct > > if (cpumask_test_cpu(task_cpu(p), new_mask)) > > goto out; > > I think this needs a cancellation of any potential pending migration > requests. Consider a task P0 running on CPU0: > > P0 P1 P2 > > migrate_disable(); > <preempt> > set_cpus_allowed_ptr(P0, CPU1); > // waits for completion > > set_cpus_allowed_ptr(P0, CPU0); > // Already good, > no waiting for completion > <resumes> > migrate_enable(); > // task_cpu(p) allowed, no move_task() > > AIUI in this scenario P1 would stay forever waiting.
Hurmph, looking at it, I think you're right. But I'm fairly sure I did test that, maybe I just didn't run it long enough to hit the window ... > I *think* this can be > cured by making this function slightly more hideous: It's a real beauty isn't it :-/ > --- > diff --git a/kernel/sched/core.c b/kernel/sched/core.c > index 01113e6f941f..829334f00f7b 100644 > --- a/kernel/sched/core.c > +++ b/kernel/sched/core.c > @@ -2102,6 +2102,8 @@ static int __set_cpus_allowed_ptr(struct task_struct *p, > u32 flags) > { > const struct cpumask *cpu_valid_mask = cpu_active_mask; > + struct set_affinity_pending *pending; > + bool cancel_pending = false; > unsigned int dest_cpu; > struct rq_flags rf; > struct rq *rq; > @@ -2158,14 +2160,20 @@ static int __set_cpus_allowed_ptr(struct task_struct > *p, > } > > /* Can the task run on the task's current CPU? If so, we're done */ > - if (cpumask_test_cpu(task_cpu(p), new_mask)) > + if (cpumask_test_cpu(task_cpu(p), new_mask)) { > + cancel_pending = true; > goto out; > + } > > return move_task(rq, &rf, p, dest_cpu, flags); > > out: > + pending = p->migration_pending; > task_rq_unlock(rq, p, &rf); > > + if (cancel_pending && pending) > + complete_all(&pending->done); > + > return ret; > } He who completes pending should also clear ->migration_pending, otherwise the next caller will be able to observe a dangling pointer. The other approach is trying to handle that last condition in move_task(), but I'm quite sure that's going to be aweful too :/