On 07/14, Mathieu Desnoyers wrote:
>
> * Oleg Nesterov ([EMAIL PROTECTED]) wrote:
> > On 07/14, Mathieu Desnoyers wrote:
> > >
> > > @@ -4891,10 +4948,42 @@ static int migration_thread(void *data)
> > >           list_del_init(head->next);
> > >  
> > >           spin_unlock(&rq->lock);
> > > -         __migrate_task(req->task, cpu, req->dest_cpu);
> > > +         migrated = __migrate_task(req->task, cpu, req->dest_cpu);
> > >           local_irq_enable();
> > > -
> > > -         complete(&req->done);
> > > +         if (!migrated) {
> > > +                 /*
> > > +                  * If the process has not been migrated, let it run
> > > +                  * until it reaches a migration_check() so it can
> > > +                  * wake us up.
> > > +                  */
> > > +                 spin_lock_irq(&rq->lock);
> > > +                 head = &rq->migration_queue;
> > > +                 list_add(&req->list, head);
> > > +                 if (req->task->se.on_rq
> > > +                                 || !task_migrate_count(req->task)) {
> > > +                         /*
> > > +                          * The process is on the runqueue, it could
> > > +                          * exit its critical section at any moment,
> > > +                          * don't race with it and retry actively.
> > > +                          * Also, if the thread is not on the runqueue
> > > +                          * and has a zero migration count
> > > +                          * (__migrate_task failed because cpus allowed
> > > +                          * changed), just retry.
> > > +                          */
> > > +                         spin_unlock_irq(&rq->lock);
> > > +                         continue;
> > 
> > Again, this can deadlock. migration_thread() is SCHED_FIFO, and it shares 
> > the
> > same CPU with req->task. We are doing a busy-wait loop, req->task may have 
> > no
> > chance to finish its critical section.
> > 
> 
> If we share the CPU with the other thread, it means that it won't be on
> the runqueue while we are holding the rq lock.

Why? The req->task could be runnable, but preempted by migration_thread().
In that case req->task->se.on_rq should be true.

I didn't read the new scheduler yet, but I belive on_rq == 0 only when
the task sleeps, it is like the current ->array = NULL. Please correct
me if I am wrong.

Oleg.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Reply via email to