Paul E. McKenney wrote:
>
> > The other thing that jumped out at me is that signals are very different
> > animals from a locking viewpoint depending on whether they are:
> >
> > 1.  ignored,
> >
> > 2.  caught by a single thread,
> >
> > 3.  fatal to multiple threads/processes (though I don't know
> >     of anything that shares sighand_struct between separate
> >     processes), or
> >
> > 4.  otherwise global to multiple threads/processes (such as
> >     SIGSTOP and SIGCONT).
> >
> > And there are probably other distinctions that I have not yet caught
> > on to.
> >
> > One way to approach this would be to make your suggested lock_task_sighand()
> > look at the signal and acquire the appropriate locks.  If, having acquired
> > a given set of locks, it found that the needed set had changed (e.g., due
> > to racing exec() or sigaction()), then it drops the locks and retries.
>
> OK, for this sort of approach to work, lock_task_sighand() must take and
> return some sort of mask indicating what locks are held.  The mask returned
> by lock_task_sighand() would then be passed to an unlock_task_sighand().

Sorry, I don't understand you. CLONE_THREAD implies CLONE_SIGHAND,
so we always need to lock one ->sighand. Could you please clarify?

> > > + if (!ret && sig && (sp = p->sighand)) {
> > >           if (!get_task_struct_rcu(p)) {
> > >                   return -ESRCH;
> > >           }
> > > -         spin_lock_irqsave(&p->sighand->siglock, flags);
> > > +         spin_lock_irqsave(&sp->siglock, flags);
> > > +         if (p->sighand != sp) {
> > > +                 spin_unlock_irqrestore(&sp->siglock, flags);
> > > +                 put_task_struct(p);
> > > +                 goto retry;
> > > +         }
> > >           ret = __group_send_sig_info(sig, info, p);
> > > -         spin_unlock_irqrestore(&p->sighand->siglock, flags);
> > > +         spin_unlock_irqrestore(&sp->siglock, flags);
> > >           put_task_struct(p);
> >
> > Do we really need get_task_struct_rcu/put_task_struct here?
> >
> > The task_struct can't go away under us, it is rcu protected.
> > When ->sighand is locked, and it is still the same after
> > the re-check, it means that 'p' has not done __exit_signal()
> > yet, so it is safe to send the signal.
> >
> > And if the task has ->usage == 0, it means that it also has
> > ->sighand == NULL, and your code will notice that.
> >
> > No?
>
> Seems plausible.  I got paranoid after seeing the lock dropped in
> handle_stop_signal(), though.

Yes, this is bad and should be fixed, I agree.

But why do you think we need to bump task->usage? It can't make any
difference, afaics. The task_struct can't dissapear, the caller was
converted to use rcu_read_lock() or it holds tasklist_lock.

Nonzero task_struct->usage can't stop do_exit or sys_wait4, it will
only postpone call_rcu(__put_task_struct_cb).

And after we locked ->sighand we have sufficient memory barrier, so
if we read the stale value into 'sp' we will notice that (if you were
worried about this issue).

Am I missed something?

>  void exit_sighand(struct task_struct *tsk)
>  {
>       write_lock_irq(&tasklist_lock);
> -     __exit_sighand(tsk);
> +     spin_lock(&tsk->sighand->siglock);
> +     if (tsk->sighand != NULL) {
> +             __exit_sighand(tsk);
> +     }
> +     spin_unlock(&tsk->sighand->siglock);
>       write_unlock_irq(&tasklist_lock);
>  }

Very strange code. Why this check? And what happens with

        spin_unlock(&tsk->sighand->siglock);

when tsk->sighand == NULL ?

Oleg.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Reply via email to