Re: [PATCH] fix send_sigqueue() vs thread exit race

2005-08-23 Thread Thomas Gleixner
On Tue, 2005-08-23 at 20:17 +0400, Oleg Nesterov wrote: > Thomas Gleixner wrote: > > I still think the last patch I sent is still necessary. > > Thomas, you know that I like this change in __exit_{signal,sighand}, > but i think this change is dangerous, should go in a separate patch, > and needs a

Re: [PATCH] fix send_sigqueue() vs thread exit race

2005-08-23 Thread Oleg Nesterov
Thomas Gleixner wrote: > > On Mon, 2005-08-22 at 20:45 +0400, Oleg Nesterov wrote: > > > > kernel/posix-timers.c:common_timer_del() calls del_timer_sync(), after > > that nobody can access this timer, so we don't need to lock timer->it_lock > > at all in this case. No lock - no deadlock. > > It sti

Re: [PATCH] fix send_sigqueue() vs thread exit race

2005-08-23 Thread Thomas Gleixner
On Mon, 2005-08-22 at 20:45 +0400, Oleg Nesterov wrote: > But I know nothing about kernel/posix-cpu-timers.c, I doubt it will work > for posix_cpu_timer_del(). I don't have time to study posix-cpu-timers now. > However, I see that __exit_signal() calls posix_cpu_timers_exit_xxx(), so > may be it ca

Re: [PATCH] fix send_sigqueue() vs thread exit race

2005-08-23 Thread Thomas Gleixner
On Mon, 2005-08-22 at 20:45 +0400, Oleg Nesterov wrote: > Thomas Gleixner wrote: > Ok, exit_itimers()->itimer_delete() called when the last thread exits > or does exec. > > kernel/posix-timers.c:common_timer_del() calls del_timer_sync(), after > that nobody can access this timer, so we don't need

Re: [PATCH] fix send_sigqueue() vs thread exit race

2005-08-22 Thread Oleg Nesterov
Thomas Gleixner wrote: > > But we can not check for p->sighand == NULL, as sighand is released > after exit_itimers() so we are still deadlock prone. So I think > __exit_sighand() should be called before exit_itimers(). Then we can do > > retry: > if (unlikely(!p->sighand)) >

Re: [PATCH] fix send_sigqueue() vs thread exit race

2005-08-22 Thread Oleg Nesterov
Thomas Gleixner wrote: > > @@ -1427,7 +1434,18 @@ send_group_sigqueue(int sig, struct sigq > int ret = 0; > > BUG_ON(!(q->flags & SIGQUEUE_PREALLOC)); > - read_lock(&tasklist_lock); > +retry: > + if (unlikely(p->flags & PF_EXITING)) > + return -1; > + I

Re: [PATCH] fix send_sigqueue() vs thread exit race

2005-08-22 Thread Thomas Gleixner
On Mon, 2005-08-22 at 10:39 +0400, Oleg Nesterov wrote: > Thomas Gleixner wrote: > > > > @@ -1427,7 +1434,18 @@ send_group_sigqueue(int sig, struct sigq > > int ret = 0; > > > > BUG_ON(!(q->flags & SIGQUEUE_PREALLOC)); > > - read_lock(&tasklist_lock); > > +retry: > > +

Re: [PATCH] fix send_sigqueue() vs thread exit race

2005-08-22 Thread Thomas Gleixner
Oleg, On Mon, 2005-08-22 at 12:52 +0400, Oleg Nesterov wrote: > Thomas Gleixner wrote: > I think yes, and this is closely related to your and Paul > "Use RCU to protect tasklist for unicast signals" patches. > > We don't need RCU here, but we do need this hypothetical > lock_task_sighand() (and _

Re: [PATCH] fix send_sigqueue() vs thread exit race

2005-08-22 Thread Oleg Nesterov
Thomas Gleixner wrote: > > It exists. It triggers on preempt-RT and I can trigger it on vanilla SMP > by waiting for the timer expiry in release_task() before the > __exit_signal() call. That's reasonable, as it can happen that way by > chance too. It requires that the timer expires on a different

Re: [PATCH] fix send_sigqueue() vs thread exit race

2005-08-21 Thread Thomas Gleixner
On Sun, 2005-08-21 at 23:24 +0200, Thomas Gleixner wrote: > Oleg, Sorry for making noise. I introduced the race again. tglx --- linux-2.6.13-rc6.signal/kernel/signal.oleg.c2005-08-21 23:07:03.0 +0200 +++ linux-2.6.13-rc6.signal/kernel/signal.c 2005-08-21 23:48:52.0

Re: [PATCH] fix send_sigqueue() vs thread exit race

2005-08-21 Thread Thomas Gleixner
Oleg, On Sun, 2005-08-21 at 14:59 +0400, Oleg Nesterov wrote: > CPU_0 CPU_1 > > release_task: posix_timer_fn: > write_lock(tasklist); lock(timer->it_lock); > > exit_timers:send_sigxx

Re: [PATCH] fix send_sigqueue() vs thread exit race

2005-08-21 Thread Thomas Gleixner
On Sun, 2005-08-21 at 14:41 +0400, Oleg Nesterov wrote: > If not: > > > timer event > > timr->it_process references a freed structure > > > > No, create_timer() does get_task_struct(timr->it_process), this > task may be EXIT_DEAD now, but the task_struct itself is valid, > and it's ->fl

Re: [PATCH] fix send_sigqueue() vs thread exit race

2005-08-21 Thread Oleg Nesterov
Thomas Gleixner wrote: > > On Sat, 2005-08-20 at 20:58 +0400, Oleg Nesterov wrote: > > [PATCH] fix send_sigqueue() vs thread exit race > > > > . > > - read_lock(&tasklist_lock); > > + read_lock(&tasklist_lock); > > + > > + if (unlikely(p->flags & PF_EXITING)) { > > + re

Re: [PATCH] fix send_sigqueue() vs thread exit race

2005-08-21 Thread Thomas Gleixner
On Sat, 2005-08-20 at 20:58 +0400, Oleg Nesterov wrote: > [PATCH] fix send_sigqueue() vs thread exit race > > posix_timer_event() first checks that the thread (SIGEV_THREAD_ID > case) does not have PF_EXITING flag, then it calls send_sigqueue() > which locks task list. But if the thread exits in b