Hi -

Looks like there's a race condition when we destroy a condition variable.
My understanding of the expected behavior is that once all the threads have
been signaled (i.e., pthread_cond_broadcast has been called), the condition
variable can be safely destroyed with pthread_cond_destroy.

The problem is in glibc's libpthread/sysdeps/generic/pt-cond-timedwait.c.
After __pthread_block() returns, we spinlock on cond->__lock.  The problem
is that our __pthread_block() is just a mach_msg receive, and our
__pthread_wakeup() (called by pthread_cond_broadcast) is just a mach_msg
send.

So we can do a pthread_cond_broadcast, which will send messages to all
waiting threads, but there's no guarantee that the threads have received
the message; it could still be queued.  Then we destroy the condition
variable and free its memory, and only afterwards does the thread receive
the message and try to spinlock on a freed region of memory.
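
To make the sequence concrete, here is a minimal sketch of the calling
pattern in question.  The names (struct item, waiter,
broadcast_then_destroy) are made up for illustration and are not taken
from libpager or libpthread.  The sketch joins the waiter before freeing
the memory, so it should run cleanly anywhere; the hazard described above
only bites when the memory is freed or reused right after
pthread_cond_destroy, as marked in the comments.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdlib.h>

/* Hypothetical object that owns a condition variable. */
struct item {
    pthread_mutex_t lock;
    pthread_cond_t  cond;
    int             waiting;  /* set once the waiter is about to block */
    int             done;     /* predicate the waiter sleeps on */
};

static void *waiter(void *arg)
{
    struct item *it = arg;

    pthread_mutex_lock(&it->lock);
    it->waiting = 1;
    while (!it->done)
        pthread_cond_wait(&it->cond, &it->lock);
    assert(it->done);
    pthread_mutex_unlock(&it->lock);
    return NULL;
}

/* Returns the result of pthread_cond_destroy, or -1 on setup failure. */
int broadcast_then_destroy(void)
{
    struct item *it = malloc(sizeof *it);
    pthread_t t;
    int rc;

    if (it == NULL)
        return -1;
    pthread_mutex_init(&it->lock, NULL);
    pthread_cond_init(&it->cond, NULL);
    it->waiting = 0;
    it->done = 0;
    if (pthread_create(&t, NULL, waiter, it) != 0) {
        free(it);
        return -1;
    }

    /* Poll until the waiter has reached its wait loop. */
    pthread_mutex_lock(&it->lock);
    while (!it->waiting) {
        pthread_mutex_unlock(&it->lock);
        sched_yield();
        pthread_mutex_lock(&it->lock);
    }
    it->done = 1;
    pthread_cond_broadcast(&it->cond);   /* wakeup sent, maybe not yet received */
    pthread_mutex_unlock(&it->lock);

    /* Destroying right after the broadcast is the step under discussion. */
    rc = pthread_cond_destroy(&it->cond);

    /* Freeing "it" at this point, before the waiter has left
       pthread_cond_wait, is what would expose the race.  The sketch
       joins first so that it stays safe either way. */
    pthread_join(t, NULL);
    pthread_mutex_destroy(&it->lock);
    free(it);
    return rc;
}
```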

It looks like the whole reason for that spinlock is to figure out whether
somebody else removed us from the wait queue, and to remove ourselves from
the wait queue if they did not (i.e., because we timed out).

I'm puzzling about how to fix it, other than by reorganizing my libpager
code so that condition variables don't get destroyed very often.
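
One possible caller-side workaround, offered only as a sketch and not as a
fix for libpthread itself, is to count the threads that are inside
pthread_cond_wait and hold off the destroy until that count drops to zero.
The counted_cond type and its functions below are hypothetical names
invented for this example.  It assumes that pthread_cond_wait no longer
touches the condition variable once it has returned, and that the caller
has already set its predicate so that woken waiters do not wait again.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>

/* Hypothetical wrapper; none of these names exist in libpthread. */
struct counted_cond {
    pthread_cond_t cond;
    int            in_wait;  /* threads inside pthread_cond_wait;
                                guarded by the caller's mutex */
};

void counted_cond_init(struct counted_cond *cc)
{
    pthread_cond_init(&cc->cond, NULL);
    cc->in_wait = 0;
}

/* Call with *m held, exactly like pthread_cond_wait. */
int counted_cond_wait(struct counted_cond *cc, pthread_mutex_t *m)
{
    int rc;

    cc->in_wait++;
    rc = pthread_cond_wait(&cc->cond, m);
    assert(cc->in_wait > 0);
    cc->in_wait--;           /* *m is held again at this point */
    return rc;
}

/* Call with *m held and the predicate already set.  Wakes every waiter,
   waits for all of them to leave pthread_cond_wait, then destroys the
   condition variable.  Returns with *m still held. */
int counted_cond_destroy(struct counted_cond *cc, pthread_mutex_t *m)
{
    pthread_cond_broadcast(&cc->cond);
    while (cc->in_wait > 0) {
        pthread_mutex_unlock(m);
        sched_yield();
        pthread_mutex_lock(m);
    }
    return pthread_cond_destroy(&cc->cond);
}

/* Small demonstration with one waiter. */
static pthread_mutex_t demo_lock = PTHREAD_MUTEX_INITIALIZER;
static struct counted_cond demo_cc;
static int demo_done;

static void *demo_waiter(void *unused)
{
    (void)unused;
    pthread_mutex_lock(&demo_lock);
    while (!demo_done)
        counted_cond_wait(&demo_cc, &demo_lock);
    pthread_mutex_unlock(&demo_lock);
    return NULL;
}

int counted_cond_demo(void)
{
    pthread_t t;
    int rc;

    counted_cond_init(&demo_cc);
    demo_done = 0;
    if (pthread_create(&t, NULL, demo_waiter, NULL) != 0)
        return -1;

    pthread_mutex_lock(&demo_lock);
    while (demo_cc.in_wait == 0) {       /* wait until the waiter blocks */
        pthread_mutex_unlock(&demo_lock);
        sched_yield();
        pthread_mutex_lock(&demo_lock);
    }
    demo_done = 1;
    rc = counted_cond_destroy(&demo_cc, &demo_lock);
    pthread_mutex_unlock(&demo_lock);

    pthread_join(t, NULL);
    return rc;
}
```

The polling loop with sched_yield is deliberately crude; it avoids needing
a second condition variable, which would have the same teardown problem.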

Also, I'm a bit confused by the management of the source code.  Is the
authoritative copy at git://git.savannah.gnu.org:/hurd/libpthread.git?

    agape
    brent
