On Mon, Aug 06, 2012 at 07:07:23PM -0700, Thomas DiModica wrote:
> I understand now: we want to immediately dequeue ourselves, even if it means
> wasting cycles later by checking to ensure that we were dequeued. The last
> thing we want is to return ETIMEDOUT when another thread has intervened,
> and dequeued us to wake for a condition_signal. But... why are we using the
> mutex's lock, if we're queued on the condition? That's the bug.

Only one of them. There are various subtleties concerning thread queueing and wakeup. For example, the comment in __pthread_cond_broadcast states that the queue can be walked without holding a lock, but when a timeout occurs, a thread will dequeue itself from that same queue, possibly concurrently (and, as you can imagine, unsafely).

Another problem concerns the wakeup itself: as stated in __pthread_cond_timedwait_internal, messages could be queued after a thread timed out, which can have various effects, the most obvious one being spurious wakeups (there is a chance of full message queues blocking sending threads, but I'm not sure it can happen in practice).

My work on cthreads solves these issues by moving the wakeup operation into critical sections, using a per-thread flag to avoid sending more than one message.

When a thread is about to return from condition_timedwait, a few cases must be tested. Obviously, the message delivered flag must be checked, as it also tells whether the thread has already been dequeued by another one. But a set flag doesn't mean the current thread didn't time out (it could have timed out, and then received a wakeup message before grabbing the condition lock). So if a timeout occurred, the message queue must be drained to avoid spurious wakeups. If the message delivered flag is unset, no other thread acted on the condition, so the thread must remove itself from the condition queue.

It gets a bit more complicated with cancellation, as a cancellation request can suspend the target thread, so the sigstate lock must be held before getting the condition lock. Otherwise, the thread could be suspended after having acquired the condition lock, and when the sender thread then tries to wake the target thread (by using condition_broadcast), it would deadlock on the condition lock. This is all explained in libthreads/cancel-cond.c.
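
To make the cases on return from condition_timedwait concrete, here is a rough single-threaded model of the bookkeeping. It is only a sketch, not the cthreads code: every name in it (struct waiter, struct cond, cond_signal_one, waiter_receive, cond_finish_wait) is made up for illustration, the condition lock is not modeled (each function stands for one critical section), an integer counter stands in for the thread's message queue, and returning 0 rather than ETIMEDOUT when the flag is set is an assumption.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model types; none of these names come from cthreads. */
struct waiter {
    struct waiter *next;
    bool msg_delivered;   /* set by the waker inside its critical section */
    int pending_msgs;     /* stand-in for the thread's wakeup message queue */
};

struct cond {
    struct waiter *queue; /* would be protected by the condition lock */
};

/* Waiter side: queue ourselves before blocking. */
static void cond_enqueue(struct cond *c, struct waiter *w)
{
    w->msg_delivered = false;
    w->pending_msgs = 0;
    w->next = c->queue;
    c->queue = w;
}

/* Unlink w from the condition queue, if it is still there. */
static void cond_remove(struct cond *c, struct waiter *w)
{
    for (struct waiter **p = &c->queue; *p != NULL; p = &(*p)->next) {
        if (*p == w) {
            *p = w->next;
            w->next = NULL;
            return;
        }
    }
}

/* Waker side: dequeue and send in one critical section; the flag
   guarantees that at most one message is sent per wait. */
static void cond_signal_one(struct cond *c)
{
    struct waiter *w = c->queue;
    if (w == NULL)
        return;
    c->queue = w->next;
    w->next = NULL;
    if (!w->msg_delivered) {
        w->msg_delivered = true;
        w->pending_msgs++;
    }
}

/* Stand-in for the timed receive: true if a message was consumed,
   false if the wait timed out with nothing to receive. */
static bool waiter_receive(struct waiter *w)
{
    if (w->pending_msgs == 0)
        return false;
    w->pending_msgs--;
    return true;
}

/* Waiter side, after the receive returns and the condition lock is taken. */
static int cond_finish_wait(struct cond *c, struct waiter *w, bool timed_out)
{
    if (w->msg_delivered) {
        /* A waker already dequeued us. If we also timed out, its message
           is still pending: drain it to avoid a later spurious wakeup. */
        if (timed_out)
            w->pending_msgs = 0;
        return 0; /* assumption: a delivered wakeup wins over the timeout */
    }
    /* Nobody acted on the condition, so this must be a timeout and we are
       still queued: remove ourselves. */
    assert(timed_out);
    cond_remove(c, w);
    return ETIMEDOUT;
}
```

The interesting case is the third one: waiter_receive reports a timeout, cond_signal_one runs before the waiter gets to cond_finish_wait, and the waiter then has to drain the late message instead of dequeueing itself.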

I'll apply those fixes to libpthread soon, as they are perfectly relevant: the algorithms are very similar (if not exactly the same).

-- 
Richard Braun