Hi everyone,

While I was testing and debugging some of the SRFI-18 code that Neil and I were working on, I noticed a deadlock that happens in scm_join_thread_timed. I'm pretty sure it affects the 1.8 codebase as well, although it's probably more common when doing timed joins.

Thread joining in Guile (1.9 or 1.8) works as follows:

1. If the target thread has exited, return.
2. Block on the target thread's join queue.
3. When woken (because of a pthread_cond_signal, a spurious pthreads wakeup, or, in 1.9, a timeout expiration), check the target thread's exit status -- if it has exited, return.
4. Otherwise, SCM_TICK.
5. Go to step 2.

The deadlock can happen if the thread exits during the tick: the wakeup it sends on exit arrives while the joiner is not waiting on the join queue, and there's no check of the exit status before block_self is called again. I'm pretty sure that moving step 1 into the beginning of the loop would fix this -- I can submit a patch against 1.8, 1.9, or both. Let me know what you guys would like.

Regards,
Julian
