[ Followups to -current ]

On Thu, 8 Nov 2001, Loren James Rittle wrote:

> Hello all,
> 
> I have ported the most recent version of boehm-gc (6.1-alpha) to
> FreeBSD/i386 under the auspice of the gcc project (it will be in Hans'
> 6.1 release and it is on the gcc mainline).  I got one notable thing
> fully configured beyond what is in the ports tree (which is based on
> 6.0): threaded GC is now supported.  However, this work has uncovered
> either a rare race condition in the 4.X pthread implementation (also
> seen on a current 5.0 system) or a bad assumption in the GC signal
> code (abstracted below).  Either way, the result seen is an undetected
> deadlock.  With the following new assertion, I can at least force the
> condition to be detectable in many cases where it would have locked up.
> 
> Two questions come to mind: Is there any condition under which my new
> assumption should not be true?  Is there any obvious mistake that a
> threaded application can make (perhaps related to its signal use) that
> could cause the new assumption to ever be violated?
> 
> I have also seen what I thought was a less important issue, but I now
> see that it is probably related.  After reviewing the FreeBSD uthread
> source code, the issue appears to be a race between the pthread_exit()
> code running in one thread and the pthread_join() code running in
> another thread in conjunction with a sigsuspend() call occurring on a
> signal handler of that second thread.  Under some conditions, an
> errant EINTR would be returned to the pthread_join() caller instead of
> the exit code from the terminated thread.  Under other timing
> conditions, you get the deadlock spotted with the above new assertion.

Try the following patch; this is to -current, you'll have to massage
it a bit for -stable (Hint: s/curthread/_thread_run/ in -stable).

-- 
Dan Eischen

Index: uthread/pthread_private.h
===================================================================
RCS file: /opt/d/CVS/src/lib/libc_r/uthread/pthread_private.h,v
retrieving revision 1.63
diff -u -r1.63 pthread_private.h
--- uthread/pthread_private.h   26 Oct 2001 21:19:22 -0000      1.63
+++ uthread/pthread_private.h   10 Nov 2001 14:35:48 -0000
@@ -601,6 +601,11 @@
        /* XXX - What about thread->timeout and/or thread->error? */
 };
 
+struct join_status {
+       struct pthread  *thread;
+       int             ret;
+       int             error;
+};
 
 /*
  * Normally thread contexts are stored as jmp_bufs via 
_setjmp()/_longjmp(),
@@ -757,8 +762,12 @@
         */
        int     error;
 
-       /* Pointer to a thread that is waiting to join (NULL if no joiner). */
-       struct pthread *joiner;
+       /*
+        * The joiner is the thread that is joining to this thread.  The
+        * join status keeps track of a join operation to another thread.
+        */
+       struct pthread          *joiner;
+       struct join_status      join_status;
 
        /*
         * The current thread can belong to only one scheduling queue at
Index: uthread/uthread_exit.c
===================================================================
RCS file: /opt/d/CVS/src/lib/libc_r/uthread/uthread_exit.c,v
retrieving revision 1.23
diff -u -r1.23 uthread_exit.c
--- uthread/uthread_exit.c      20 May 2001 23:08:32 -0000      1.23
+++ uthread/uthread_exit.c      10 Nov 2001 14:36:50 -0000
@@ -220,8 +220,9 @@
                }
 
                /* Set the return value for the joining thread: */
-               pthread->ret = curthread->ret;
-               pthread->error = 0;
+               pthread->join_status.ret = curthread->ret;
+               pthread->join_status.error = 0;
+               pthread->join_status.thread = NULL;
 
                /* Make this thread collectable by the garbage collector. */
                PTHREAD_ASSERT(((curthread->attr.flags & PTHREAD_DETACHED) ==
Index: uthread/uthread_join.c
===================================================================
RCS file: /opt/d/CVS/src/lib/libc_r/uthread/uthread_join.c,v
retrieving revision 1.19
diff -u -r1.19 uthread_join.c
--- uthread/uthread_join.c      16 Aug 2001 06:31:32 -0000      1.19
+++ uthread/uthread_join.c      10 Nov 2001 14:36:14 -0000
@@ -122,18 +122,20 @@
                pthread->joiner = curthread;
 
                /* Keep track of which thread we're joining to: */
-               curthread->data.thread = pthread;
+               curthread->join_status.thread = pthread;
 
-               /* Schedule the next thread: */
-               _thread_kern_sched_state(PS_JOIN, __FILE__, __LINE__);
+               while (curthread->join_status.thread == pthread) {
+                       /* Schedule the next thread: */
+                       _thread_kern_sched_state(PS_JOIN, __FILE__, __LINE__);
+               }
 
                /*
                 * The thread return value and error are set by the thread we're
                 * joining to when it exits or detaches:
                 */
-               ret = curthread->error;
+               ret = curthread->join_status.error;
                if ((ret == 0) && (thread_return != NULL))
-                       *thread_return = curthread->ret;
+                       *thread_return = curthread->join_status.ret;
        } else {
                /*
                 * The thread exited (is dead) without being detached, and no
Index: uthread/uthread_sig.c
===================================================================
RCS file: /opt/d/CVS/src/lib/libc_r/uthread/uthread_sig.c,v
retrieving revision 1.38
diff -u -r1.38 uthread_sig.c
--- uthread/uthread_sig.c       29 Jun 2001 17:09:07 -0000      1.38
+++ uthread/uthread_sig.c       10 Nov 2001 14:28:09 -0000
@@ -671,7 +671,6 @@
         * signal handler to run:
         */
        case PS_COND_WAIT:
-       case PS_JOIN:
        case PS_MUTEX_WAIT:
                /*
                 * Remove the thread from the wait queue.  It will
@@ -679,6 +678,17 @@
                 * handlers have been invoked.
                 */
                PTHREAD_WAITQ_REMOVE(pthread);
+               break;
+
+       case PS_JOIN:
+               /*
+                * Remove the thread from the wait queue.  It will
+                * be added back to the wait queue once all signal
+                * handlers have been invoked.
+                */
+               PTHREAD_WAITQ_REMOVE(pthread);
+               /* Make the thread runnable: */
+               PTHREAD_SET_STATE(pthread, PS_RUNNING);
                break;
 
        /*


To Unsubscribe: send mail to [EMAIL PROTECTED]
with "unsubscribe freebsd-current" in the body of the message

Reply via email to