Re: open-pipe deadlocked

rixed Fri, 02 Sep 2011 02:26:57 -0700

> For 1.8, could you try running Helgrind and see what happens?

Helgrind complains about loads of 'possible data race' but does not
detect anything wrong when the actual deadlock occurs. When I exit
the program it does tell me that a thread still holds a lock, but
it does not reveal the address of that lock in a way that is
meaningful to me:


==26762== Thread #1: Exiting thread still holds 1 lock
==26762==    at 0x5A81B4D: waitpid (waitpid.c:41)
==26762==    by 0x4F0A289: scm_waitpid (posix.c:560)
==27182==    by 0x5A7BF09: pthread_mutex_lock (pthread_mutex_lock.c:61)
==26762==    by 0x4E8FCBF: deval (eval.c:4229)
==27182==    by 0x4C25BEF: pthread_mutex_lock (hg_intercepts.c:488)
==27182==    by 0x4EF6606: scm_i_thread_put_to_sleep (threads.c:1676)
==26762==    by 0x4E89B4F: scm_i_eval_x (eval.c:5900)
==27182==    by 0x4E96D93: scm_i_gc (gc.c:550)
==27182==    by 0x4E96CBC: scm_gc_for_newcell (gc.c:507)
==26762==    by 0x4E8FCED: deval (eval.c:4232)
==27182==    by 0x4EAC1B8: scm_cell (inline.h:122)
==26762==    by 0x4E89C62: scm_i_eval (eval.c:5910)
==26762==    by 0x4E710D7: scm_start_stack (debug.c:457)
==26762==    by 0x4E71199: scm_m_start_stack (debug.c:473)
==26762==
==27182==    by 0x4E91F5E: scm_dapply (eval.c:5012)
==27182==

(how pthread_mutex_lock appears to call scm_waitpid is not clear to me)

I don't know exactly how Helgrind works, and thus cannot be sure
it is supposed to detect when a thread locks a mutex it already owns
(especially after a fork).

As to why it does not happen with guile2, this is still a mystery. My
theory about this deadlock is that the thread that calls open-process
owns the scm_i_port_table_mutex when open-process is called, and thus
the port-for-each call deadlocks. But guile2's open-process does the
same fork (not vfork) and takes the same scm_i_port_table_mutex in
port-for-each, that mutex is still not recursive, and yet it does not
deadlock. So maybe my theory is wrong in the first place. Or maybe the
path that calls open-process while scm_i_port_table_mutex is locked
disappeared in guile2, perhaps due to the change of garbage collector
(since the GC also grabs this lock, I believe). Or maybe the deadlock
involves another lock in addition to this one. I'm going to turn
scm_i_port_table_mutex into a recursive mutex in order to try to
invalidate my theory. Sorry, I'm thinking aloud, but maybe this can
give you some better idea?
