On Thu, Aug 16, 2012 at 08:20:37PM +0400, Andrey Khalyavin wrote:
>On Wed, 15 Aug 2012 10:11:16 -0400, Christopher Faylor wrote:
>>On Wed, Aug 15, 2012 at 04:54:42PM +0400, Andrey Khalyavin wrote:
>>>I finally got a cygwin crash dump from our build bots. It shows, that
>>>cygwin1.dll crashes in kill_pgrp function on line:
>>>  (pid > 1 && p->pgid != pid) ||
>>>where p is a pointer to _pinfo. This function enumerates all _pinfo's
>>>and executes this line for all of them which pass p->exists() check.
>>>In crash dump p points to _pinfo that has process_state equal to
>>>PID_IN_USE | PID_EXECED.
>>
>>Thanks for tracking this down. I've added a check for "execed" to
>>_pinfo::exists.
>>
>>cgf
>I updated core libraries from 20120803 snapshot to 20120815 snapshot
>and now bash crashes when I execute rm -rf dir. Reproducibility is
>strange. It crashed for hours when I entered
>cd /tmp
>mkdir a
>rm -rf a
>commands but now suddenly stopped crashing in this case.
>It is still crashes on rm -rf in the real script we use though.
>
>Crash happens in setup_handler function on line
>  HANDLE hth = (HANDLE) *tls;
>because tls->tid equals to zero. Definition of this operation is in
>sygtls.h: operator HANDLE () const {return tid->win32_obj_id;}.
>setup_handler is called from sigpacket::process which in turn
>called from wait_sig. Signal number is 20, signal code is 28.
>All fields of tls structure are zero with exception stacklock equal
>to 1 and stackptr equal to address of tls->stack.

Sounds like a race between thread creation and signal handling. I have
added some defensive code in the latest snapshot.

cgf
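
For illustration only, here is a minimal sketch of the kind of guard that avoids dereferencing a null tid when a signal arrives before thread setup has finished. It is not the code that went into the snapshot. The names FakeThread, FakeTls, handle_or_null and try_setup_handler are hypothetical stand-ins, not the real Cygwin types or functions; only the shape of the unchecked conversion operator is taken from the report quoted above.

```cpp
#include <cassert>

// Hypothetical stand-ins for illustration; these are NOT the real Cygwin types.
using Handle = void *;

struct FakeThread {
  Handle win32_obj_id;
};

struct FakeTls {
  // Assumed to stay null until thread creation has filled it in.
  FakeThread *tid = nullptr;

  // Unchecked conversion with the same shape as the operator quoted in the
  // report: dereferences tid, so it crashes if tid is still null.
  operator Handle() const { return tid->win32_obj_id; }

  // Defensive variant: yields a null handle instead of dereferencing null.
  Handle handle_or_null() const { return tid ? tid->win32_obj_id : nullptr; }
};

// Returns true if the handler could be set up now, false if the thread is
// not fully initialized yet and the caller should leave the signal pending.
bool try_setup_handler(const FakeTls &tls) {
  Handle hth = tls.handle_or_null();
  if (!hth)
    return false;  // thread creation has not completed; try again later
  // ... the real work (suspending the thread, redirecting it to the
  // handler) would go here ...
  return true;
}
```

With a default-constructed FakeTls the function returns false rather than crashing; once tid points at a thread object with a non-null handle it returns true. Whether deferring the signal is the right policy depends on how the signal thread retries pending signals, which this sketch does not model.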