We've come across a very subtle problem in which child processes fail after running for some time. On our hyperthreaded systems, the failing child process consumes one thread and degrades the whole system (typically the desktop locks up), and powering off is the only way to recover.
We use Cygwin to provide a build system wrapper. Originally, the wrapper performed some configuration and then called exec() to run the build supervisor, and the build proceeded from there:
bash -> fork/exec -> wrapper -> exec -> supervisor -> etc
This scenario fails as described above. The time to failure varies, but it is typically within tens of minutes.
We've noticed a few ways to work around the problem:
cmd.exe -> wrapper -> exec -> supervisor -> etc
bash -> fork/exec -> wrapper -> fork/exec -> supervisor -> etc
bash -> fork/exec -> wrapper -> spawn(_P_WAIT) -> supervisor -> etc
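For illustration, the fork/exec variant of the wrapper amounts to roughly the following. This is only a minimal sketch under POSIX assumptions, not our actual wrapper: the helper name run_supervisor and the way its arguments are passed are made up for the example.

```c
#include <errno.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run the supervisor as a child process and wait
 * for it, instead of replacing the wrapper itself with exec().
 * Returns the child's exit status, or -1 if it could not be obtained. */
int run_supervisor(char *const argv[])
{
    pid_t pid = fork();
    if (pid < 0) {
        perror("fork");
        return -1;
    }

    if (pid == 0) {
        /* Child: become the supervisor. execvp() returns only on error. */
        execvp(argv[0], argv);
        perror("execvp");
        _exit(127);
    }

    /* Parent (the wrapper): stay alive until the supervisor finishes. */
    int status;
    while (waitpid(pid, &status, 0) < 0) {
        if (errno != EINTR) {
            perror("waitpid");
            return -1;
        }
    }

    /* Report a normal exit code; treat death by signal as a failure. */
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The wrapper would call this where it previously called exec() and pass the result on to exit(), so the wrapper process remains between bash and the supervisor for the duration of the build.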
By experiment, it seems that the key to the failure is the exec/exec sequence (bash's child execs the wrapper, which in turn execs the supervisor), but I do not know how that corrupts the system so badly that powering off is the only recourse.
Earl