On 9/29/2019 4:05 PM, Ken Brown wrote:
> On 9/27/2019 10:12 AM, Ken Brown wrote:
>> On 9/27/2019 9:37 AM, Norton Allen wrote:
>>> On 9/26/2019 10:50 PM, Ken Brown wrote:
>>>>
>>>>> As a simple test example, consider:
>>>>>
>>>>>    /bin/ssh-agent /bin/sleep 10
>>>>>
>>>>> While the sleep is still running, ps shows:
>>>>>
>>>>>    PID  PPID  PGID  WINPID  TTY     UID    STIME     COMMAND
>>>>>    1694 1693  1694  1576    ?       22534  00:01:10  /usr/bin/ssh-agent
>>>>>    1653 1     1653  11740   cons1   22534  00:00:37  /usr/bin/bash
>>>>>    1693 1653  1693  1552    cons1   22534  00:01:10  /usr/bin/sleep
>>>>>
>>>>> One oddity is that ssh-agent is listed as a subprocess of sleep
>>>> ...but this isn't a bug.  ssh-agent forks, and then the parent execs the
>>>> command.
>>>
>>> With the salient difference presumably being that the exec is done in the
>>> parent instead of the child as usual?
>>
>> Yes.  The idea is that 'ssh-agent command' should be more-or-less equivalent
>> to running 'command', with ssh-agent running as a subprocess.
>>
>> The ssh-agent subprocess periodically checks to see if its parent is still
>> alive, and it exits when the parent has died.  Someone should figure out why
>> this is not working on Cygwin.
>
> As an aid to someone who might want to debug this (probably Corinna when she
> returns), I've created a test program agent.c (attached) that simulates the
> relevant part of ssh-agent:
>
> 1. It forks a subprocess that periodically checks to see if its parent has
>    died, and then exits.
>
> 2. The parent execs "/usr/bin/sleep 1".
>
> As with ssh-agent, the subprocess never detects that the parent has died, and
> so it never exits.
>
> Running this program under strace shows the following error in the pinfo
> constructor:
>
>   pinfo::pinfo: couldn't duplicate parent rd_proc_pipe handle 0x1BC for forked
>   child 1666 after exec, Win32 error 5
>
> [Win32 error 5 is ERROR_ACCESS_DENIED.]
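For anyone reading this without the attachment, here is a minimal sketch of the fork-then-exec-in-parent pattern described in steps 1 and 2 above. It is not the attached agent.c. The function names parent_is_alive and fork_watcher_then_exec are made up for illustration, and I am assuming the liveness check compares getppid() against the pid saved before the fork; the real test program may do this differently.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 while the calling process is still a
   child of expected_parent, and 0 once it has been reparented.
   Assumption: parent death is detected by a change in getppid(). */
int parent_is_alive(pid_t expected_parent)
{
    return getppid() == expected_parent;
}

/* Hypothetical helper: fork a watcher subprocess, then exec the command
   in the PARENT, so that the command keeps the original pid.
   Returns only if fork or exec fails. */
int fork_watcher_then_exec(char *const argv[], unsigned poll_seconds)
{
    assert(argv != NULL && argv[0] != NULL);

    pid_t parent = getpid();   /* saved before the fork */
    pid_t child = fork();
    if (child < 0) {
        perror("fork");
        return -1;
    }

    if (child == 0) {
        /* Watcher: poll until the parent is gone, then exit. */
        while (parent_is_alive(parent))
            sleep(poll_seconds);
        _exit(0);
    }

    /* Parent: become the command, e.g. "/usr/bin/sleep 1". */
    execv(argv[0], argv);
    perror("execv");
    return -1;
}
```

Calling fork_watcher_then_exec from main with an argv of { "/usr/bin/sleep", "1", NULL } and a short poll interval mirrors the two steps above. With a check like this, the watcher should exit within one poll interval of the sleep finishing, because it gets reparented and getppid() changes. The behavior reported in this thread is that on Cygwin the watcher never sees that change.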
It seems that the pinfo constructor failure happens in
cygheap_exec_info::reattach_children().  The latter is preceded by the
following comment:

  /* Reattach non-reaped subprocesses passed in from the cygwin process
     which previously operated under this pid.  FIXME: Is there a race here
     if the process exits during cygwin's exec handoff? */

I tried running my test program under gdb with a breakpoint at
reattach_children, and the breakpoint was never hit.  That gives an
affirmative answer to the question in the FIXME.  As a result, the exec'd
program never becomes aware that it has a subprocess, so it exits without
resetting the subprocess's ppid to 1.

Is there someone out there familiar enough with Cygwin's exec to suggest a
fix?  It would be a nice gift to Corinna to get this fixed before her return.

Ken