On Mon, Feb 13, 2012 at 02:50:45PM -0800, Dmitry Mikulin wrote:
> >>>It seems that now wait4(2) can be called from the real (non-debugger)
> >>>parent first and result in the call to proc_reap(), isn't it ? We would
> >>>then just reparent the child back to the caller, still leaving the
> >>>zombie and confusing debugger.
> >>When either gdb or the real parent gets to proc_reap() the process
> >>wouldn't get destroyed, it'll get caught by the following clause:
> >>	if (p->p_oppid && (t = pfind(p->p_oppid)) != NULL) {
> >>
> >>and the real parent with get the child back into the children's list while
> >>gdb will get it into the orphan list. The second time around when
> >>proc_reap() is entered, p->p_oppid will be 0 and the process will get
> >>really reaped. Does it make sense? And proc_reparent() attempts to keep
> >>the orphan list clean and not have the same entries and the list of
> >>siblings.
> >Right, this is what I figured. But I asked about some further implication
> >of this change:
> >
> >if real parent spuriosly calls wait4(2) on the child pid after the child
> >exited, but before the debugger called the wait4(), then exactly the
> >code you noted above will be run. This results in the child being fully
> >returned to the original parent.
> >
> >Next, the wait4() call from debugger gets an error, and zombie will be
> >kept around until parent calls wait4() for this pid once more.
> >
> >Am I missed something ?
>
> In this case the process will move from gdb's child list to gdb's orphan
> list when the real parent does a wait4(). Next time around the wait loop in
> gdb it'll be caught by the orphan's proc_reap().

I do not see how the next debugger loop could find this process at all, since the first wait4() call reparented it to the original parent.
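
To make the sequence concrete, here is a toy user-space model of the two-pass
behaviour described above. It is only a sketch and not the kernel code: the
toy_* names, the pid constants and the struct are all hypothetical, and
toy_can_wait() assumes a wait4()-style lookup that walks only the caller's
children list. The orphan list from the patch is deliberately left out, since
whether the debugger's wait loop consults it is exactly the open question.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical pids, used only for this illustration. */
enum { REAL_PARENT_PID = 100, DEBUGGER_PID = 200, CHILD_PID = 300 };

/* Toy stand-in for struct proc: only the fields named in the thread. */
struct toy_proc {
	int  pid;
	int  ppid;	/* pid of the current parent */
	int  oppid;	/* original parent pid while traced, 0 otherwise */
	bool reaped;	/* set once the zombie is really destroyed */
};

/* Toy stand-in for pfind(): every nonzero pid counts as found. */
bool
toy_pfind(int pid)
{
	return (pid != 0);
}

/*
 * Models the quoted clause: while oppid is set, the first call hands
 * the zombie back to the original parent instead of destroying it.
 * Only a later call, with oppid == 0, really reaps.
 */
void
toy_proc_reap(struct toy_proc *p)
{
	assert(!p->reaped);	/* reaping twice would be a bug */
	if (p->oppid && toy_pfind(p->oppid)) {
		p->ppid = p->oppid;
		p->oppid = 0;
		return;
	}
	p->reaped = true;
}

/* Assumed lookup: a waiter sees only unreaped processes it parents. */
bool
toy_can_wait(const struct toy_proc *p, int caller_pid)
{
	return (!p->reaped && p->ppid == caller_pid);
}

/* A traced zombie: parented to the debugger, remembers its real parent. */
struct toy_proc
toy_traced_zombie(void)
{
	struct toy_proc p = { CHILD_PID, DEBUGGER_PID, REAL_PARENT_PID, false };

	return (p);
}
```

Calling toy_proc_reap() once on the result of toy_traced_zombie() stands for
the spurious wait4() from the real parent. Under the assumption above,
toy_can_wait() then fails for DEBUGGER_PID and succeeds for REAL_PARENT_PID,
and only a second toy_proc_reap() sets reaped.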