On Wed, 08 Nov 2006 07:27:39 -0800, Gonzalo Tornaria <[EMAIL PROTECTED]> wrote:
> On Wed, Nov 08, 2006 at 02:23:18AM -0800, William Stein wrote:
>> Good question.  Long-term there are a couple of issues:
>>
>> (1) How do you tell the monitor about new processes that get spawned?
>>     You could put that info in a temp file, but that feels a little
>>     clunky.
>
> You can read from a pipe or some other form of IPC (e.g. a unix socket).

Yes, that could work.

>> (3) I want to continue to support the spawned processes running on
>>     other computers (or as different users) via ssh.  With a separate
>>     monitor for each spawned process this is possible (though two ssh
>>     sessions would be needed).  This isn't possible if there is only
>>     one monitor, since it can only run on one computer.
>
> Just run one monitor for each host.

OK, but then what about question 1 again?  In particular, telling a
monitor over the network about new processes would be complicated.

>> (4) For reasons I don't understand, the slave process doesn't really
>>     die until the monitor exits.  If in the monitor script, instead of
>>     doing a sys.exit(0) after the kill, I keep the monitor running,
>>     then the process the monitor is watching doesn't terminate as it
>>     should.  This is on OS X Intel, and is rather odd, but isn't an
>>     issue with the one-monitor-per-process model.
>
> See wait(2) and waitpid(2) (and also wait3 and wait4 for resource
> information).
>
> Essentially, when a process dies, it stays in "zombie" state so that
> one can (a) get its exit status, (b) get resource usage information,
> and (c) dump core, IIRC.
>
> The usual trick to spawn e.g. a daemon is to fork / setsid(2) / fork,
> run the process in question as a grandchild, and let the child die;
> because of the setsid(2) call, the process is not adopted by its
> grandparent but by the init process, which is supposed to clean up on
> exit of any process.  [See also setsid(8).]
>
> Since we are talking about a monitor, the sensible thing is for the
> monitor to wait for all its subprocesses.
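(As a side note, the wait(2)/zombie behavior described above is easy to see in a few lines of Python; this is just a toy illustration, not part of the monitor script:)

```python
import os

# Toy illustration of wait(2)/waitpid(2): a child that has exited
# stays in "zombie" state until its parent collects the exit status.
pid = os.fork()
if pid == 0:
    # Child: exit immediately with a known status.
    os._exit(7)

# Parent: reap the child.  waitpid clears the zombie and returns the
# status word, which os.WEXITSTATUS decodes.  (os.wait4 would instead
# return the status together with a resource-usage struct.)
reaped, status = os.waitpid(pid, 0)
assert reaped == pid and os.WIFEXITED(status)
print(os.WEXITSTATUS(status))  # prints 7
```
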
One point that might not have been clear from my previous posting is
that the monitor does not have any subprocesses.  The gap/gp/magma,
etc., process that it monitors is a sibling rather than a subprocess.

> In addition, the monitor can get information about resource usage,
> which could be interesting.

Yes.

>> (5) The overhead is minimal -- it really is only 2MB to run a minimal
>>     Python process.
>
> However small, it's still O(n).

Yes, but for a typical running SAGE program n is about 3-4 at most.
There's no reason in SAGE to launch numerous subprocesses.

> BTW, isn't it better to kill -15 first, wait some time, then kill -9
> (to give a chance to clean up in case there is a SIGTERM handler)?

Yes.  Good point.  Many thanks for your email.

Anyway, this process monitor thing is a completely general-purpose
unix tool.  A priori, it really has nothing to do with SAGE.  Most of
the suggestions on the list are to turn it from what I wrote into a
generic daemon.  I wonder -- has such a generic daemon for process
monitoring *already* been written, and I just don't know about it?

William

--~--~---------~--~----~------------~-------~--~----~
To post to this group, send email to sage-devel@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at
http://groups.google.com/group/sage-devel
URLs: http://sage.scipy.org/sage/ and
http://modular.math.washington.edu/sage/
-~----------~----~----~----~------~----~------~--~---
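P.S.  The kill -15 / kill -9 escalation Gonzalo suggests could be sketched roughly like this in Python (the function name, grace period, and polling interval are illustrative, not taken from the actual monitor script):

```python
import errno
import os
import signal
import time

def terminate(pid, grace=5.0, poll=0.1):
    """Send SIGTERM, give the process `grace` seconds to clean up,
    then SIGKILL it if it is still alive.  Sketch of the kill -15 /
    kill -9 sequence; all parameters here are illustrative."""
    try:
        os.kill(pid, signal.SIGTERM)
    except OSError:
        return  # process already gone
    deadline = time.time() + grace
    while time.time() < deadline:
        try:
            # Signal 0 only probes for existence; it delivers nothing.
            # Caveat: an unreaped zombie *child* of this process still
            # answers the probe, so this check is only meaningful for
            # a sibling process (as in the monitor setup above).
            os.kill(pid, 0)
        except OSError:
            return  # process exited within the grace period
        time.sleep(poll)
    try:
        os.kill(pid, signal.SIGKILL)
    except OSError:
        pass  # it died between the last probe and now
```
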