On Wed, Nov 08, 2006 at 02:23:18AM -0800, William Stein wrote:
> Good question.  Longterm there are a couple of issues:
>
> (1) How do you tell the monitor about new processes that get spawned?
>     You could put that info in a temp file, but that feels a little
>     clunky.

You can read from a pipe or some other form of IPC (e.g. a Unix socket).

> (3) I want to continue to support the spawned processes running on other
>     computers (or as different users) via ssh.  With a separate monitor
>     for each spawned process this is possible (though two ssh sessions
>     would be needed).  This isn't possible if there is only one monitor,
>     since it can only run on one computer.

Just run one monitor for each host.

> (4) For reasons I don't understand, the slave process doesn't really die
>     until the monitor exits.  If in the monitor script instead of doing
>     a sys.exit(0) after the kill, I continue the monitor running, then the
>     process the monitor is watching doesn't terminate as it should.  This
>     is on OS X Intel, and is rather odd, but isn't an issue with the
>     1-monitor per process model.

See wait(2) and waitpid(2) (and also wait3 and wait4 for resource
information).  Essentially, when a process dies, it stays in "zombie"
state until its parent waits for it, so that the parent can (a) get the
exit status, (b) get resource usage information, (c) dump core IIRC, etc.

The usual trick to spawn e.g. a daemon is to fork / setsid(2) / fork, run
the process in question as a grandchild, and let the child die.  Once the
child is gone, the grandchild is adopted not by its grandparent but by
the init process, which is supposed to clean up after any process that
exits; the setsid(2) call detaches it from the original session.
[ See also setsid(8) ]

Since we are talking about a monitor, the sensible thing is for the
monitor to wait for all its subprocesses.  In addition, the monitor can
get information about resource usage, which could be interesting.

> (5) The overhead is minimal -- it really is only 2MB to run a minimal
>     Python process.

However small, it's still O(n).
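
To make the wait(2) point concrete, here is a rough Python sketch of a
monitor that reaps its own child with os.wait4, which returns the exit
status and the resource usage in one call.  The helper names spawn and
reap are made up for illustration; this is not what the SAGE monitor
does today, just one way it could look, assuming the monitor is the
direct parent of the process it watches.

```python
import os
import sys


def spawn(argv):
    """Start argv as a child without waiting for it; return its pid.

    (Hypothetical helper, not part of any existing monitor script.)
    """
    return os.spawnv(os.P_NOWAIT, argv[0], argv)


def reap(pid):
    """Block until child `pid` exits; return (exit_code, rusage).

    Waiting is what removes the zombie entry.  A negative exit_code
    means the child was killed by that signal number.
    """
    _, status, rusage = os.wait4(pid, 0)
    if os.WIFSIGNALED(status):
        code = -os.WTERMSIG(status)
    else:
        code = os.WEXITSTATUS(status)
    return code, rusage


if __name__ == "__main__":
    # Spawn two short-lived children and collect them one by one.
    pids = [spawn([sys.executable, "-c", "raise SystemExit(%d)" % n])
            for n in (0, 3)]
    for pid in pids:
        code, usage = reap(pid)
        print(pid, code, usage.ru_utime, usage.ru_maxrss)
```

If the monitor keeps running after the kill, a call like reap(pid) is
what makes the dead process actually disappear from the process table.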

BTW, isn't it better to kill -15 first, wait some time, then kill -9
(to give the process a chance to clean up in case there is a SIGTERM
handler)?

Best,
Gonzalo
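
P.S. A rough sketch of the kill -15 / wait / kill -9 sequence in Python,
assuming the monitor is the parent of the process (so it can waitpid on
it).  The function name terminate and the grace and poll parameters are
my own invention for illustration, not existing code.

```python
import os
import signal
import sys
import time


def terminate(pid, grace=5.0, poll=0.1):
    """Send SIGTERM to child `pid`; escalate to SIGKILL after `grace` seconds.

    Returns the raw wait status.  `pid` must be a child of the caller.
    (Hypothetical helper; the timing values are arbitrary defaults.)
    """
    os.kill(pid, signal.SIGTERM)
    deadline = time.monotonic() + grace
    while time.monotonic() < deadline:
        # WNOHANG: returns (0, 0) while the child is still running.
        done, status = os.waitpid(pid, os.WNOHANG)
        if done == pid:
            return status
        time.sleep(poll)
    # The child ignored or is still handling SIGTERM: force it.
    os.kill(pid, signal.SIGKILL)
    _, status = os.waitpid(pid, 0)
    return status


if __name__ == "__main__":
    child = os.spawnv(os.P_NOWAIT, sys.executable,
                      [sys.executable, "-c", "import time; time.sleep(30)"])
    status = terminate(child)
    print("killed by signal", os.WTERMSIG(status))
```

Since terminate also waits for the child, it takes care of the zombie
at the same time.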