I figured it out. The problem was somewhere else completely: in Python itself. I copied the patches from the FreeBSD port of Python, and so far it is looking good.

On 04/21/2012 05:20 PM, Volker Braun wrote:
I guess the os.wait() throws an OSError with errno = EINTR because the
SIGALRM fires? If that is the case we should probably ignore it in
use_fork.py
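
As a rough illustration of what "ignoring" EINTR could look like, here is a minimal sketch. The helper name retry_on_eintr is hypothetical and is not taken from use_fork.py, and whether a blind retry is appropriate there depends on how the timeout is meant to work. Python 3.5 and later already retry most interrupted system calls on their own (PEP 475), so this matters mainly on older interpreters.

```python
import errno
import os


def retry_on_eintr(func, *args):
    """Call func(*args), retrying while it fails with errno EINTR.

    Hypothetical helper for illustration only. Any OSError other
    than EINTR is re-raised unchanged.
    """
    while True:
        try:
            return func(*args)
        except OSError as exc:
            if exc.errno != errno.EINTR:
                raise
            # Interrupted by a signal (for example SIGALRM): try again.


# Possible use in place of a bare os.wait() call:
#     pid, status = retry_on_eintr(os.wait)
```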





On Saturday, April 21, 2012 3:36:44 PM UTC-4, Stephen Montgomery-Smith wrote:

    I did some research, and I found that you are completely correct. I
    don't know why FreeBSD is misbehaving, because the os.wait command
    definitely triggers the OSError exception. signal.getsignal(SIGCHLD)
    shows that it is correctly set to signal.SIG_DFL. So I am mystified.

    Thank you for setting me straight.


    On 04/21/2012 08:58 AM, Volker Braun wrote:
     > I don't have a FreeBSD box to try this on, but note that children
     > shouldn't just disappear when they finish. They remain as zombies
     > (<defunct> in ps) until they are collected by os.wait()
     >
     >
     >
     > On Saturday, April 21, 2012 12:39:13 AM UTC-4, Stephen Montgomery-Smith wrote:
     >
     > In parallel/use_fork.py, around line 120, there seems to be a race
     > problem. This seems to be a genuine problem with my FreeBSD build:
     > sage -t -force_lib "devel/sage/sage/parallel/decorate.py"
     > goes into a permanent loop.
     >
     > Suppose that timeout has been set, and the forked process is too
     > slow. So then SIGALRM is sent by the alarm before the
     > "os.wait()[0]" is completed. But now suppose that after the alarm
     > has triggered, the forked process manages to finish before getting
     > to the "os.kill(pid,9)" command. Then "len(workers)" is still
     > positive, so the while loop continues. But now the "os.wait()[0]"
     > command has nothing to wait for (since it should be waiting for
     > "pid" to die), and everything ends up stalling.
     >
     > Also if you do an os.kill command on a non-existent pid, you get an
     > OSError.
     >
     > I really don't understand the full logic of this code, otherwise I
     > would give you guys a patch.
     >
     > --
     > To post to this group, send an email to sage-devel@googlegroups.com
     > To unsubscribe from this group, send an email to
     > sage-devel+unsubscr...@googlegroups.com
     > For more options, visit this group at
     > http://groups.google.com/group/sage-devel
     > URL: http://www.sagemath.org
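
The race described in the original report can be sketched as follows. This is only an illustration under assumptions: kill_and_reap is a hypothetical helper, not the code in use_fork.py. It tolerates the child having already gone away (ESRCH from os.kill, ECHILD from os.waitpid) and waits for the specific pid rather than for any child.

```python
import errno
import os
import signal


def kill_and_reap(pid):
    """Kill child `pid` if it still exists, then collect it.

    Hypothetical helper for illustration only. Returns True if the
    child was collected by this call, False if it was already gone.
    """
    try:
        os.kill(pid, signal.SIGKILL)
    except OSError as exc:
        if exc.errno != errno.ESRCH:  # ESRCH: no such process
            raise
    try:
        os.waitpid(pid, 0)  # wait for this child only
    except OSError as exc:
        if exc.errno != errno.ECHILD:  # ECHILD: no such child to wait for
            raise
        return False
    return True
```

Because a finished child stays a zombie until it is collected, os.kill on it normally succeeds and os.waitpid then returns immediately. ESRCH should only show up if the pid has already been reaped.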
