I don't have a FreeBSD box to try this on, but note that child processes
shouldn't just disappear when they finish. They remain as zombies
(<defunct> in ps) until they are collected by os.wait().
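
As a rough illustration of that behaviour (a sketch only, written for Python 3 on a POSIX system; zombie_demo is a made-up name and nothing here comes from use_fork.py), the parent below waits for the child to exit without reaping it, signals the resulting zombie, and only then collects it:

```python
import os


def zombie_demo():
    """Fork a child that exits at once and report what the parent observes."""
    pid = os.fork()
    if pid == 0:
        os._exit(7)  # child: exit immediately with a recognisable status

    # Block until the child has exited, but leave it unreaped (WNOWAIT),
    # so from here on it is a zombie.
    os.waitid(os.P_PID, pid, os.WEXITED | os.WNOWAIT)

    # The zombie still occupies its pid, so signalling it should not raise.
    try:
        os.kill(pid, 9)
        kill_before_wait_ok = True
    except OSError:
        kill_before_wait_ok = False

    # Reaping returns the child's original exit status and frees the pid.
    reaped_pid, status = os.waitpid(pid, 0)
    exit_code = os.WEXITSTATUS(status) if os.WIFEXITED(status) else None

    # After the wait the pid is gone (assuming it has not been reused yet).
    try:
        os.kill(pid, 0)
        kill_after_wait_ok = True
    except OSError:
        kill_after_wait_ok = False

    return reaped_pid == pid, kill_before_wait_ok, exit_code, kill_after_wait_ok
```

If this holds on FreeBSD as well, a child that finishes just before os.kill(pid, 9) should still be there for both the kill and the following wait.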



On Saturday, April 21, 2012 12:39:13 AM UTC-4, Stephen Montgomery-Smith 
wrote:
>
> In parallel/use_fork.py, around line 120, there seems to be a race 
> problem.  This seems to be a genuine problem with my FreeBSD build: 
> sage -t  -force_lib "devel/sage/sage/parallel/decorate.py" 
> goes into a permanent loop. 
>
> Suppose that timeout has been set, and the forked process is too slow. 
> So then SIGALRM is sent by the alarm before the "os.wait()[0]" is 
> completed.  But now suppose that after the alarm has triggered, the 
> forked process manages to finish before getting to the "os.kill(pid,9)" 
> command.  Then "len(workers)" is still positive, so the while loop 
> continues.  But now the "os.wait()[0]" command has nothing to wait for 
> (since it should be waiting for "pid" to die), and everything ends up 
> stalling. 
>
> Also if you do an os.kill command on a non-existent pid, you get an 
> OSError. 
>
> I really don't understand the full logic of this code, otherwise I would 
> give you guys a patch. 
>
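
If the pid really can be gone by the time the kill happens, a defensive version of the kill-then-wait step might look like the sketch below. kill_and_reap is a hypothetical helper, not the code in use_fork.py, and it assumes Python 3 on a POSIX system. It ignores only the "no such process" and "no such child" errors and lets anything else propagate:

```python
import errno
import os
import signal


def kill_and_reap(pid):
    """Hypothetical helper: kill pid if it exists, then reap it if it is our child.

    Returns (killed, reaped) so the caller can see which steps took effect.
    """
    try:
        os.kill(pid, signal.SIGKILL)
        killed = True
    except OSError as e:
        if e.errno != errno.ESRCH:  # ESRCH: no such process
            raise
        killed = False

    try:
        os.waitpid(pid, 0)  # wait for this specific child only
        reaped = True
    except OSError as e:
        if e.errno != errno.ECHILD:  # ECHILD: not our child, or already reaped
            raise
        reaped = False

    return killed, reaped
```

Waiting on the specific pid rather than on any child means the call cannot block on an unrelated worker.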
