On Wed, Nov 30, 2016 at 4:12 PM, duncan smith <duncan@invalid.invalid> wrote:
> On 30/11/16 17:57, Chris Angelico wrote:
>> On Thu, Dec 1, 2016 at 4:34 AM, duncan smith <duncan@invalid.invalid> wrote:
>>>
>>> def _execute(command):
>>>     # shell=True security hazard?
>>>     p = subprocess.Popen(command, shell=True, stdin=subprocess.PIPE,
>>>                          stdout=subprocess.PIPE,
>>>                          stderr=subprocess.STDOUT,
>>>                          close_fds=True)
>>>     output = p.stdout.read()
>>>     p.stdin.close()
>>>     p.stdout.close()
>>>     #p.communicate()
>>>     if output:
>>>         print output
>>
>> Do you ever wait() these processes? If not, you might be leaving a
>> whole lot of zombies behind, which will eventually exhaust your
>> process table.
>>
>> ChrisA
>>
>
> No. I've just called this several thousand times (via calls from a
> higher level function) and had no apparent problem. Top reports no
> zombie tasks, and memory consumption and the number of sleeping tasks
> seem to be reasonably stable. I'll try running the code that generated
> the error to see if I can coerce it into failing again. OK, no error
> this time. Great, an intermittent bug that's hard to reproduce ;-). At
> the end of the day I just want to invoke dot to produce an image file
> (perhaps many times). Is there perhaps a simpler and more reliable way
> to do this? Or do I just add the p.wait()? (The commented out
> p.communicate() is there from a previous, failed attempt to fix this -
> as, I think, are the shell=True and close_fds=True.) Cheers.

That would appear to rule out the most common issues I would think of.
That said, are these calls being made in a tight loop (the full call
stack implies it might be a physics simulation)? Are you doing any
threading (either in Python or when making the calls to Python - using a
bash command to start new processes without waiting counts)? Is there
any exception handling at a higher level that might be continuing past
the error and sometimes allowing a zombie process to stay around?

If you are making a bunch of calls in a tight loop, that could be your
issue, especially as you are not waiting on the process (though
communicate() does so implicitly, and thus should have fixed the issue).
This could be intermittent if the processes sometimes complete quickly
and at other times are delayed. In those cases, a ton of dot processes
(plus a shell for each, with shell=True) could be created before any of
them finish, causing massive resource usage. Some of the processes may
be hanging rather than outright crashing, and thus leaking some
resources.

BTW, the docstring in to_image implies that the shell=True is not an
attempted fix for this - the example 'unflatten -l 3 | dot' explicitly
suggests the use of shell=True.
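
For what it's worth, here is a minimal sketch of how I would write the helper so that every child is waited on. It is untested against your code: run_command and its input_data parameter are names I made up, and I am assuming Python 3 (where the data passed to communicate() must be bytes). Treat it as an illustration, not a drop-in replacement for _execute.

```python
import subprocess


def run_command(command, input_data=None):
    """Run command, wait for it to exit, return (returncode, output).

    Hypothetical helper, not the original _execute. A string is run
    through the shell (needed for pipelines such as
    'unflatten -l 3 | dot'); a list is run directly without a shell.
    """
    use_shell = isinstance(command, str)
    p = subprocess.Popen(command, shell=use_shell,
                         stdin=subprocess.PIPE,
                         stdout=subprocess.PIPE,
                         stderr=subprocess.STDOUT)
    # communicate() sends input_data (if any), closes stdin, reads
    # stdout to EOF and then waits for the child, so the process is
    # reaped before we return and cannot linger as a zombie.
    output, _ = p.communicate(input_data)
    return p.returncode, output
```

Returning the exit status as well as the output means a failing or hanging dot shows up at the call site instead of being silently ignored. Passing a list instead of a string skips the extra shell process whenever you do not need a pipeline.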