On 01/12/16 00:46, Chris Kaynor wrote:
> On Wed, Nov 30, 2016 at 4:12 PM, duncan smith <duncan@invalid.invalid> wrote:
>> On 30/11/16 17:57, Chris Angelico wrote:
>>> On Thu, Dec 1, 2016 at 4:34 AM, duncan smith <duncan@invalid.invalid> wrote:
>>>>
>>>> def _execute(command):
>>>>     # shell=True security hazard?
>>>>     p = subprocess.Popen(command, shell=True, stdin=subprocess.PIPE,
>>>>                          stdout=subprocess.PIPE,
>>>>                          stderr=subprocess.STDOUT,
>>>>                          close_fds=True)
>>>>     output = p.stdout.read()
>>>>     p.stdin.close()
>>>>     p.stdout.close()
>>>>     #p.communicate()
>>>>     if output:
>>>>         print output
>>>
>>> Do you ever wait() these processes? If not, you might be leaving a
>>> whole lot of zombies behind, which will eventually exhaust your
>>> process table.
>>>
>>> ChrisA
>>>
>>
>> No. I've just called this several thousand times (via calls from a
>> higher level function) and had no apparent problem. Top reports no
>> zombie tasks, and memory consumption and the number of sleeping tasks
>> seem to be reasonably stable. I'll try running the code that generated
>> the error to see if I can coerce it into failing again. OK, no error
>> this time. Great, an intermittent bug that's hard to reproduce ;-). At
>> the end of the day I just want to invoke dot to produce an image file
>> (perhaps many times). Is there perhaps a simpler and more reliable way
>> to do this? Or do I just add the p.wait()? (The commented out
>> p.communicate() is there from a previous, failed attempt to fix this -
>> as, I think, are the shell=True and close_fds=True.) Cheers.
>
> That would appear to rule out the most common issues I would think of.
>
> That said, are these calls being done in a tight loop (the full
> call-stack implies it might be a physics simulation)? Are you doing
> any threading (either in Python or when making the calls to Python -
> using a bash command to start new processes without waiting counts)?
> Is there any exception handling at a higher level that might be
> continuing past the error and sometimes allowing a zombie process to
> stay?
>

In this case the calls *are* in a loop (record linkage using an
expectation maximization algorithm).

> If you are making a bunch of calls in a tight loop, that could be your
> issue, especially as you are not waiting on the process (though the
> communicate does so implicitly, and thus should have fixed the issue).
> This could be intermittent if the processes sometimes complete
> quickly, and other times are delayed. In these cases, a ton of the dot
> processes (and shell with shell=true) could be created before any
> finish, thus causing massive usage. Some of the processes may be
> hanging, rather than outright crashing, and thus leaking some
> resources.
>

I'll try the p.communicate thing again. The last time I tried it I might
have already got myself into a situation where launching more
subprocesses was bound to fail. I'll edit the code, launch IDLE again
and see if it still happens.

> BTW, the docstring in to_image implies that the shell=True is not an
> attempted fix for this - the example 'unflatten -l 3 | dot' is
> explicitly suggesting the usage of shell=True.
>

OK. As you can see, I don't really understand what's happening under the
hood :-). Cheers.

Duncan
-- 
https://mail.python.org/mailman/listinfo/python-list
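
For reference, a minimal sketch of the communicate()-based variant
discussed above. It is not the code from the original module: the name
`execute` and the `input_data` parameter are assumptions made for
illustration, and shell=True is kept only because the quoted docstring
example ('unflatten -l 3 | dot') needs a shell to set up the pipe.

```python
import subprocess


def execute(command, input_data=None):
    """Run `command` through the shell and wait for it to finish.

    Returns (returncode, output), where output is stdout and stderr
    combined, as bytes. `execute` and `input_data` are hypothetical
    names, not taken from the thread.
    """
    p = subprocess.Popen(command, shell=True,
                         stdin=subprocess.PIPE,
                         stdout=subprocess.PIPE,
                         stderr=subprocess.STDOUT)
    # communicate() sends input_data (if any), closes stdin, reads the
    # output to EOF and then waits for the child, so the process is
    # reaped before this function returns.
    output, _ = p.communicate(input_data)
    return p.returncode, output
```

Called in a loop, for example as
execute("dot -Tpng graph.dot -o graph.png") with hypothetical file
names, each call should return only after that dot process has exited,
so child processes should not accumulate between iterations. Checking
the returned code also makes a failed dot run visible instead of
silent.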