In article <8bec27dd-b1da-4aa3-81e8-9665db040...@n40g2000vbb.googlegroups.com>
>'Nobody' (clearly a misnomer!) and Chris, thanks for your excellent
>explanations about garbage collection. (Chris, I believe you must have
>spent more time looking at the subprocess source and writing your
>response than I have spent writing my code.)

Well, I just spent a lot of time looking at the code earlier this
week as I was thinking about using it in a program that is required
to be "highly reliable" (i.e., to never lose data, even if Things Go
Wrong, like disks get full and sub-commands fail).

(Depending on shell version, "set -o pipefail" can allow "cheating"
here, i.e., with subprocess, using shell=True and commands that
have the form "a | b":

    $ (exit 0) | (exit 2) | (exit 0)
    $ echo $?
    0
    $ set -o pipefail
    $ (exit 0) | (exit 2) | (exit 0)
    $ echo $?
    2

but -o pipefail is not POSIX and I am not sure I can count on it.)

>GC is clearly at the heart of my lack of understanding on this
>point. It sounds like, from what Chris said, that *any* file
>descriptor would be closed when GC occurs if it is no longer
>referenced, subprocess-related or not.

Yes -- but, as noted elsethread, "delayed" failures from events
like "disk is full, can't write last bits of data" become
problematic.

>It sounds to me that, although my code might be safe now as is, I
>probably need to do an explicit p.stdXXX.close() myself for any pipes
>which I open via Popen() as soon as I am done with them.

Or, use the p.communicate() function, which contains the explicit
close.  Note that if you are using a unidirectional pipe and do
your own I/O -- as in your example -- calling p.communicate()
will just do the one attempt to read from the pipe and then close
it, so you can ignore the result:

    import subprocess
    p = subprocess.Popen(["cat", "/etc/motd"], stdout=subprocess.PIPE)
    for line in p.stdout:
        print line.rstrip()
    p.communicate()

The last call returns ('', None) (note: not ('', '') as I suggested
earlier, I actually typed this one in on the command line).

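For anyone trying this on Python 3, here is a minimal sketch of the
same read-it-yourself-then-communicate() pattern.  The helper name
read_then_communicate and the stand-in child command are my own
inventions for illustration, not anything subprocess provides.  On
Python 3 the pipe yields bytes, so the leftover result should be
(b'', None) rather than ('', None):

```python
import subprocess
import sys


def read_then_communicate(cmd):
    """Read a child's stdout ourselves, then let communicate() clean up."""
    p = subprocess.Popen(cmd, stdout=subprocess.PIPE)
    # Do our own I/O on the pipe, one line at a time, until EOF.
    lines = [line.rstrip() for line in p.stdout]
    # The pipe is already at EOF, so communicate() reads nothing more;
    # it closes the pipe and waits for the child to exit.
    leftover = p.communicate()
    return lines, leftover, p.stdout.closed, p.returncode


if __name__ == "__main__":
    # Hypothetical stand-in for "cat /etc/motd" so the sketch runs anywhere.
    child = [sys.executable, "-c", "print('hello'); print('world')"]
    lines, leftover, closed, status = read_then_communicate(child)
    print(lines, leftover, closed, status)
```

The point of returning p.stdout.closed and p.returncode is only to
make the explicit close and the wait visible without needing strace.
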
Run python with strace and you can observe the close call happen --
this is the [edited to fit] output after entering the p.communicate()
line:

    read(0, "\r", 1) = 1
    write(1, "\n", 1
    ) = 1
    rt_sigprocmask(SIG_BLOCK, [INT], [], 8) = 0
    ioctl(0, SNDCTL_TMR_TIMEBASE or TCGETS, {B38400 opost ...}) = 0
    ioctl(0, SNDCTL_TMR_STOP or TCSETSW, {B38400 opost ...}) = 0
    ioctl(0, SNDCTL_TMR_TIMEBASE or TCGETS, {B38400 opost ...}) = 0

[I push "enter", readline echoes a newline and does tty ioctl()s]

    rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
    rt_sigaction(SIGWINCH, {SIG_DFL}, {0xb759ed10, [], SA_RESTART}, 8) = 0
    time(NULL) = 1287075471

[no idea what these are really for, but the signal manipulation
appears to be readline()]

    fstat64(3, {st_mode=S_IFIFO|0600, st_size=0, ...}) = 0
    _llseek(3, 0, 0xbf80d490, SEEK_CUR) = -1 ESPIPE (Illegal seek)
    read(3, "", 8192) = 0
    close(3) = 0

[fd 3 is the pipe reading from "cat /etc/motd" -- no idea what the
fstat64() and _llseek() are for here, but the read() and close() are
from the communicate() function]

    waitpid(13775, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0) = 13775

[this is from p.wait()]

    write(1, "(\'\', None)\n", 11('', None)
    ) = 11

[this is the result being printed, and the rest is presumably
readline() again]

    ioctl(0, SNDCTL_TMR_TIMEBASE or TCGETS, {B38400 opost ...}) = 0
    ioctl(1, SNDCTL_TMR_TIMEBASE or TCGETS, {B38400 opost ...}) = 0
    rt_sigprocmask(SIG_BLOCK, [INT], [], 8) = 0
    ioctl(0, TIOCGWINSZ, {ws_row=44, ws_col=80, ...}) = 0
    ioctl(0, TIOCSWINSZ, {ws_row=44, ws_col=80, ...}) = 0
    ioctl(0, SNDCTL_TMR_TIMEBASE or TCGETS, {B38400 opost ...}) = 0
    ioctl(0, SNDCTL_TMR_TIMEBASE or TCGETS, {B38400 opost ...}) = 0
    ioctl(0, SNDCTL_TMR_STOP or TCSETSW, {B38400 opost ...}) = 0
    ioctl(0, SNDCTL_TMR_TIMEBASE or TCGETS, {B38400 opost ...}) = 0
    rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
    rt_sigaction(SIGWINCH, {0xb759ed10, [], SA_RESTART}, {SIG_DFL}, 8) = 0
    write(1, ">>> ", 4>>> ) = 4
    select(1, [0], NULL, NULL, NULL

>On a related point here, I have
>one case where I need to replace the
>shell construct
>
>   externalprog <somefile >otherfile
>
>I suppose I could just use os.system() here but I'd rather keep the
>Unix shell completely out of the picture (which is why I am moving
>things to Python to begin with!), so I'm just doing a simple open() on
>somefile and otherfile and then passing those file handles into
>Popen() for stdin and stdout. I am already closing those open()ed file
>handles after the child completes, but I suppose that I probably
>should also explicitly close Popen's p.stdin and p.stdout, too. (I'm
>guessing they might be dup()ed from the original file handles?)

There is no dup()ing going on so this is not necessary, but again,
using the communicate function will close them for you.  In this
case, though, I am not entirely sure subprocess is the right hammer
-- it mostly will give you portability to Windows (well, plus the
magic for preexec_fn and reporting exec failure).

Once again, peeking at the source is the trick :-) ... the arguments
you provide for stdin, stdout, and stderr are used thus:

    if stdin is None:
        pass
    elif stdin == PIPE:
        p2cread, p2cwrite = os.pipe()
    elif isinstance(stdin, int):
        p2cread = stdin
    else:
        # Assuming file-like object
        p2cread = stdin.fileno()

(this is repeated for stdout and stderr) and the resulting integer
file descriptors (or None if not applicable) are passed to
os.fdopen() on the parent side.  (On the child side, the code does
the usual shell-like dance to move the appropriate descriptors to
0 through 2.)
-- 
In-Real-Life: Chris Torek, Wind River Systems
Salt Lake City, UT, USA (40°39.22'N, 111°50.29'W)  +1 801 277 2603
email: gmail (figure it out)      http://web.torek.net/torek/index.html
-- http://mail.python.org/mailman/listinfo/python-list