In S+ on Unix-alikes we dealt with this issue by using
fcntl(fd, F_SETFD, 1) to set the close-on-exec flag on a file
descriptor as soon as we opened it.

Bill Dunlap
TIBCO Software
wdunlap tibco.com

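For illustration, here is a minimal C sketch of the close-on-exec approach, assuming a POSIX system. The helper name open_cloexec is hypothetical and this is not the S+ or R source. It uses the symbolic constant FD_CLOEXEC in place of the literal 1 and preserves any other descriptor flags already set.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not from S+ or R): open a file and immediately
 * mark the descriptor close-on-exec, so that a later fork() + exec()
 * does not pass it on to the new program. Returns the fd, or -1. */
static int open_cloexec(const char *path, int flags, mode_t mode)
{
    int fd = open(path, flags, mode);
    if (fd < 0)
        return -1;

    /* Read the current descriptor flags and add FD_CLOEXEC to them. */
    int fdflags = fcntl(fd, F_GETFD);
    if (fdflags < 0 || fcntl(fd, F_SETFD, fdflags | FD_CLOEXEC) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

A caller would use open_cloexec("foo.txt", O_WRONLY | O_CREAT, 0644) wherever it would otherwise call open(). On systems that support it, passing O_CLOEXEC directly to open() sets the flag atomically and avoids the window between the two calls.
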
On Wed, Apr 19, 2017 at 8:40 PM, Winston Chang <winstoncha...@gmail.com> wrote:

> In addition to the issue of a child process holding onto open files, the
> child process can also manipulate a file descriptor in a way that affects
> the parent process. For example, calling lseek() in the child process will
> move the file offset in the parent process.
>
> Here is a set of commands that demonstrates it. They can be copied and
> pasted in a terminal. What it does:
> - Creates C program that seeks to the beginning of a file descriptor, and
>   compiles it to a program named "lseek".
> - Creates a file with some text in it.
> - Starts R. In R:
>   - Opens the text file and reads the first line.
>   - Runs lseek in a child process.
>   - Reads the rest of the lines.
>
>
> echo "#include <unistd.h>
> int main(void) {
>   lseek(3, 0, SEEK_SET);
> }" > lseek.c
>
> gcc lseek.c -o lseek
>
> echo "line 1
> line 2
> line 3" > lines.txt
>
> R
> f <- file('lines.txt', 'r')
> cat(readLines(f, n = 1), sep = "\n")
> system('./lseek')
> cat(readLines(f), sep = "\n")
>
>
> Here's what it outputs:
> > f <- file('lines.txt', 'r')
> > cat(readLines(f, n = 1), sep = "\n")
> line 1
> > system('./lseek')
> > cat(readLines(f), sep = "\n")
> line 2
> line 3
> line 1
> line 2
> line 3
>
> The child process has changed what the parent process reads from the file.
> (I'm guessing that the reason readLines() prints out "line 2" and "line 3"
> before starting over is because it has already buffered the whole file
> before lseek is executed.)
>
> This is obviously a highly contrived case, but it illustrates what's
> possible. The other issue I mentioned, with child processes holding open
> files after the R process exits, is more likely to cause problems in the
> real world. That's actually how I encountered this issue in the first
> place: when restarting R inside of RStudio on a Mac, if there are any
> extant child processes started by system(), they keep some files open, and
> this causes RStudio to hang. (There's a fix in progress for RStudio for
> this particular issue.)
>
> -Winston
>
>
> On Tue, Apr 18, 2017 at 3:20 PM, Winston Chang <winstoncha...@gmail.com> wrote:
>
>> It seems that the system() and system2() functions don't close file
>> descriptors between the fork() and exec() (on Unix platforms, of course).
>> This means that the child processes inherit open files and socket
>> connections.
>>
>> Running this (from a terminal) will result in the child process writing to
>> a file that was opened by R:
>>
>> R
>> f <- file('foo.txt', 'w')
>> system('echo "abc" >&3')
>>
>>
>> You can also see the open files if you run the following:
>> f <- file('foo.txt', 'w')
>> system2('sleep', '100', wait=F)
>>
>> And then in another terminal:
>> lsof -c R -c sleep
>> it will show that both the R and sleep processes have the file open:
>> ...
>> R      324 root  3w  REG  0,48  0  4259 /foo.txt
>> ...
>> sleep  327 root  3w  REG  0,48  0  4259 /foo.txt
>>
>>
>> This behavior can cause problems if R spawns a child process that outlives
>> the R process, but keeps open some resources.
>>
>> Would it be possible to add an option to close file descriptors for child
>> processes? It would be nice if that were the default, but I suspect that
>> making that change would break a lot of existing code.
>>
>> To take an example from the Python world, subprocess.Popen() has an
>> option, close_fds, which closes all file descriptors except 0, 1, and 2.
>> https://docs.python.org/2/library/subprocess.html#popen-constructor
>>
>>
>> -Winston

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
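
For reference, the close_fds-style behavior requested in the quoted message (closing everything except descriptors 0, 1, and 2 between fork() and exec()) could look roughly like the following in C. This is only a sketch under assumptions: run_with_closed_fds is a hypothetical helper, not R's system() implementation, and the 65536 cap on the descriptor scan is an arbitrary choice to keep the loop bounded.

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run argv[0] in a child process that keeps only
 * stdin, stdout, and stderr. Returns the child's exit status, or -1 if
 * the fork/wait failed or the child did not exit normally. */
static int run_with_closed_fds(char *const argv[])
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: close every descriptor above stderr before exec. */
        long max_fd = sysconf(_SC_OPEN_MAX);
        if (max_fd < 0 || max_fd > 65536)
            max_fd = 65536;   /* assumed cap, not a system constant */
        for (int fd = 3; fd < max_fd; fd++)
            close(fd);        /* failure on unopened descriptors is harmless */
        execvp(argv[0], argv);
        _exit(127);           /* reached only if exec failed */
    }

    /* Parent: wait for the child and report its exit status. */
    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

With this helper, the equivalent of system('echo "abc" >&3') from the quoted example should fail in the child with a bad-file-descriptor error rather than write to the parent's file, since descriptor 3 is no longer open when the shell starts.
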