Hello. When running under MinGW on Windows, there seems to be a bug in Gnulib's fdopendir implementation. Gnulib's fdopendir closes the file descriptor that was passed to it as an argument. It then tries to reopen the directory using the same file descriptor, but that doesn't seem to work in MinGW, and so the file descriptor remains closed after fdopendir returns.

Here is an example of some code that exhibits the bug:

    int fd = open("emptydir", O_RDONLY | O_DIRECTORY | O_NOCTTY | O_NONBLOCK);
    printf("dup(fd) = %d\n", dup(fd));
    fdopendir(fd);
    printf("dup(fd) = %d\n", dup(fd));

Under MinGW, the second call to dup fails and returns -1 because fdopendir closes the file descriptor.

Ultimately, I am trying to compile Grep 2.21 on Windows with MinGW so that I can have a good tool for searching files. (I can't find any recent version of Grep compiled for Windows which doesn't have extra dependencies.) The Grep binary I built does not work properly with the recursive options (-r and -R). I wrote about this on the Grep bug tracker here:

http://debbugs.gnu.org/cgi/bugreport.cgi?bug=16444#18

Grep uses the FTS implementation from Gnulib. Gnulib's FTS implementation uses fdopendir. After calling fdopendir, FTS tries to use the same file descriptor for other purposes, but since fdopendir already closed it, grep ends up printing a fatal error that says "Bad file descriptor". (This was "Bug A" in my post on the grep bug tracker.)

== How I reproduced the bug ==

I reproduced the bug by making a very simple C project that just uses Gnulib. The code does some operations on an empty directory named "emptydir", which is assumed to be inside the current working directory, and prints the results. You can get the source here:

http://www.davidegrayson.com/keep/gnulib_150315/test_gnulib-1.0.0-src.tar.gz

I used autoconf to generate a source distribution, which you can get here:

http://www.davidegrayson.com/keep/gnulib_150315/test_gnulib-1.0.0.tar.gz

The main source file (main.c) is attached to this message, and you can also read it here:

http://www.davidegrayson.com/keep/gnulib_150315/main.c

When I run the code in Arch Linux, it gives the expected output:

    $ ./src/testgnulib
    dup(fd) = 4
    dup(fd) = 5
    emptydir
    emptydir

When I run the code in Windows 8.1 64-bit, it gives bad results.
I compiled the code with MinGW (gcc.exe (i686-posix-dwarf-rev1, Built by MinGW-W64 project) 4.9.2) and a patched version of make (3.82-pololu2 from https://github.com/pololu/make). Most of the other utilities on my PATH in Windows come from Msysgit. I ran "./configure" in Git Bash. Then I invoked "make" with the following command, because CreateProcess does not recognize paths like /c/git/bin/mkdir:

    make MKDIR_P='mkdir -p' SED=sed

When I run the resulting executable in Windows, it gives this output:

    $ ./src/testgnulib.exe
    dup(fd) = 4
    dup(fd) = -1
    emptydir
    emptydir: error: Bad file descriptor

This shows that the file descriptor was duplicable before fdopendir was called, but not duplicable afterwards (because fdopendir closed the descriptor). It also shows the resulting problem in FTS, where FTS returns an error when it tries to traverse the empty directory.

If I comment out the "close (fd);" line in fdopendir.c, then the program behaves as expected under MinGW, returning the same output that it did in Arch Linux.

== Discussion of the bug ==

The documentation for fdopendir that I was reading can be found here:

http://pubs.opengroup.org/onlinepubs/9699919799/functions/fdopendir.html
https://www.gnu.org/software/gnulib/manual/html_node/fdopendir.html

The prototype of fdopendir is:

    DIR *fdopendir(int fd);

It takes a file descriptor representing an opened directory, and creates a DIR pointer representing that directory, to be used by the dirent system.

Neither the POSIX documentation nor the Gnulib documentation for fdopendir mentions that the file descriptor might get closed by fdopendir, so it seems like a bug for that to happen. The Gnulib documentation says that its fdopendir "does not guarantee that 'dirfd(fdopendir(n))==n'". And indeed, when I call dirfd(fdopendir(fd)) under MinGW it returns -1.

The POSIX documentation for fdopendir says that when closedir() is called on the returned pointer, the file descriptor (fd) shall be closed.
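
The three properties discussed above can be probed with a small program. This is only a sketch, assuming a POSIX-like <dirent.h> and <fcntl.h> (or Gnulib's replacements); check_fdopendir_contract, fd_is_open and the flag names are made up for illustration and are not part of Gnulib or POSIX.

```c
#define _POSIX_C_SOURCE 200809L
#include <dirent.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical flags describing what was observed. */
enum {
    FD_OPEN_AFTER_FDOPENDIR  = 1,  /* fd could still be dup'ed after fdopendir */
    DIRFD_MATCHES_FD         = 2,  /* dirfd(fdopendir(fd)) == fd */
    FD_CLOSED_AFTER_CLOSEDIR = 4   /* fd could no longer be dup'ed after closedir */
};

/* Returns 1 if fd can be duplicated (the same probe main.c uses), else 0. */
static int fd_is_open(int fd)
{
    int copy = dup(fd);
    if (copy < 0)
        return 0;
    close(copy);
    return 1;
}

/* Returns a bitmask of the flags above, or -1 if the directory
   could not be opened or fdopendir failed. */
static int check_fdopendir_contract(const char *path)
{
    int fd = open(path, O_RDONLY | O_DIRECTORY);
    if (fd < 0)
        return -1;

    DIR *dir = fdopendir(fd);
    if (dir == NULL) {
        close(fd);
        return -1;
    }

    int result = 0;
    if (fd_is_open(fd))
        result |= FD_OPEN_AFTER_FDOPENDIR;
    if (dirfd(dir) == fd)
        result |= DIRFD_MATCHES_FD;

    closedir(dir);
    if (!fd_is_open(fd))
        result |= FD_CLOSED_AFTER_CLOSEDIR;
    return result;
}
```

With a native fdopendir such as glibc's on Linux, I would expect all three flags to be set. On the MinGW build described here, the first two would be missing, since dup fails after fdopendir and dirfd returns -1. The last flag corresponds to the POSIX requirement that closedir() closes fd.
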
That means that somehow, some information about fd has to be associated with the returned DIR pointer.

I looked at the source code of Gnulib's fdopendir and tried to figure out what is going on, but I can't say I totally understand it. Gnulib's fdopendir first closes the passed file descriptor, and then it uses a non-thread-safe strategy involving duplication and recursion to reopen the directory in such a way that it ends up reusing the same file descriptor number. This works on Linux but apparently does not work on MinGW.

This is speculation, but I suspect that this dup/recursive strategy in fdopendir is there to make sure that the returned DIR pointer is using the right file descriptor, so that closedir will end up closing the file descriptor as POSIX requires. But on MinGW+Gnulib, dirfd(DIR *) returns -1, which makes me think that DIR pointers there don't actually contain file descriptors or close them when closedir is called; they might use some kind of native Win32 HANDLE instead.

== Possible solutions ==

I would appreciate any tips on how to fix this problem. Ideally we would fix Gnulib, but if that is not going to happen then I would still like to find a good workaround that I can apply to the software that I build.

My workaround of commenting out the "close (fd);" line in fdopendir.c is not a good long-term solution, because it will leave unused directory handles open in any program that assumes that "closedir(fdopendir(fd))" actually closes fd, as specified by POSIX.

To make fdopendir POSIX-compliant on MinGW, it seems like you would want to somehow associate the fd with the DIR pointer even if the fd isn't used by typical dirent functions. Then, when closedir is called, you would want to retrieve that fd and close it as a side effect.

Alternatively, if full POSIX compliance is too hard, we can instead say that any Gnulib program that calls fdopendir should avoid using the supplied file descriptor afterwards since it might have been closed.
We would need to fix FTS and any other code that does that. I suspect this would be a radical change that breaks lots of programs though.

I don't fully understand what is going on here and I might have missed something.

--David Grayson
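
The first idea under "Possible solutions" (remember the fd next to the directory stream, and close it when the stream is closed) can be sketched as a thin wrapper. This is an illustration under stated assumptions, not Gnulib code: struct fd_dir, fd_dir_open, fd_dir_close and fd_is_open are hypothetical names, and the sketch takes the directory path as an extra argument to sidestep the hard part, which is recovering the directory from the fd alone.

```c
#define _POSIX_C_SOURCE 200809L
#include <dirent.h>
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical wrapper: a directory stream plus the descriptor
   the caller handed over. */
struct fd_dir {
    DIR *stream;  /* used for readdir and friends */
    int fd;       /* not used for reading, only closed on fd_dir_close */
};

/* Opens a stream for `path` and remembers `fd` WITHOUT closing it,
   so the caller can keep using fd until fd_dir_close is called. */
static struct fd_dir *fd_dir_open(const char *path, int fd)
{
    struct fd_dir *d = malloc(sizeof *d);
    if (d == NULL)
        return NULL;
    d->stream = opendir(path);
    if (d->stream == NULL) {
        free(d);
        return NULL;
    }
    d->fd = fd;
    return d;
}

/* Closes the stream, then closes the remembered fd as a side effect,
   mirroring what POSIX specifies for closedir after fdopendir.
   Returns 0 if both closes succeeded, -1 otherwise. */
static int fd_dir_close(struct fd_dir *d)
{
    int stream_result = closedir(d->stream);
    int fd_result = close(d->fd);
    free(d);
    return (stream_result == 0 && fd_result == 0) ? 0 : -1;
}

/* Returns 1 if fd can be duplicated (the same probe main.c uses), else 0. */
static int fd_is_open(int fd)
{
    int copy = dup(fd);
    if (copy < 0)
        return 0;
    close(copy);
    return 1;
}
```

In this arrangement fd stays usable between fd_dir_open and fd_dir_close, which is what FTS appears to rely on, and it is released when the stream is closed. A real fix would presumably have to hide this bookkeeping behind Gnulib's own fdopendir and closedir replacements rather than expose a new type.
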
#include "config.h"  /* Gnulib expects config.h before any other header. */
#include <stdio.h>
#include <errno.h>
#include <string.h>
#include <fcntl.h>
#include <unistd.h>
#include <dirent.h>  /* fdopendir */
#include "fts_.h"
#include "progname.h"
#include "error.h"

int main(void)
{
  set_program_name("testgnulib");

  // This code demonstrates a bug in fdopendir on MinGW.
  {
    int fd = open("emptydir", O_RDONLY | O_DIRECTORY | O_NOCTTY | O_NONBLOCK);
    if (fd < 0)
    {
      fprintf(stderr, "open returned %d\n", fd);
      return 1;
    }

    // fd should be duplicable both before and after fdopendir.
    // The duplicates and the DIR pointer are deliberately not cleaned up.
    printf("dup(fd) = %d\n", dup(fd));
    fdopendir(fd);
    printf("dup(fd) = %d\n", dup(fd));
  }

  // This code shows how the bug affects fts.
  {
    char filename[] = "emptydir";
    char * fts_args[] = { filename, NULL };
    int fts_opts = FTS_PHYSICAL;
    FTS * fts = fts_open(fts_args, fts_opts, NULL);
    if (!fts)
    {
      fprintf(stderr, "fts_open returned NULL\n");
      return 2;
    }

    // Print every entry that fts visits, or the error it reports.
    while (1)
    {
      FTSENT * ent = fts_read(fts);
      if (!ent)
      {
        break;
      }

      if (ent->fts_info == FTS_ERR)
      {
        printf("%s: error: %s\n", ent->fts_path, strerror(ent->fts_errno));
      }
      else
      {
        printf("%s\n", ent->fts_path);
      }
    }
  }

  return 0;
}