On 12/09/2016 07:30 AM, Bruno Haible wrote:

How can we go forward from here? I would propose a gnulib module that defines
a data structure that combines a 'struct stat' with the FILE_ID_INFO for native
Windows, and rebase the 'same-inode' module on it.

The other approach, to override mingw's 'struct stat' and stat/fstat/lstat()
functions, would imply a performance hit to all stat calls, even those that
don't want to access the st_ino field.

For grep's purposes a simple workaround is to have SAME_INODE always return 0 on MinGW, so I installed the attached patch into Gnulib. This isn't perfect (it means MinGW grep won't detect that the input and output are the same file), but it should be good enough to fix the glaring bugs and to conform to POSIX.
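
To make the resulting three-way convention concrete, here is a rough sketch of how a caller might tell "same", "different", and "unknown" apart. The enum and function names are hypothetical, and the macro is a simplified stand-in for the one in lib/same-inode.h (which handles more platforms), so treat this as an illustration rather than as Gnulib code:

```c
#include <sys/stat.h>

/* Simplified stand-in for Gnulib's SAME_INODE, for illustration only;
   the real definition is in lib/same-inode.h.  */
#if (defined _WIN32 || defined __WIN32__) && ! defined __CYGWIN__
# define SAME_INODE(a, b) 0
#else
# define SAME_INODE(a, b) \
    ((a).st_ino == (b).st_ino && (a).st_dev == (b).st_dev)
#endif

/* Hypothetical names, not part of Gnulib.  */
enum file_identity { FILES_SAME, FILES_DIFFERENT, FILES_UNKNOWN };

static enum file_identity
compare_files (struct stat const *a, struct stat const *b)
{
  if (SAME_INODE (*a, *b))
    return FILES_SAME;       /* known to be the same file */
  if (a->st_ino)
    return FILES_DIFFERENT;  /* nonzero st_ino: known to differ */
  return FILES_UNKNOWN;      /* st_ino == 0, e.g. on MinGW */
}
```

A program like grep would presumably treat FILES_UNKNOWN the same as FILES_DIFFERENT and carry on, which is the trade-off described above.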

Although it might be helpful to have a fancier module that does the work of SAME_INODE but does it more accurately on MinGW, I'm not sure it's worth the hassle. A lot of code assumes that 'struct stat' suffices to identify files, and it would be a pain to clutter it with another struct of our own design that contains a 'struct stat' as a component. Even if we had another module like that, we'd need to keep SAME_INODE for the benefit of programs that cannot easily adopt the new struct.

It seems more plausible to override MinGW's struct stat and stat/etc. functions. To my mind it's OK to take a performance hit in the interest of portability. The performance hit would occur only on programs that need to deduce the equivalent of SAME_INODE.

From 7855a2e3ae4aacaa06a75a5e29930ac55ed0e9ae Mon Sep 17 00:00:00 2001
From: Paul Eggert <egg...@cs.ucla.edu>
Date: Fri, 9 Dec 2016 08:16:13 -0800
Subject: [PATCH] same-inode: port to MinGW
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Here st_ino is always 0, so change the definition of SAME_INODE so
that 1 means the two files are the same, 0 with st_ino != 0 means
they differ, and 0 with st_ino == 0 means we don’t know.  Problem
reported by Bruno Haible (Bug#25146).
* doc/posix-headers/sys_stat.texi (sys/stat.h): Update.
* lib/same-inode.h (SAME_INODE): Return 0 on MinGW.
---
 ChangeLog                       | 10 ++++++++++
 doc/posix-headers/sys_stat.texi |  4 ++--
 lib/same-inode.h                |  6 +++++-
 3 files changed, 17 insertions(+), 3 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index 8103ebd..fd3e9d8 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,13 @@
+2016-12-09  Paul Eggert  <egg...@cs.ucla.edu>
+
+       same-inode: port to MinGW
+       Here st_ino is always 0, so change the definition of SAME_INODE so
+       that 1 means the two files are the same, 0 with st_ino != 0 means
+       they differ, and 0 with st_ino == 0 means we don’t know.  Problem
+       reported by Bruno Haible (Bug#25146).
+       * doc/posix-headers/sys_stat.texi (sys/stat.h): Update.
+       * lib/same-inode.h (SAME_INODE): Return 0 on MinGW.
+
 2016-12-04  Bruno Haible  <br...@clisp.org>
 
        javacomp: Support Java 7 and 8.
diff --git a/doc/posix-headers/sys_stat.texi b/doc/posix-headers/sys_stat.texi
index bd644f6..4c176aa 100644
--- a/doc/posix-headers/sys_stat.texi
+++ b/doc/posix-headers/sys_stat.texi
@@ -44,8 +44,8 @@ not a single value.
 @item
 To partially work around the previous two problems, you can test for
 nonzero @code{st_ino} and use the Gnulib @code{same-inode} module to
-compare nonzero values.  For example, @code{(a.st_ino && SAME_INODE
-(a, b))} is true if the @code{struct stat} values @code{a} and
+compare nonzero values.  For example, @code{SAME_INODE (a, b)}
+is true if the @code{struct stat} values @code{a} and
 @code{b} are known to represent the same file, @code{(a.st_ino &&
 !SAME_INODE (a, b))} is true if they are known to represent different
 files, and @code{!a.st_ino} is true if it is not known whether they
diff --git a/lib/same-inode.h b/lib/same-inode.h
index bf45635..c7a8fb5 100644
--- a/lib/same-inode.h
+++ b/lib/same-inode.h
@@ -1,4 +1,4 @@
-/* Determine whether two stat buffers refer to the same file.
+/* Determine whether two stat buffers are known to refer to the same file.
 
    Copyright (C) 2006, 2009-2016 Free Software Foundation, Inc.
 
@@ -24,6 +24,10 @@
      && (a).st_ino[1] == (b).st_ino[1] \
      && (a).st_ino[2] == (b).st_ino[2] \
      && (a).st_dev == (b).st_dev)
+# elif (defined _WIN32 || defined __WIN32__) && ! defined __CYGWIN__
+/* On MinGW, struct stat lacks necessary info, so always return 0.
+   Callers can use !a.st_ino to deduce that the information is unknown.  */
+#  define SAME_INODE(a, b) 0
 # else
 #  define SAME_INODE(a, b)    \
     ((a).st_ino == (b).st_ino \
-- 
2.7.4
