Re: [PATCH 5/6] fts: three levels of leaf optimization
On Tuesday, April 10, 2018 9:54:26 PM CEST Bernhard Voelker wrote: > Hi Kamil, > > On 04/06/2018 09:29 AM, Kamil Dudka wrote: > > On Friday, April 6, 2018 8:08:55 AM CEST Bernhard Voelker wrote: > >> On 04/05/2018 02:44 PM, Kamil Dudka wrote: > >>> The same 'find' command with > >>> the -noleaf option does not abort: > >>> > >>> https://bugzilla.redhat.com/1558249#c20 > >> > >> This bug report is non-public. Can you make it available, or > >> post a reproducer, please? > > > > I did not know it was private. It should be publicly available now. > > Thanks. > > Although current upstream (v4.6.0-172-gb94be7a0) has already the > non-CIFS-aware FTS change from gnulib, I couldn't reproduce the crash with > a quick test here. I'll try again tomorrow. > > Did you try with a build from upstream as well? > > Thanks & have a nice day, > Berny I have not tried to reproduce the crash myself. I was just analyzing the strace outputs attached to that bug. I hope I will have some time next week to debug it deeper (in case nobody finds the cause till then). Kamil
Re: [PATCH 5/6] fts: three levels of leaf optimization
On Thu, Apr 5, 2018 at 4:49 PM, Paul Eggert wrote: > > On 04/05/2018 05:44 AM, Kamil Dudka wrote: >> >> Does this change (intentionally?) enable leaf optimization for CIFS? > > > No, I wasn't thinking about CIFS when I wrote that. Thanks for reporting it. I installed the attached to try to fix this. I have not tested it with CIFS. > Where is the abort actually happening? If the abort is in fts itself, then this patch is a reasonable fix (though generally I prefer not to allow aborts in library code). But if the abort is in find itself (i.e. not in the gnulib code at all) then we should fix find (either as well, or instead). James.
Re: [PATCH 5/6] fts: three levels of leaf optimization
On Wednesday, April 11, 2018 3:38:40 PM CEST James Youngman wrote: > On Thu, Apr 5, 2018 at 4:49 PM, Paul Eggert wrote: > > On 04/05/2018 05:44 AM, Kamil Dudka wrote: > >> Does this change (intentionally?) enable leaf optimization for CIFS? > > > > No, I wasn't thinking about CIFS when I wrote that. Thanks for reporting > > it. I installed the attached to try to fix this. I have not tested it with > CIFS. > > > Where is the abort actually happening? If the abort is in fts itself, > then this patch is a reasonable fix (though generally I prefer not to allow > aborts in library code). But if the abort is in find itself (i.e. not in > the gnulib code at all) then we should fix find (either as well, or > instead). > > James. The abort() is in gnulib/fts code. It is triggered on this line: http://git.savannah.gnu.org/gitweb/?p=gnulib.git;a=blob;f=lib/fts.c;h=3814e58f#l1383 As far as I know, there is currently no fix available for that. Kamil
Gnulib on Windows (native / mingw32) / VMS / etc.
Hi all. I spent a bit of time this weekend looking into what it would take to update GNU make to use the glob/fnmatch implementations from the current gnulib, rather than the ancient (and buggy) implementation that GNU make has embedded, more or less unchanged, for at least 25 years. This, unfortunately, turns out to be quite a daunting task. Gnulib glob/fnmatch rely on a huge number of other gnulib packages so instead of the simple two-file implementation GNU make has now it expands to a lib directory containing 20+ files (I didn't actually count but it's quite a lot). This matters for a few reasons: first, GNU make provides a bootstrapping script that will let you compile make on systems which don't already have make... that means that I need to be able to build all these extra files without the assistance of automake (I do run configure). I'm not sure how easy that will be. More worrying, GNU make is compile-able on Windows and does not require Cygwin: it can be built using native MSVC and/or MingW32. For bootstrapping GNU make provides a "build_w32.bat" file that compiles make using either gcc or MSVC. In this setup I have a hard-coded config.h for Windows that I install since I don't run configure. Making this work with the many packages added to lib seems complex. Finally, GNU make can also be compiled on other less common architectures, such as VMS and these ports are quite active. Is there any information about all these different glob/fnmatch requirements in gnulib, insofar as portability to other environments like VMS? I would really like to use native gnulib implementations but the overhead involved has me considering whether it would be feasible to try to extract the latest implementation and somehow fencing it in to avoid all these extra dependencies. Any comments anyone has would be welcome, cheers!
Re: [PATCH 5/6] fts: three levels of leaf optimization
On 04/11/2018 07:03 AM, Kamil Dudka wrote: As far as I know, there is currently no fix available for that. I looked into this and found some bugs in relevant code (bugs that I introduced last summer; sorry!). I installed the attached into Gnulib to fix what I found. Please give it a try and see whether it fixes your problem. >From be9b95a8f5a3ddbb722099df10d19c74a341fff9 Mon Sep 17 00:00:00 2001 From: Paul Eggert Date: Wed, 11 Apr 2018 12:50:35 -0700 Subject: [PATCH] fts: fix bug in find across filesystems This fixes a bug I introduced last summer. Problem reported by Kamil Dudka in: https://lists.gnu.org/r/bug-gnulib/2018-04/msg00033.html * lib/fts.c (filesystem_type, dirent_inode_sort_may_be_useful) (leaf_optimization): New arg for file descriptor. All callers changed. (fts_build): Check for whether inodes should be sorted before closing the directory. --- ChangeLog | 12 lib/fts.c | 55 +++ 2 files changed, 43 insertions(+), 24 deletions(-) diff --git a/ChangeLog b/ChangeLog index ea30952f5..3702f75c1 100644 --- a/ChangeLog +++ b/ChangeLog @@ -1,3 +1,15 @@ +2018-04-11 Paul Eggert + + fts: fix bug in find across filesystems + This fixes a bug I introduced last summer. + Problem reported by Kamil Dudka in: + https://lists.gnu.org/r/bug-gnulib/2018-04/msg00033.html + * lib/fts.c (filesystem_type, dirent_inode_sort_may_be_useful) + (leaf_optimization): + New arg for file descriptor. All callers changed. + (fts_build): Check for whether inodes should be sorted + before closing the directory. + 2018-04-07 Bruno Haible unicase/u*-context: Fix link errors with libunistring <= 0.9.9. diff --git a/lib/fts.c b/lib/fts.c index 3814e58fc..a26c6824a 100644 --- a/lib/fts.c +++ b/lib/fts.c @@ -726,11 +726,12 @@ dev_type_compare (void const *x, void const *y) return ax->st_dev == ay->st_dev; } -/* Return the file system type of P, or 0 if not known. +/* Return the file system type of P with file descriptor FD, or 0 if not known. + If FD is negative, P's file descriptor is unavailable. Try to cache known values. */ static fsword -filesystem_type (FTSENT const *p) +filesystem_type (FTSENT const *p, int fd) { FTS *sp = p->fts_fts; Hash_table *h = sp->fts_leaf_optimization_works_ht; @@ -756,7 +757,7 @@ filesystem_type (FTSENT const *p) } /* Look-up failed. Query directly and cache the result. */ - if (fstatfs (p->fts_fts->fts_cwd_fd, &fs_buf) != 0) + if (fd < 0 || fstatfs (fd, &fs_buf) != 0) return 0; if (h) @@ -778,12 +779,12 @@ filesystem_type (FTSENT const *p) return fs_buf.f_type; } -/* Return false if it is easy to determine the file system type of the - directory P, and sorting dirents on inode numbers is known not to - improve traversal performance with that type of file system. - Otherwise, return true. */ +/* Return true if sorting dirents on inode numbers is known to improve + traversal performance for the directory P with descriptor DIR_FD. + Return false otherwise. When in doubt, return true. + DIR_FD is negative if unavailable. */ static bool -dirent_inode_sort_may_be_useful (FTSENT const *p) +dirent_inode_sort_may_be_useful (FTSENT const *p, int dir_fd) { /* Skip the sort only if we can determine efficiently that skipping it is the right thing to do. @@ -791,7 +792,7 @@ dirent_inode_sort_may_be_useful (FTSENT const *p) while the cost of *not* performing it can be O(N^2) with a very large constant. */ - switch (filesystem_type (p)) + switch (filesystem_type (p, dir_fd)) { case S_MAGIC_CIFS: case S_MAGIC_NFS: @@ -805,16 +806,17 @@ dirent_inode_sort_may_be_useful (FTSENT const *p) } } -/* Given an FTS entry P for a directory D, +/* Given an FTS entry P for a directory with descriptor DIR_FD, return true if it is both useful and valid to apply leaf optimization. The optimization is useful only for file systems that lack usable dirent.d_type info. The optimization is valid if an st_nlink value of at least MIN_DIR_NLINK is an upper bound on the number of - subdirectories of D, counting "." and ".." as subdirectories. */ + subdirectories of D, counting "." and ".." as subdirectories. + DIR_FD is negative if unavailable. */ static enum leaf_optimization -leaf_optimization (FTSENT const *p) +leaf_optimization (FTSENT const *p, int dir_fd) { - switch (filesystem_type (p)) + switch (filesystem_type (p, dir_fd)) { /* List here the file system types that may lack usable dirent.d_type info, yet for which the optimization does apply. */ @@ -851,12 +853,13 @@ leaf_optimization (FTSENT const *p) #else static bool -dirent_inode_sort_may_be_useful (FTSENT const *p _GL_UNUSED) +dirent_inode_sort_may_be_useful (FTSENT const *p _GL_UNUSED, + int dir_fd _GL_UNUSED) { return true; } static enum leaf_optimization -leaf_optimization (F
Re: Gnulib on Windows (native / mingw32) / VMS / etc.
Except for VMS, GNU Emacs is in a boat similar to GNU Make. It uses Gnulib (but not Automake), it is buildable on MS-Windows and does not need Cygwin, it has its own configuration batch script for MS-Windows (and MS-DOS!). So I would look to Emacs for inspiration here. I'm mostly responsible for the Emacs Gnulib usage and can help and/or answer questions in this area. Gnulib does have some VMS support but it's on an as-needed basis, and quite possibly you'd need to hack it in (this was most recently done for mktime in September). As it happens, in August we removed some VMS code from the glob module, since nobody had used it for years and glibc was removing it. I suppose we could put it back in. Alternatively, GNU Make could continue to use the old glob.c for VMS, and use Gnulib glob.c for everything else. (Not ideal, I know.)
Re: [PATCH 5/6] fts: three levels of leaf optimization
On 04/11/2018 09:53 PM, Paul Eggert wrote: > @@ -1589,6 +1593,15 @@ mem1: saved_errno = errno; > tail->fts_link = p; > tail = p; > } > + > +/* If there are many entries, no sorting function has been > + specified, and this file system is of a type that may be > + slow with a large number of entries, arrange to sort the > + directory entries on increasing inode numbers. */ > +if (nitems == _FTS_INODE_SORT_DIR_ENTRIES_THRESHOLD __^^ > +&& !sp->fts_compar) > + sort_by_inode = dirent_inode_sort_may_be_useful (cur, > dir_fd); > + This looks wrong: didn't you mean the '>' operator? Thanks for digging into this issue. Have a nice day, Berny
Re: [PATCH 5/6] fts: three levels of leaf optimization
On 04/11/2018 02:58 PM, Bernhard Voelker wrote: didn't you mean the '>' operator? No, '==' should be right. I added the attached to try to explain this better. Thanks for the nudge. >From 09ca545d5c0af8760ce27b1e9cbe300e64f73746 Mon Sep 17 00:00:00 2001 From: Paul Eggert Date: Wed, 11 Apr 2018 16:27:03 -0700 Subject: [PATCH] fts: add comment * lib/fts.c (fts_build): Explain why ==, not >. See remark by Bernhard Voelker in: https://lists.gnu.org/r/bug-gnulib/2018-04/msg00041.html --- ChangeLog | 5 + lib/fts.c | 6 +- 2 files changed, 10 insertions(+), 1 deletion(-) diff --git a/ChangeLog b/ChangeLog index 3702f75c1..f5752a217 100644 --- a/ChangeLog +++ b/ChangeLog @@ -1,5 +1,10 @@ 2018-04-11 Paul Eggert + fts: add comment + * lib/fts.c (fts_build): Explain why ==, not >. + See remark by Bernhard Voelker in: + https://lists.gnu.org/r/bug-gnulib/2018-04/msg00041.html + fts: fix bug in find across filesystems This fixes a bug I introduced last summer. Problem reported by Kamil Dudka in: diff --git a/lib/fts.c b/lib/fts.c index a26c6824a..d54351059 100644 --- a/lib/fts.c +++ b/lib/fts.c @@ -1597,7 +1597,11 @@ mem1: saved_errno = errno; /* If there are many entries, no sorting function has been specified, and this file system is of a type that may be slow with a large number of entries, arrange to sort the - directory entries on increasing inode numbers. */ + directory entries on increasing inode numbers. + + The NITEMS comparison uses ==, not >, because the test + needs to be tried at most once once, and NITEMS will exceed + the threshold after it is incremented below. */ if (nitems == _FTS_INODE_SORT_DIR_ENTRIES_THRESHOLD && !sp->fts_compar) sort_by_inode = dirent_inode_sort_may_be_useful (cur, dir_fd); -- 2.14.3
Re: [PATCH 5/6] fts: three levels of leaf optimization
On 04/12/2018 01:28 AM, Paul Eggert wrote: > No, '==' should be right. I added the attached to try to explain this > better. Thanks for the nudge. ah, indeed, running dirent_inode_sort_may_be_useful only once in the loop may/will save quite some time, thanks. Have a nice day, Berny