Hi Bruno,
You said:
>> #if ! defined ADDITIONAL_DEFAULT_SHELLS && defined __MSDOS__
>> # define ADDITIONAL_DEFAULT_SHELLS \
>> "c:/dos/command.com", "c:/windows/command.com", "c:/command.com",
>
> Inside '#if defined __MSDOS__' or '#if defined _WIN32' this is OK.
> I was worried there was explicit code to accept such prefixes on POSIX
> compatible platforms. Fortunately not.
Sorry for the confusion. I didn't mean to suggest adding drive
prefixes on non-Windows systems hahaha. That would be an "interesting"
choice.
>> I've attached a patch following this method.
>
> I see a method 'next_shell' that is only called in a single place. Would
> it make sense to inline it (and move its function comments into the
> body of the caller function)?
Good point. I thought it was a bit strange to pass the file stream and
buffer as arguments when they are static global variables in that
file.
I've attached a patch moving it to the end of getusershell. Makes
things easier to understand IMO. I haven't pushed it yet so let me
know what you think.
> The main change is to call getline() instead of parsing the file
> character-by-character. I guess that the motivation is that the
> behaviour of trimming leading whitespace and trimming trailing whitespace
> before '#' would lead to convoluted code with the previous approach?
Pretty much, yes. I tried to do that originally but I don't think you
would have liked the patch. :)
At the start of 'readname' there is a loop to skip spaces:
while ((c == getc (stream)) != EOF
&& isspace ((unsigned char) c))
/* Do nothing. */;
Essentially in the main for (;;) loop you would have to compare every
character for '#'. If one is found then you have to do a loop similar
to that and then break afterwards.
Since the file format is for the most part "one shell per line", with
a few caveats, I think it makes sense to just use getline(). Makes
things easier to read in my opinion.
Judging by the copyright date on that file (1991), I assume getline
wasn't portable enough to use (or didn't exist?) then. If I remember
correctly it was a glibc extension that was later adopted by POSIX.
>> The patch removes 'GNULIB_GETUSERSHELL_SINGLE_THREAD' which uses
>> getc_unlocked instead of getc for optimization ...
>>
>> I don't see too much harm in it though since on most platforms getline
>> should lock much less then repeatedly calling getc.
>
> Right. When the function uses getline instead of getc, it will lock 10x
> less than before. I agree there is not much need to optimize this.
> Especially since we don't have a getline_unlocked function. And since
> locking has become cheaper since this code was written. (Futexes were
> invented in 2002-2003.)
Interesting. I don't know very much about Futexes.
Do most libc implementations keep track of if a stream needs to be
locked? I'm thinking about single-threaded programs that don't define
GNULIB_GETUSERSHELL_SINGLE_THREAD. I imagine the stream doesn't locked
x10 more, but a status flag is checked x10 more or something.
Collin
From 9b67292f9dd7c1b0ecfd15993ee43a99eb557b76 Mon Sep 17 00:00:00 2001
From: Collin Funk <collin.fu...@gmail.com>
Date: Mon, 20 May 2024 21:03:39 -0700
Subject: [PATCH] getusershell: Split file by lines instead of spaces.
* lib/getusershell.c: Include string.h and filename.h
(GNULIB_GETUSERSHELL_SINGLE_THREAD): Remove conditional to include
unlocked stdio functions that are no longer used.
(readname): Remove function.
(getusershell): Use getline and process the string instead of using
readname. Return the first absolute file name.
* modules/getusershell (Depends-on): Remove unlocked-io-internal.
Add getline and filename.
* doc/multithread.texi (Multithreading Optimizations): Don't mention
GNULIB_GETUSERSHELL_SINGLE_THREAD.
---
ChangeLog | 14 +++++++++
doc/multithread.texi | 4 ---
lib/getusershell.c | 73 +++++++++++++++++++-------------------------
modules/getusershell | 3 +-
4 files changed, 47 insertions(+), 47 deletions(-)
diff --git a/ChangeLog b/ChangeLog
index 256b56580c..157041e2ea 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,17 @@
+2024-05-20 Collin Funk <collin.fu...@gmail.com>
+
+ getusershell: Split file by lines instead of spaces.
+ * lib/getusershell.c: Include string.h and filename.h
+ (GNULIB_GETUSERSHELL_SINGLE_THREAD): Remove conditional to include
+ unlocked stdio functions that are no longer used.
+ (readname): Remove function.
+ (getusershell): Use getline and process the string instead of using
+ readname. Return the first absolute file name.
+ * modules/getusershell (Depends-on): Remove unlocked-io-internal.
+ Add getline and filename.
+ * doc/multithread.texi (Multithreading Optimizations): Don't mention
+ GNULIB_GETUSERSHELL_SINGLE_THREAD.
+
2024-05-20 Bruno Haible <br...@clisp.org>
vasnprintf: Don't abort for pseudo-denormal arguments on macOS 12.
diff --git a/doc/multithread.texi b/doc/multithread.texi
index 7d8126e8f3..424e1d2a75 100644
--- a/doc/multithread.texi
+++ b/doc/multithread.texi
@@ -293,10 +293,6 @@ @node Multithreading Optimizations
You can get this macro defined by including the Gnulib module
@code{wchar-single}.
@item
-You may define the C macro @code{GNULIB_GETUSERSHELL_SINGLE_THREAD}, if all the
-programs in your package invoke the functions @code{setusershell},
-@code{getusershell}, @code{endusershell} only from a single thread.
-@item
You may define the C macro @code{GNULIB_EXCLUDE_SINGLE_THREAD}, if all the
programs in your package invoke the functions of the @code{exclude} module
only from a single thread.
diff --git a/lib/getusershell.c b/lib/getusershell.c
index 4da5bff37f..ef86c68a1f 100644
--- a/lib/getusershell.c
+++ b/lib/getusershell.c
@@ -34,17 +34,13 @@
#endif
#include <stdlib.h>
+#include <string.h>
#include <ctype.h>
#include "stdio--.h"
+#include "filename.h"
#include "xalloc.h"
-#if GNULIB_GETUSERSHELL_SINGLE_THREAD
-# include "unlocked-io.h"
-#endif
-
-static idx_t readname (char **, idx_t *, FILE *);
-
#if ! defined ADDITIONAL_DEFAULT_SHELLS && defined __MSDOS__
# define ADDITIONAL_DEFAULT_SHELLS \
"c:/dos/command.com", "c:/windows/command.com", "c:/command.com",
@@ -70,7 +66,7 @@ static FILE *shellstream = NULL;
static char *line = NULL;
/* Number of bytes allocated for 'line'. */
-static idx_t line_size = 0;
+static size_t line_size = 0;
/* Return an entry from the shells file, ignoring comment lines.
If the file doesn't exist, use the list in DEFAULT_SHELLS (above).
@@ -99,12 +95,36 @@ getusershell (void)
}
}
- while (readname (&line, &line_size, shellstream))
+ /* Get the next shell. */
+ for (;;)
{
- if (*line != '#')
- return line;
+ ssize_t nread = getline (&line, &line_size, shellstream);
+
+ /* End of file. */
+ if (nread == -1)
+ return NULL;
+ /* Skip empty lines. */
+ else if (nread > 1)
+ {
+ char *start = line;
+ char *comment = strchr (start, '#');
+ char *end = comment ? comment : start + nread - 1;
+
+ /* Trim rightmost comment marker or newline. */
+ *end-- = '\0';
+
+ /* Skip leading whitespace. */
+ while (start < end && isspace ((unsigned char) *start))
+ start++;
+ /* Trim trailing whitespace. */
+ while (start < end && isspace ((unsigned char) *end))
+ *end-- = '\0';
+
+ /* Only return absolute file names. */
+ if (start < end && IS_ABSOLUTE_FILE_NAME (start))
+ return start;
+ }
}
- return NULL; /* End of file. */
}
/* Rewind the shells file. */
@@ -129,37 +149,6 @@ endusershell (void)
}
}
-/* Read a line from STREAM, removing any newline at the end.
- Place the result in *NAME, which is malloc'd
- and/or realloc'd as necessary and can start out NULL,
- and whose size is passed and returned in *SIZE.
-
- Return the number of bytes placed in *NAME
- if some nonempty sequence was found, otherwise 0. */
-
-static idx_t
-readname (char **name, idx_t *size, FILE *stream)
-{
- int c;
- size_t name_index = 0;
-
- /* Skip blank space. */
- while ((c = getc (stream)) != EOF && isspace (c))
- /* Do nothing. */ ;
-
- for (;;)
- {
- if (*size <= name_index)
- *name = xpalloc (*name, size, 1, -1, sizeof **name);
- if (c == EOF || isspace (c))
- break;
- (*name)[name_index++] = c;
- c = getc (stream);
- }
- (*name)[name_index] = '\0';
- return name_index;
-}
-
#ifdef TEST
int
main (void)
diff --git a/modules/getusershell b/modules/getusershell
index 2f47c62e3d..a4a40852ac 100644
--- a/modules/getusershell
+++ b/modules/getusershell
@@ -9,7 +9,8 @@ Depends-on:
unistd
extensions
fopen-safer [test $HAVE_GETUSERSHELL = 0 || test $REPLACE_GETUSERSHELL = 1]
-unlocked-io-internal [test $HAVE_GETUSERSHELL = 0 || test $REPLACE_GETUSERSHELL = 1]
+getline [test $HAVE_GETUSERSHELL = 0 || test $REPLACE_GETUSERSHELL = 1]
+filename [test $HAVE_GETUSERSHELL = 0 || test $REPLACE_GETUSERSHELL = 1]
xalloc [test $HAVE_GETUSERSHELL = 0 || test $REPLACE_GETUSERSHELL = 1]
configure.ac:
--
2.45.1