After finding a similar problem in GNU grep, I audited coreutils for
issues involving reading stdin twice or neglecting to report read
errors, and installed the attached patches. The 2nd patch does the real
work; the rest are merely doc or Gnulib-related patches.
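For readers who want the gist without reading every hunk, here is a
minimal standalone sketch of the pattern that the 2nd patch applies at
each call site. The helper name finish_input is hypothetical and does
not appear in coreutils; the sketch also uses plain strcmp and fprintf
in place of the coreutils STREQ, error and quotef facilities.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper for illustration; it is not in the patch.
   Finish with FP, which was read on behalf of operand NAME ("-" means
   standard input).  Call it right after the read loop, so that errno
   still reflects any read failure.  Return true if no error occurred.  */
static bool
finish_input (FILE *fp, char const *name)
{
  assert (fp != NULL && name != NULL);

  /* Save errno first: POSIX does not guarantee ferror preserves it.  */
  int err = errno;
  if (!ferror (fp))
    err = 0;

  if (strcmp (name, "-") == 0)
    clearerr (fp);              /* Clears both the error and EOF flags.  */
  else if (fclose (fp) != 0 && !err)
    err = errno;                /* Report fclose failure if reads were OK.  */

  if (err)
    {
      fprintf (stderr, "%s: %s\n", name, strerror (err));
      return false;
    }
  return true;
}
```

A caller would invoke this once per operand after consuming the stream,
so that a command line with "-" given twice can read standard input both
times even when it is a tty, and so that at most one diagnostic is
issued per operand.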
From b6a0654e04482f400eb2d5752ec13e15eb53743c Mon Sep 17 00:00:00 2001
From: Paul Eggert <egg...@cs.ucla.edu>
Date: Sun, 22 Aug 2021 11:24:29 -0700
Subject: [PATCH 1/4] doc: spell out stdin, stdout, stderr
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
* doc/coreutils.texi: Spell out words like “stdin” in
English prose.
---
doc/coreutils.texi | 29 +++++++++++++++--------------
1 file changed, 15 insertions(+), 14 deletions(-)
diff --git a/doc/coreutils.texi b/doc/coreutils.texi
index 9cc14c008..a435ed63e 100644
--- a/doc/coreutils.texi
+++ b/doc/coreutils.texi
@@ -2291,7 +2291,8 @@ basenc @var{encoding} --decode [@var{option}]@dots{} [@var{file}]
@end example
The @var{encoding} argument is required. If @var{file} is omitted,
-reads input from stdin. The @option{-w/--wrap},@option{-i/--ignore-garbage},
+@command{basenc} reads from standard input.
+The @option{-w/--wrap},@option{-i/--ignore-garbage},
@option{-d/--decode} options of this command are precisely the same as
for @command{base64}. @xref{base64 invocation}.
@@ -3356,11 +3357,11 @@ Split @var{input} to @var{chunks} output files where @var{chunks} may be:
@example
@var{n} generate @var{n} files based on current size of @var{input}
-@var{k}/@var{n} only output @var{k}th of @var{n} to stdout
+@var{k}/@var{n} output only @var{k}th of @var{n} to standard output
l/@var{n} generate @var{n} files without splitting lines or records
-l/@var{k}/@var{n} likewise but only output @var{k}th of @var{n} to stdout
+l/@var{k}/@var{n} likewise but output only @var{k}th of @var{n} to stdout
r/@var{n} like @samp{l} but use round robin distribution
-r/@var{k}/@var{n} likewise but only output @var{k}th of @var{n} to stdout
+r/@var{k}/@var{n} likewise but output only @var{k}th of @var{n} to stdout
@end example
Any excess bytes remaining after dividing the @var{input}
@@ -4050,7 +4051,7 @@ for reading standard input when standard input is a terminal.
@item -c
@itemx --check
Read file names and checksum information (not data) from each
-@var{file} (or from stdin if no @var{file} was specified) and report
+@var{file} (or from standard input if no @var{file} was specified) and report
whether the checksums match the contents of the named files.
The input to this mode of @command{md5sum} is usually the output of
a prior, checksum-generating run of @samp{md5sum}.
@@ -4579,7 +4580,7 @@ of the line being used in the sort.
@item --debug
Highlight the portion of each line used for sorting.
-Also issue warnings about questionable usage to stderr.
+Also issue warnings about questionable usage to standard error.
@item --batch-size=@var{nmerge}
@opindex --batch-size
@@ -6228,7 +6229,7 @@ $ paste num2 let3 num2
@ c
@end example
-Intermix lines from stdin:
+Intermix lines from standard input:
@example
$ paste - let3 - < num2
1 a 2
@@ -9186,7 +9187,7 @@ The @var{level} value can be one of the following:
@item none
@opindex none @r{dd status=}
-Do not print any informational or warning messages to stderr.
+Do not print any informational or warning messages to standard error.
Error messages are output as normal.
@item noxfer
@@ -9196,14 +9197,14 @@ that normally make up the last status line.
@item progress
@opindex progress @r{dd status=}
-Print the transfer rate and volume statistics on stderr,
+Print the transfer rate and volume statistics on standard error,
when processing each input block. Statistics are output
on a single line at most once every second, but updates
can be delayed when waiting on I/O.
@end table
-Transfer information is normally output to stderr upon
+Transfer information is normally output to standard error upon
receipt of the @samp{INFO} signal or when @command{dd} exits,
and defaults to the following form in the C locale:
@@ -13837,7 +13838,7 @@ it's described here.
@pindex tee
@cindex pipe fitting
@cindex destinations, multiple output
-@cindex read from stdin and write to stdout and files
+@cindex read from standard input and write to standard output and files
The @command{tee} command copies standard input to standard output and also
to any files given as arguments. This is useful when you want not only
@@ -13941,7 +13942,7 @@ so it works with @command{zsh}, @command{bash}, and @command{ksh},
but not with @command{/bin/sh}. So if you write code like this
in a shell script, be sure to start the script with @samp{#!/bin/bash}.
-Note also that if any of the process substitutions (or piped stdout)
+Note also that if any of the process substitutions (or piped standard output)
might exit early without consuming all the data, the @option{-p} option
is needed to allow @command{tee} to continue to process the input
to any remaining outputs.
@@ -17462,7 +17463,7 @@ env --default-signal=INT,PIPE --ignore-signal=INT
Block signal(s) @var{sig} from being delivered.
@item --list-signal-handling
-List blocked or ignored signals to stderr, before executing a command.
+List blocked or ignored signals to standard error, before executing a command.
@item -v
@itemx --debug
@@ -18234,7 +18235,7 @@ or a number. @xref{Signal specifications}.
@itemx --verbose
@opindex -v
@opindex --verbose
-Diagnose to stderr, any signal sent upon timeout.
+Diagnose to standard error, any signal sent upon timeout.
@end table
@cindex time units
--
2.31.1
From 074424d5666e3499132421d67dfe50e1702bcbae Mon Sep 17 00:00:00 2001
From: Paul Eggert <egg...@cs.ucla.edu>
Date: Sun, 22 Aug 2021 11:54:44 -0700
Subject: [PATCH 2/4] maint: use clearerr on stdin when appropriate
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
This is so that commands like ‘fmt - -’ read from stdin
both times, even when it is a tty. Fix some other minor
issues that are related.
* src/blake2/b2sum.c (main):
* src/cksum.c (cksum):
* src/cut.c (cut_file):
* src/expand-common.c (next_file):
* src/fmt.c (fmt):
* src/fold.c (fold_file):
* src/md5sum.c (digest_file, digest_check):
* src/nl.c (nl_file):
* src/od.c (check_and_close):
* src/paste.c (paste_parallel, paste_serial):
* src/pr.c (close_file):
* src/sum.c (bsd_sum_file):
Use clearerr on stdin so that stdin can be read multiple times
even if it is a tty. Do not assume that ferror preserves errno as
POSIX does not guarantee this. Coalesce duplicate diagnostic
calls.
* src/blake2/b2sum.c (main):
* src/fmt.c (main, fmt):
Report read error, even if it's merely fclose failure.
* src/fmt.c: Include die.h.
(fmt): New arg FILE. Close input (reporting error) if not stdin.
All callers changed.
* src/ptx.c (swallow_file_in_memory): Clear stdin's EOF flag.
* src/sort.c (xfclose): Remove unnecessary feof call.
---
src/blake2/b2sum.c | 5 ++++-
src/cksum.c | 19 +++++++++----------
src/cut.c | 12 ++++++------
src/expand-common.c | 12 ++++++------
src/fmt.c | 45 +++++++++++++++++++++++++++++++--------------
src/fold.c | 16 ++++++++--------
src/md5sum.c | 31 +++++++++++++++----------------
src/nl.c | 14 +++++++-------
src/od.c | 17 ++++++++---------
src/paste.c | 26 ++++++++++++--------------
src/pr.c | 14 ++++++++++----
src/ptx.c | 3 +++
src/sort.c | 3 +--
src/sum.c | 19 +++++++++----------
14 files changed, 129 insertions(+), 107 deletions(-)
diff --git a/src/blake2/b2sum.c b/src/blake2/b2sum.c
index 9f1108137..0a2387b39 100644
--- a/src/blake2/b2sum.c
+++ b/src/blake2/b2sum.c
@@ -388,7 +388,10 @@ int main( int argc, char **argv )
printf( " %s\n", argv[i] );
}
- if( f != stdin ) fclose( f );
+ if( f == stdin )
+ clearerr( f );
+ else if( fclose( f ) != 0 )
+ fprintf( stderr, "Could not close `%s': %s\n", argv[i], strerror( errno ) );
}
return 0;
diff --git a/src/cksum.c b/src/cksum.c
index c3416f866..3cc4296bc 100644
--- a/src/cksum.c
+++ b/src/cksum.c
@@ -298,17 +298,16 @@ cksum (char const *file, bool print_name)
if (! cksum_fp (fp, file, &crc, &length))
return false;
- if (ferror (fp))
- {
- error (0, errno, "%s", quotef (file));
- if (!STREQ (file, "-"))
- fclose (fp);
- return false;
- }
-
- if (!STREQ (file, "-") && fclose (fp) == EOF)
+ int err = errno;
+ if (!ferror (fp))
+ err = 0;
+ if (STREQ (file, "-"))
+ clearerr (fp);
+ else if (fclose (fp) != 0 && !err)
+ err = errno;
+ if (err)
{
- error (0, errno, "%s", quotef (file));
+ error (0, err, "%s", quotef (file));
return false;
}
diff --git a/src/cut.c b/src/cut.c
index f4d44c211..cdf33d897 100644
--- a/src/cut.c
+++ b/src/cut.c
@@ -460,16 +460,16 @@ cut_file (char const *file)
cut_stream (stream);
- if (ferror (stream))
- {
- error (0, errno, "%s", quotef (file));
- return false;
- }
+ int err = errno;
+ if (!ferror (stream))
+ err = 0;
if (STREQ (file, "-"))
clearerr (stream); /* Also clear EOF. */
else if (fclose (stream) == EOF)
+ err = errno;
+ if (err)
{
- error (0, errno, "%s", quotef (file));
+ error (0, err, "%s", quotef (file));
return false;
}
return true;
diff --git a/src/expand-common.c b/src/expand-common.c
index 55df8dc0f..4deb7bd8a 100644
--- a/src/expand-common.c
+++ b/src/expand-common.c
@@ -338,16 +338,16 @@ next_file (FILE *fp)
if (fp)
{
assert (prev_file);
- if (ferror (fp))
- {
- error (0, errno, "%s", quotef (prev_file));
- exit_status = EXIT_FAILURE;
- }
+ int err = errno;
+ if (!ferror (fp))
+ err = 0;
if (STREQ (prev_file, "-"))
clearerr (fp); /* Also clear EOF. */
else if (fclose (fp) != 0)
+ err = errno;
+ if (err)
{
- error (0, errno, "%s", quotef (prev_file));
+ error (0, err, "%s", quotef (prev_file));
exit_status = EXIT_FAILURE;
}
}
diff --git a/src/fmt.c b/src/fmt.c
index ca9231b99..bfccd9ba1 100644
--- a/src/fmt.c
+++ b/src/fmt.c
@@ -28,6 +28,7 @@
#include "system.h"
#include "error.h"
+#include "die.h"
#include "fadvise.h"
#include "xdectoint.h"
@@ -151,7 +152,7 @@ struct Word
/* Forward declarations. */
static void set_prefix (char *p);
-static void fmt (FILE *f);
+static bool fmt (FILE *f, char const *);
static bool get_paragraph (FILE *f);
static int get_line (FILE *f, int c);
static int get_prefix (FILE *f);
@@ -412,28 +413,29 @@ main (int argc, char **argv)
goal_width = max_width * (2 * (100 - LEEWAY) + 1) / 200;
}
+ bool have_read_stdin = false;
+
if (optind == argc)
- fmt (stdin);
+ {
+ have_read_stdin = true;
+ ok = fmt (stdin, "-");
+ }
else
{
for (; optind < argc; optind++)
{
char *file = argv[optind];
if (STREQ (file, "-"))
- fmt (stdin);
+ {
+ ok &= fmt (stdin, file);
+ have_read_stdin = true;
+ }
else
{
FILE *in_stream;
in_stream = fopen (file, "r");
if (in_stream != NULL)
- {
- fmt (in_stream);
- if (fclose (in_stream) == EOF)
- {
- error (0, errno, "%s", quotef (file));
- ok = false;
- }
- }
+ ok &= fmt (in_stream, file);
else
{
error (0, errno, _("cannot open %s for reading"),
@@ -444,6 +446,9 @@ main (int argc, char **argv)
}
}
+ if (have_read_stdin && fclose (stdin) != 0)
+ die (EXIT_FAILURE, errno, "%s", _("closing standard input"));
+
return ok ? EXIT_SUCCESS : EXIT_FAILURE;
}
@@ -470,10 +475,13 @@ set_prefix (char *p)
prefix_length = s - p;
}
-/* read file F and send formatted output to stdout. */
+/* Read F and send formatted output to stdout.
+ Close F when done, unless F is stdin. Diagnose input errors, using FILE.
+ If !F, assume F resulted from an fopen failure and diagnose that.
+ Return true if successful. */
-static void
-fmt (FILE *f)
+static bool
+fmt (FILE *f, char const *file)
{
fadvise (f, FADVISE_SEQUENTIAL);
tabs = false;
@@ -484,6 +492,15 @@ fmt (FILE *f)
fmt_paragraph ();
put_paragraph (word_limit);
}
+
+ int err = ferror (f) ? 0 : -1;
+ if (f == stdin)
+ clearerr (f);
+ else if (fclose (f) != 0 && err < 0)
+ err = errno;
+ if (0 <= err)
+ error (0, err, err ? "%s" : _("read error"), quotef (file));
+ return err < 0;
}
/* Set the global variable 'other_indent' according to SAME_PARAGRAPH
diff --git a/src/fold.c b/src/fold.c
index ae33dd368..94a6d378e 100644
--- a/src/fold.c
+++ b/src/fold.c
@@ -216,20 +216,20 @@ fold_file (char const *filename, size_t width)
}
saved_errno = errno;
+ if (!ferror (istream))
+ saved_errno = 0;
if (offset_out)
fwrite (line_out, sizeof (char), (size_t) offset_out, stdout);
- if (ferror (istream))
+ if (STREQ (filename, "-"))
+ clearerr (istream);
+ else if (fclose (istream) != 0 && !saved_errno)
+ saved_errno = errno;
+
+ if (saved_errno)
{
error (0, saved_errno, "%s", quotef (filename));
- if (!STREQ (filename, "-"))
- fclose (istream);
- return false;
- }
- if (!STREQ (filename, "-") && fclose (istream) == EOF)
- {
- error (0, errno, "%s", quotef (filename));
return false;
}
diff --git a/src/md5sum.c b/src/md5sum.c
index cbfdc3ab2..e2071cfd2 100644
--- a/src/md5sum.c
+++ b/src/md5sum.c
@@ -631,17 +631,15 @@ digest_file (char const *filename, int *binary, unsigned char *bin_result,
#else
err = DIGEST_STREAM (fp, bin_result);
#endif
- if (err)
- {
- error (0, errno, "%s", quotef (filename));
- if (fp != stdin)
- fclose (fp);
- return false;
- }
+ err = err ? errno : 0;
+ if (is_stdin)
+ clearerr (fp);
+ else if (fclose (fp) != 0 && !err)
+ err = errno;
- if (!is_stdin && fclose (fp) != 0)
+ if (err)
{
- error (0, errno, "%s", quotef (filename));
+ error (0, err, "%s", quotef (filename));
return false;
}
@@ -798,15 +796,16 @@ digest_check (char const *checkfile_name)
free (line);
- if (ferror (checkfile_stream))
- {
- error (0, 0, _("%s: read error"), quotef (checkfile_name));
- return false;
- }
+ int err = ferror (checkfile_stream) ? 0 : -1;
+ if (is_stdin)
+ clearerr (checkfile_stream);
+ else if (fclose (checkfile_stream) != 0 && err < 0)
+ err = errno;
- if (!is_stdin && fclose (checkfile_stream) != 0)
+ if (0 <= err)
{
- error (0, errno, "%s", quotef (checkfile_name));
+ error (0, err, err ? "%s" : _("%s: read error"),
+ quotef (checkfile_name));
return false;
}
diff --git a/src/nl.c b/src/nl.c
index f3ba46c9b..7a13bcb97 100644
--- a/src/nl.c
+++ b/src/nl.c
@@ -457,16 +457,16 @@ nl_file (char const *file)
process_file (stream);
- if (ferror (stream))
- {
- error (0, errno, "%s", quotef (file));
- return false;
- }
+ int err = errno;
+ if (!ferror (stream))
+ err = 0;
if (STREQ (file, "-"))
clearerr (stream); /* Also clear EOF. */
- else if (fclose (stream) == EOF)
+ else if (fclose (stream) != 0 && !err)
+ err = errno;
+ if (err)
{
- error (0, errno, "%s", quotef (file));
+ error (0, err, "%s", quotef (file));
return false;
}
return true;
diff --git a/src/od.c b/src/od.c
index f04e0ccb7..111c94935 100644
--- a/src/od.c
+++ b/src/od.c
@@ -949,16 +949,15 @@ check_and_close (int in_errno)
if (in_stream != NULL)
{
- if (ferror (in_stream))
+ if (!ferror (in_stream))
+ in_errno = 0;
+ if (STREQ (file_list[-1], "-"))
+ clearerr (in_stream);
+ else if (fclose (in_stream) != 0 && !in_errno)
+ in_errno = errno;
+ if (in_errno)
{
- error (0, in_errno, _("%s: read error"), quotef (input_filename));
- if (! STREQ (file_list[-1], "-"))
- fclose (in_stream);
- ok = false;
- }
- else if (! STREQ (file_list[-1], "-") && fclose (in_stream) != 0)
- {
- error (0, errno, "%s", quotef (input_filename));
+ error (0, in_errno, "%s", quotef (input_filename));
ok = false;
}
diff --git a/src/paste.c b/src/paste.c
index 48229acc5..f43fb56c2 100644
--- a/src/paste.c
+++ b/src/paste.c
@@ -266,16 +266,15 @@ paste_parallel (size_t nfiles, char **fnamptr)
If an EOF or error, close the file. */
if (fileptr[i])
{
- if (ferror (fileptr[i]))
- {
- error (0, err, "%s", quotef (fnamptr[i]));
- ok = false;
- }
+ if (!ferror (fileptr[i]))
+ err = 0;
if (fileptr[i] == stdin)
clearerr (fileptr[i]); /* Also clear EOF. */
- else if (fclose (fileptr[i]) == EOF)
+ else if (fclose (fileptr[i]) == EOF && !err)
+ err = errno;
+ if (err)
{
- error (0, errno, "%s", quotef (fnamptr[i]));
+ error (0, err, "%s", quotef (fnamptr[i]));
ok = false;
}
@@ -410,16 +409,15 @@ paste_serial (size_t nfiles, char **fnamptr)
if (charold != line_delim)
xputchar (line_delim);
- if (ferror (fileptr))
- {
- error (0, saved_errno, "%s", quotef (*fnamptr));
- ok = false;
- }
+ if (!ferror (fileptr))
+ saved_errno = 0;
if (is_stdin)
clearerr (fileptr); /* Also clear EOF. */
- else if (fclose (fileptr) == EOF)
+ else if (fclose (fileptr) != 0 && !saved_errno)
+ saved_errno = errno;
+ if (saved_errno)
{
- error (0, errno, "%s", quotef (*fnamptr));
+ error (0, saved_errno, "%s", quotef (*fnamptr));
ok = false;
}
}
diff --git a/src/pr.c b/src/pr.c
index da5795554..8f84d0f59 100644
--- a/src/pr.c
+++ b/src/pr.c
@@ -1506,10 +1506,16 @@ close_file (COLUMN *p)
if (p->status == CLOSED)
return;
- if (ferror (p->fp))
- die (EXIT_FAILURE, errno, "%s", quotef (p->name));
- if (fileno (p->fp) != STDIN_FILENO && fclose (p->fp) != 0)
- die (EXIT_FAILURE, errno, "%s", quotef (p->name));
+
+ int err = errno;
+ if (!ferror (p->fp))
+ err = 0;
+ if (fileno (p->fp) == STDIN_FILENO)
+ clearerr (p->fp);
+ else if (fclose (p->fp) != 0 && !err)
+ err = errno;
+ if (err)
+ die (EXIT_FAILURE, err, "%s", quotef (p->name));
if (!parallel_files)
{
diff --git a/src/ptx.c b/src/ptx.c
index 85c26aa1d..43075c840 100644
--- a/src/ptx.c
+++ b/src/ptx.c
@@ -526,6 +526,9 @@ swallow_file_in_memory (char const *file_name, BLOCK *block)
if (!block->start)
die (EXIT_FAILURE, errno, "%s", quotef (using_stdin ? "-" : file_name));
+ if (using_stdin)
+ clearerr (stdin);
+
block->end = block->start + used_length;
}
diff --git a/src/sort.c b/src/sort.c
index cba809c33..5f4c817de 100644
--- a/src/sort.c
+++ b/src/sort.c
@@ -1001,8 +1001,7 @@ xfclose (FILE *fp, char const *file)
{
case STDIN_FILENO:
/* Allow reading stdin from tty more than once. */
- if (feof (fp))
- clearerr (fp);
+ clearerr (fp);
break;
case STDOUT_FILENO:
diff --git a/src/sum.c b/src/sum.c
index f9641dbb1..c17af3f6b 100644
--- a/src/sum.c
+++ b/src/sum.c
@@ -120,17 +120,16 @@ bsd_sum_file (char const *file, int print_name)
checksum &= 0xffff; /* Keep it within bounds. */
}
- if (ferror (fp))
- {
- error (0, errno, "%s", quotef (file));
- if (!is_stdin)
- fclose (fp);
- return false;
- }
-
- if (!is_stdin && fclose (fp) != 0)
+ int err = errno;
+ if (!ferror (fp))
+ err = 0;
+ if (is_stdin)
+ clearerr (fp);
+ else if (fclose (fp) != 0 && !err)
+ err = errno;
+ if (err)
{
- error (0, errno, "%s", quotef (file));
+ error (0, err, "%s", quotef (file));
return false;
}
--
2.31.1
From 6e93074899332234d23a1698dc7f4168a793131d Mon Sep 17 00:00:00 2001
From: Paul Eggert <egg...@cs.ucla.edu>
Date: Sun, 22 Aug 2021 11:56:54 -0700
Subject: [PATCH 3/4] build: update gnulib submodule to latest
---
gnulib | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/gnulib b/gnulib
index 53701517c..4ea0e64a8 160000
--- a/gnulib
+++ b/gnulib
@@ -1 +1 @@
-Subproject commit 53701517c6517e5161e4533e44c33b5e4db17314
+Subproject commit 4ea0e64a8db7064427f6aa5624a4efd4b41db132
--
2.31.1
From 0b2148bdd0d93f16bef64d899a21487f33751445 Mon Sep 17 00:00:00 2001
From: Paul Eggert <egg...@cs.ucla.edu>
Date: Sun, 22 Aug 2021 12:42:20 -0700
Subject: [PATCH 4/4] df: pacify -Wsuggest-attribute=malloc
Problem found with latest Gnulib and GCC 11.2.1.
* src/find-mount-point.h (find_mount_point):
Add _GL_ATTRIBUTE_MALLOC and _GL_ATTRIBUTE_DEALLOC_FREE.
---
src/find-mount-point.h | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/src/find-mount-point.h b/src/find-mount-point.h
index 028b2500c..a1bbcdc92 100644
--- a/src/find-mount-point.h
+++ b/src/find-mount-point.h
@@ -14,4 +14,7 @@
You should have received a copy of the GNU General Public License
along with this program. If not, see <https://www.gnu.org/licenses/>. */
-extern char *find_mount_point (char const *, struct stat const *);
+#include <stdlib.h>
+
+extern char *find_mount_point (char const *, struct stat const *)
+ _GL_ATTRIBUTE_MALLOC _GL_ATTRIBUTE_DEALLOC_FREE;
--
2.31.1