On 2/23/22 23:11, Ulrich Eckhardt wrote:
> Just for my understanding, grep stops reading
> when it finds the first match and then the shell closes the output
> stream of cat. That in turn causes cat to fail (exit code 141, meaning
> SIGPIPE), because it can't write the rest of the data that it wants,
> right?
Right.
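
For anyone who wants to see that sequence directly, here is a minimal sketch. It assumes bash and the coreutils 'yes' program are installed; 'yes' stands in for 'cat' on a large file because it keeps writing until the pipe breaks, which makes the outcome repeatable. The variable names are only illustrative. The writer's status is typically 141 (128 + 13, SIGPIPE), though it can be a different nonzero value if SIGPIPE is ignored in the calling environment.

```shell
# Sketch only: 'yes' is a stand-in for a writer with more data than grep reads.
# Child bash processes are used so the caller's shell options are untouched.

# Status of the writer (first pipeline element) after grep -q exits early.
writer_status=$(bash -c 'yes "a line containing X" | grep -q X; echo "${PIPESTATUS[0]}"')

# Without pipefail, the pipeline reports only grep's status (0: match found).
plain_status=0
bash -c 'yes "a line containing X" | grep -q X' || plain_status=$?

# With pipefail, the writer's failure becomes the pipeline's status.
pipefail_status=0
bash -c 'set -o pipefail; yes "a line containing X" | grep -q X' || pipefail_status=$?

echo "writer=$writer_status plain=$plain_status pipefail=$pipefail_status"
```

With a small input file and 'cat' the result depends on timing, since 'cat' may finish writing before grep exits.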
> I think that short reads (which could cause SIGPIPE) and the
> non-error exit code 1 deserve mention there. I'll take a look and
> perhaps file another patch.
I installed the attached to try to document this better.
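
As a rough illustration of the 'set -e' issue with exit status 1, the sketch below runs two variants in child shells so that 'set -e' cannot terminate the calling script. The second variant is one possible guard, not the only one: it accepts status 1 (no lines selected) while still failing on status 2 (a real error). The variable names and sample input are made up for the example.

```shell
# Sketch only: compare an unguarded grep under set -e with a guarded one.

# Unguarded: grep selects no lines, exits 1, and set -e ends the child shell
# before 'echo reached' runs.
unguarded_status=0
unguarded_out=$(sh -c 'set -e; printf "a\nb\n" | grep -q X; echo reached') || unguarded_status=$?

# Guarded: treat status 1 as success; any other nonzero status still fails.
guarded_status=0
guarded_out=$(sh -c 'set -e; printf "a\nb\n" | grep -q X || [ $? -eq 1 ]; echo reached') || guarded_status=$?

echo "unguarded: status=$unguarded_status output='$unguarded_out'"
echo "guarded:   status=$guarded_status output='$guarded_out'"
```

Putting grep in an 'if' condition has a similar effect, since 'set -e' does not apply to commands tested by 'if', but that form also swallows status 2.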
From 89de21bd2525088cf5ce12395515ca1d8ef582a2 Mon Sep 17 00:00:00 2001
From: Paul Eggert <egg...@cs.ucla.edu>
Date: Thu, 24 Feb 2022 11:07:14 -0800
Subject: [PATCH] doc: mention issues with set -e
* doc/grep.texi (Usage, Performance): Mention early exits (Bug#54035).
---
doc/grep.texi | 24 ++++++++++++++++++++++++
1 file changed, 24 insertions(+)
diff --git a/doc/grep.texi b/doc/grep.texi
index 37ef839..ebbefda 100644
--- a/doc/grep.texi
+++ b/doc/grep.texi
@@ -1862,6 +1862,22 @@ Use the special file name @samp{-}:
cat /etc/passwd | grep 'alain' - /etc/motd
@end example
+@item
+Why can't I combine the shell's @samp{set -e} with @command{grep}?
+
+The @command{grep} command follows the convention of programs like
+@command{cmp} and @command{diff} where an exit status of 1 is not an
+error. The shell command @samp{set -e} causes the shell to exit if
+any subcommand exits with nonzero status, and this will cause the
+shell to exit merely because @command{grep} selected no lines,
+which is ordinarily not what you want.
+
+There is a related problem with Bash's @command{set -e -o pipefail}.
+Since @command{grep} does not always read all its input, a command
+outputting to a pipe read by @command{grep} can fail when
+@command{grep} exits before reading all its input, and the command's
+failure can cause Bash to exit.
+
@item
Why is this back-reference failing?
@@ -1998,6 +2014,14 @@ needing to read the zeros. This optimization is not available if the
Directory Selection}), unless the @option{-z} (@option{--null-data})
option is also used (@pxref{Other Options}).
+@cindex pipelines and reading
+For efficiency @command{grep} does not always read all its input.
+For example, the shell command @samp{sed '/^...$/d' | grep -q X} can
+cause @command{grep} to exit immediately after reading a line
+containing @samp{X}, without bothering to read the rest of its input data.
+This in turn can cause @command{sed} to exit with a nonzero status because
+@command{sed} cannot write to its output pipe after @command{grep} exits.
+
For more about the algorithms used by @command{grep} and about
related string matching algorithms, see:
--
2.32.0