Paul Eggert wrote:
> On 03/01/2016 02:05 AM, Marcello Perathoner wrote:
>> 2) If you just output
>>
>>   binary line 42 in file x matches
>>
>> and continue regular output after the next newline, the breakage would be much
>> more confined.
>
> This sounds like a good suggestion. That is, grep could keep going if its only
> problem is an attempt to output encoding errors (as opposed to reading null
> bytes, which are a more-reliable indication of binary data). It would probably
> be better to output just one "Binary file matches" line per file, at the end of
> the other matches, so that it's more likely to be noticed.

I finally got around to implementing this, which turned out to be considerably
easier than I thought it would be. I installed the attached patch into the grep
Savannah master. I am boldly closing this old bug report; we can always start a
new report if further problems turn up.
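
For anyone who wants the gist without reading the diff, here is a rough standalone sketch of the new behavior. It is only an illustration under stated assumptions, not the code in src/grep.c: the names line_has_encoding_errors and emit_matches are hypothetical, the UTF-8 check is a simplified stand-in for buf_has_encoding_errors (real grep consults the current locale rather than hard-coding UTF-8), and every input line is treated as a match.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical, simplified stand-in for grep's buf_has_encoding_errors:
   report whether BUF[0..LEN) fails a structural UTF-8 check.  It does not
   reject every ill-formed sequence (for example overlong 3-byte forms).  */
bool
line_has_encoding_errors (const unsigned char *buf, size_t len)
{
  size_t i = 0;
  while (i < len)
    {
      unsigned char c = buf[i];
      size_t extra;              /* continuation bytes expected after C */
      if (c < 0x80)
        extra = 0;
      else if (0xC2 <= c && c <= 0xDF)
        extra = 1;
      else if (0xE0 <= c && c <= 0xEF)
        extra = 2;
      else if (0xF0 <= c && c <= 0xF4)
        extra = 3;
      else
        return true;             /* stray continuation or invalid lead byte */
      if (len - i - 1 < extra)
        return true;             /* sequence truncated at end of line */
      for (size_t k = 1; k <= extra; k++)
        if ((buf[i + k] & 0xC0) != 0x80)
          return true;
      i += extra + 1;
    }
  return false;
}

/* Treat every line of IN[0..LEN) as a match.  Write the cleanly encoded
   lines to OUT, skip the others, and finish with one notice if anything
   was skipped.  Return the number of suppressed lines.  */
size_t
emit_matches (const char *name, const char *in, size_t len, FILE *out)
{
  assert (name && in && out);
  size_t suppressed = 0;
  size_t start = 0;
  while (start < len)
    {
      const char *nl = memchr (in + start, '\n', len - start);
      size_t end = nl ? (size_t) (nl - in) + 1 : len;   /* one past the line */
      size_t body = end - start - (nl ? 1 : 0);          /* exclude newline */
      if (line_has_encoding_errors ((const unsigned char *) in + start, body))
        suppressed++;            /* drop only this line, then keep going */
      else
        {
          fwrite (in + start, 1, end - start, out);
          if (!nl)
            fputc ('\n', out);   /* terminate an unterminated last line */
        }
      start = end;
    }
  if (suppressed)
    fprintf (out, "Binary file %s matches\n", name);
  return suppressed;
}
```

Calling emit_matches ("in", "a\n\x80\nj\n", 6, stdout) writes the lines "a" and "j" followed by "Binary file in matches", which mirrors the shape of the output that the new check in tests/encoding-error looks for. Before the patch, the equivalent of the suppressed++ branch also turned off all later output for the file.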
From 0f1fb0747fdac7043124df4cead5c845bd64fd77 Mon Sep 17 00:00:00 2001
From: Paul Eggert <egg...@cs.ucla.edu>
Date: Thu, 8 Sep 2016 18:33:14 -0700
Subject: [PATCH] grep: encoding errors suppress just their line

From a suggestion by Marcello Perathoner (Bug#22838).
* NEWS, doc/grep.texi (File and Directory Selection): Document this.
* src/grep.c (print_line_head): Do not suppress later output lines
merely because an earlier output line would have had an encoding error.
* tests/encoding-error: Test for the new behavior.
---
 NEWS                 |  5 +++++
 doc/grep.texi        | 13 +++++++------
 src/grep.c           | 13 ++++++-------
 tests/encoding-error |  4 ++++
 4 files changed, 22 insertions(+), 13 deletions(-)

diff --git a/NEWS b/NEWS
index 01be350..a63a7b2 100644
--- a/NEWS
+++ b/NEWS
@@ -2,6 +2,11 @@ GNU grep NEWS -*- outline -*-
* Noteworthy changes in release ?.? (????-??-??) [?]
+** Bug fixes
+
+ Grep no longer omits output merely because it follows an output line
+ suppressed due to encoding errors. [bug introduced in grep-2.21]
+
** Improvements
grep can be much faster now when standard output is /dev/null.
diff --git a/doc/grep.texi b/doc/grep.texi
index 7e51d45..fcfad42 100644
--- a/doc/grep.texi
+++ b/doc/grep.texi
@@ -610,18 +610,19 @@ Variables}), or null input bytes when the
@option{-z} (@option{--null-data}) option is not given (@pxref{Other
Options}).
-By default, @var{type} is @samp{binary}, and when @command{grep}
-discovers that a file is binary it suppresses any further output, and
-instead outputs either a one-line message saying that a binary file
-matches, or no message if there is no match.
+By default, @var{type} is @samp{binary}, and @command{grep}
+suppresses output after null input binary data is discovered,
+and suppresses output lines that contain improperly encoded data.
+When some output is suppressed, @command{grep} follows any output
+with a one-line message saying that a binary file matches.
If @var{type} is @samp{without-match},
-when @command{grep} discovers that a file is binary
+when @command{grep} discovers null input binary data
it assumes that the rest of the file does not match;
this is equivalent to the @option{-I} option.
If @var{type} is @samp{text},
-@command{grep} processes a binary file as if it were text;
+@command{grep} processes binary data as if it were text;
this is equivalent to the @option{-a} option.
When @var{type} is @samp{binary}, @command{grep} may treat non-text
diff --git a/src/grep.c b/src/grep.c
index d07f5da..65916ca 100644
--- a/src/grep.c
+++ b/src/grep.c
@@ -1108,17 +1108,16 @@ print_offset (uintmax_t pos, int min_width, const char *color)
 static bool
 print_line_head (char *beg, size_t len, char const *lim, char sep)
 {
-  bool encoding_errors = false;
   if (binary_files != TEXT_BINARY_FILES)
     {
       char ch = beg[len];
-      encoding_errors = buf_has_encoding_errors (beg, len);
+      bool encoding_errors = buf_has_encoding_errors (beg, len);
       beg[len] = ch;
-    }
-  if (encoding_errors)
-    {
-      encoding_error_output = done_on_match = out_quiet = true;
-      return false;
+      if (encoding_errors)
+        {
+          encoding_error_output = true;
+          return false;
+        }
     }
 
   bool pending_sep = false;
diff --git a/tests/encoding-error b/tests/encoding-error
index 4b5fcb5..0cbeffc 100755
--- a/tests/encoding-error
+++ b/tests/encoding-error
@@ -35,6 +35,10 @@ grep '^X' in >out
test $? = 1 || fail=1
compare /dev/null out || fail=1
+grep . in >out || fail=1
+(cat a j && printf 'Binary file in matches\n') >exp || framework_failure_
+compare exp out || fail=1
+
grep -a . in >out || fail=1
compare in out
--
2.7.4