Thanks for pointing out the seeming inconsistency. The documentation mentions
the issue but is perhaps not clear enough, so I installed the attached patch.
The input file contains NUL bytes and so is treated as binary data, and the grep
documentation (section "File and Directory Selection", option "--binary-files")
says "When processing binary data, ‘grep’ may treat non-text bytes as line
terminators". This behavior was added in GNU grep 2.21, released in 2014,
partly for performance reasons.
There are two instances in riddles.he of a space followed by a NUL byte, so
grep -P '[ \t]\r?$' riddles.he
finds a match when the $ matches just before the NUL byte.
-a is one way to get the behavior you evidently expected. Another (perhaps
better) way is -z. The command:
grep -zP '[ \t]\r?\n' riddles.he
outputs nothing and exits with status 1.
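To try this without riddles.he, here is a minimal sketch that assumes GNU grep.
The sample file and its contents are made up for illustration: a 'q', a space,
a NUL byte, an 'x', and a newline. The variable names are likewise
hypothetical. The -P call additionally assumes grep was built with PCRE
support, so its errors are suppressed.

```shell
# Hypothetical sample: "q", space, NUL byte, "x", newline.
sample=$(mktemp)
printf 'q \0x\n' > "$sample"

# -a treats the file as text: the only line is "q <NUL>x", which does not
# end in a space, so nothing matches. "|| true" absorbs grep's exit status 1.
text_count=$(grep -a -c ' $' "$sample" || true)

# Default --binary-files=binary: grep may treat the NUL byte as a line
# terminator, in which case $ can match right after the space.
binary_count=$(grep -c ' $' "$sample" || true)

# -z makes NUL the line terminator, and an explicit \n (needs -P) only
# matches a real newline, so a space followed by NUL is not reported.
z_count=$(grep -zPc ' \n' "$sample" 2>/dev/null || true)

echo "text (-a) matching lines:  $text_count"
echo "binary (default) matching: $binary_count"
echo "-zP matching records:      $z_count"

rm -f "$sample"
```

The -a count is zero. The default-mode count depends on the grep version and
on whether it treats the NUL byte as a line terminator for this input, so it
is printed rather than assumed.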
From 7cfd9d20773e1a67cb085a14206fd33274c64387 Mon Sep 17 00:00:00 2001
From: Paul Eggert <egg...@cs.ucla.edu>
Date: Tue, 5 Apr 2016 23:53:30 -0700
Subject: [PATCH] Give another example of binary file processing
Problem reported by Shlomi Fish
* doc/grep.texi (File and Directory Selection):
Document that 'q$' might match 'q' followed by a NUL
if --binary-files=binary is in effect.
---
doc/grep.texi | 14 ++++++++++----
1 file changed, 10 insertions(+), 4 deletions(-)
diff --git a/doc/grep.texi b/doc/grep.texi
index 074113b..1d3d5cb 100644
--- a/doc/grep.texi
+++ b/doc/grep.texi
@@ -607,10 +607,6 @@ By default, @var{type} is @samp{binary}, and when @command{grep}
discovers that a file is binary it suppresses any further output, and
instead outputs either a one-line message saying that a binary file
matches, or no message if there is no match.
-When processing binary data, @command{grep} may treat non-text bytes
-as line terminators; for example, the pattern @samp{.} (period) might
-not match a null byte, as the null byte might be treated as a line
-terminator even without the @option{-z} (@option{--null-data}) option.
If @var{type} is @samp{without-match},
when @command{grep} discovers that a file is binary
@@ -621,6 +617,16 @@ If @var{type} is @samp{text},
@command{grep} processes a binary file as if it were text;
this is equivalent to the @option{-a} option.
+When @var{type} is @samp{binary}, @command{grep} may treat non-text
+bytes as line terminators even without the @option{-z}
+(@option{--null-data}) option. This means choosing @samp{binary}
+versus @samp{text} can affect whether a pattern matches a file. For
+example, when @var{type} is @samp{binary} the pattern @samp{q$} might
+match @samp{q} immediately followed by a null byte, even though this
+is not matched when @var{type} is @samp{text}. Conversely, when
+@var{type} is @samp{binary} the pattern @samp{.} (period) might not
+match a null byte.
+
@emph{Warning:} @samp{--binary-files=text} might output binary garbage,
which can have nasty side effects
if the output is a terminal and
--
2.5.5