Jim Meyering wrote:
> The -Pz/PCRE problem is more fundamental, and strikes
> even with LC_ALL=C.

grep should report an error when this problem occurs, rather than silently giving incorrect answers. I installed the attached patch in a bold attempt to implement this; please feel free to revert it if you think it goes too far.
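
For readers who want to try the idea outside grep, the check can be sketched as a standalone C function. This is only an illustration under stated assumptions, not the code in the patch: the name has_unescaped_anchor is hypothetical, the pattern is assumed to be NUL-terminated, and a backslash is assumed to escape exactly the one byte that follows it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper, not part of grep: return true if the
   NUL-terminated PATTERN contains a '^' or '$' that is not
   immediately preceded by an escaping backslash.  */
bool
has_unescaped_anchor (char const *pattern)
{
  assert (pattern != NULL);

  bool escaped = false;
  for (char const *p = pattern; *p; p++)
    {
      if (escaped)
        escaped = false;        /* This byte was escaped; skip it.  */
      else if (*p == '\\')
        escaped = true;         /* The next byte is escaped.  */
      else if (*p == '^' || *p == '$')
        return true;            /* Found an unescaped anchor byte.  */
    }
  return false;
}
```

A caller could reject the pattern whenever this returns true and the -z option is in effect. Because the scan does not track bracket expressions, a pattern such as [^a] is also flagged, which is consistent with the NEWS entry saying that the escape is now needed even inside [...].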
From b824e32f5754b93db25b779d2484472120c45ac2 Mon Sep 17 00:00:00 2001
From: Paul Eggert <egg...@cs.ucla.edu>
Date: Sun, 21 Feb 2016 11:38:00 -0800
Subject: [PATCH] grep: -Pz is incompatible with ^ and $

Problem reported by Sergei Trofimovich in: http://bugs.gnu.org/22655
* NEWS: Document this.
* src/pcresearch.c (Pcompile): Warn with -Pz and anchors.
* tests/pcre: Test new behavior.
---
 NEWS             |  7 +++++++
 src/pcresearch.c | 14 ++++++++++++++
 tests/pcre       |  6 ++----
 3 files changed, 23 insertions(+), 4 deletions(-)

diff --git a/NEWS b/NEWS
index ae238be..990118e 100644
--- a/NEWS
+++ b/NEWS
@@ -15,6 +15,13 @@ GNU grep NEWS                                    -*- outline -*-
   there is no match, as expected.
   [bug introduced in grep-2.7]
 
+  grep -Pz now diagnoses attempts to use patterns containing ^ and $,
+  instead of mishandling these patterns.  This problem seems to be
+  inherent to the PCRE API; removing this limitation is on PCRE's
+  maint/README wish list.  Patterns can continue to match literal ^
+  and $ by escaping them with \ (now needed even inside [...]).
+  [bug introduced in grep-2.5]
+
 
 * Noteworthy changes in release 2.23 (2016-02-04) [stable]
 
diff --git a/src/pcresearch.c b/src/pcresearch.c
index 3fee67a..3b8e795 100644
--- a/src/pcresearch.c
+++ b/src/pcresearch.c
@@ -124,6 +124,20 @@ Pcompile (char const *pattern, size_t size)
   /* FIXME: Remove these restrictions.  */
   if (memchr (pattern, '\n', size))
     error (EXIT_TROUBLE, 0, _("the -P option only supports a single pattern"));
+  if (! eolbyte)
+    {
+      bool escaped = false;
+      for (p = pattern; *p; p++)
+        if (escaped)
+          escaped = false;
+        else
+          {
+            escaped = *p == '\\';
+            if (*p == '^' || *p == '$')
+              error (EXIT_TROUBLE, 0,
+                     _("unescaped ^ or $ not supported with -Pz"));
+          }
+    }
 
   *n = '\0';
   if (match_words)
diff --git a/tests/pcre b/tests/pcre
index 92e788e..b8b4662 100755
--- a/tests/pcre
+++ b/tests/pcre
@@ -13,9 +13,7 @@ require_pcre_
 fail=0
 
 echo | grep -P '\s*$' || fail=1
-echo | grep -zP '\s$' || fail=1
-
-echo '.ab' | grep -Pwx ab
-test $? -eq 1 || fail=1
+echo | returns_ 2 grep -zP '\s$' || fail=1
+echo '.ab' | returns_ 1 grep -Pwx ab || fail=1
 
 Exit $fail
-- 
2.5.0
