Jim Meyering wrote:
The -Pz/PCRE problem is more fundamental, and strikes even with LC_ALL=C.
grep should report an error when this problem occurs, rather than silently giving incorrect answers. I installed the attached patch in a bold attempt to implement this; please feel free to revert it if you think it goes too far.
From b824e32f5754b93db25b779d2484472120c45ac2 Mon Sep 17 00:00:00 2001
From: Paul Eggert <egg...@cs.ucla.edu>
Date: Sun, 21 Feb 2016 11:38:00 -0800
Subject: [PATCH] grep: -Pz is incompatible with ^ and $

Problem reported by Sergei Trofimovich in:
http://bugs.gnu.org/22655
* NEWS: Document this.
* src/pcresearch.c (Pcompile): Warn with -Pz and anchors.
* tests/pcre: Test new behavior.
---
 NEWS             |  7 +++++++
 src/pcresearch.c | 14 ++++++++++++++
 tests/pcre       |  6 ++----
 3 files changed, 23 insertions(+), 4 deletions(-)

diff --git a/NEWS b/NEWS
index ae238be..990118e 100644
--- a/NEWS
+++ b/NEWS
@@ -15,6 +15,13 @@ GNU grep NEWS -*- outline -*-
   there is no match, as expected.
   [bug introduced in grep-2.7]
 
+  grep -Pz now diagnoses attempts to use patterns containing ^ and $,
+  instead of mishandling these patterns.  This problem seems to be
+  inherent to the PCRE API; removing this limitation is on PCRE's
+  maint/README wish list.  Patterns can continue to match literal ^
+  and $ by escaping them with \ (now needed even inside [...]).
+  [bug introduced in grep-2.5]
+
 
 * Noteworthy changes in release 2.23 (2016-02-04) [stable]
 
diff --git a/src/pcresearch.c b/src/pcresearch.c
index 3fee67a..3b8e795 100644
--- a/src/pcresearch.c
+++ b/src/pcresearch.c
@@ -124,6 +124,20 @@ Pcompile (char const *pattern, size_t size)
   /* FIXME: Remove these restrictions.  */
   if (memchr (pattern, '\n', size))
     error (EXIT_TROUBLE, 0, _("the -P option only supports a single pattern"));
+  if (! eolbyte)
+    {
+      bool escaped = false;
+      for (p = pattern; *p; p++)
+        if (escaped)
+          escaped = false;
+        else
+          {
+            escaped = *p == '\\';
+            if (*p == '^' || *p == '$')
+              error (EXIT_TROUBLE, 0,
+                     _("unescaped ^ or $ not supported with -Pz"));
+          }
+    }
 
   *n = '\0';
   if (match_words)
diff --git a/tests/pcre b/tests/pcre
index 92e788e..b8b4662 100755
--- a/tests/pcre
+++ b/tests/pcre
@@ -13,9 +13,7 @@ require_pcre_
 fail=0
 
 echo | grep -P '\s*$' || fail=1
-echo | grep -zP '\s$' || fail=1
-
-echo '.ab' | grep -Pwx ab
-test $? -eq 1 || fail=1
+echo | returns_ 2 grep -zP '\s$' || fail=1
+echo '.ab' | returns_ 1 grep -Pwx ab || fail=1
 
 Exit $fail
-- 
2.5.0
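
For anyone who wants to experiment with the escape-tracking scan outside of grep, here is a minimal standalone sketch of the same loop. The function name has_unescaped_anchor is hypothetical and not part of grep; it assumes a NUL-terminated pattern and returns a flag where the patch calls error. Like the patch, it makes no attempt to parse bracket expressions, which is why ^ and $ must be escaped even inside [...].

```c
#include <stdbool.h>

/* Hypothetical helper, not part of grep.  Return true if the
   NUL-terminated PATTERN contains a ^ or $ that is not immediately
   preceded by an unescaped backslash.  This mirrors the loop in the
   patch, but reports the result instead of exiting.  */
bool
has_unescaped_anchor (char const *pattern)
{
  bool escaped = false;
  for (char const *p = pattern; *p; p++)
    if (escaped)
      /* This byte was escaped by the previous backslash; skip it.  */
      escaped = false;
    else
      {
        /* A backslash escapes the next byte only.  */
        escaped = *p == '\\';
        if (*p == '^' || *p == '$')
          return true;
      }
  return false;
}
```

With this sketch, the pattern \s$ from tests/pcre is flagged, a\$b is not, and [^x] is flagged even though the ^ there is a negation rather than an anchor.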