On Sat, Jan 21, 2017 at 1:14 AM, Norihiro Tanaka <nori...@kcn.ne.jp> wrote:
> grep -Fo may not match the longest pattern in grep 2.26 or later, including
> the current master.
>
> $ printf 'abce\n' > in
> $ printf 'abcd\nc\nbce\n' > pat
> $ LC_ALL=C src/grep -Fof pat in
> c
>
> We expect "bce" in this case.

Nice. I am glad you caught that.
I've adjusted some wording and will push this soon:

From b0cdf48d416b2cbb028a1b65c758035ba7c8a2aa Mon Sep 17 00:00:00 2001
From: Norihiro Tanaka <nori...@kcn.ne.jp>
Date: Sat, 21 Jan 2017 18:01:53 +0900
Subject: [PATCH] grep -Fo could report a match that is not the longest

* src/kwset.c (acexec_trans): Fix it.
* tests/fgrep-longest: New test.
* tests/Makefile.am: Add the test.
* NEWS: Mention it.
---
 NEWS                |  4 ++++
 src/kwset.c         | 17 ++++++++++++++---
 tests/fgrep-longest | 23 +++++++++++++++++++++++
 3 files changed, 41 insertions(+), 3 deletions(-)
 create mode 100755 tests/fgrep-longest

diff --git a/NEWS b/NEWS
index 3529f4e..773a8ed 100644
--- a/NEWS
+++ b/NEWS
@@ -4,6 +4,10 @@ GNU grep NEWS                                    -*- outline -*-

 ** Bug fixes

+  When grep -Fo finds matches of differing length, it could
+  mistakenly print a shorter one.  Now it prints a longest one.
+  [bug introduced in grep-2.26]
+
   When standard output is /dev/null, grep no longer fails when
   standard input is a file in the Linux /proc file system, or when
   standard input is a pipe and standard output is in append mode.
diff --git a/src/kwset.c b/src/kwset.c
index 39a1e15..258cff5 100644
--- a/src/kwset.c
+++ b/src/kwset.c
@@ -848,9 +848,20 @@ acexec_trans (kwset_t kwset, char const *text, ptrdiff_t len,
           struct trie const *accept1;
           char const *left1;
           unsigned char c = tr (trans, *tp++);
-          tree = trie->links;
-          while (tree && c != tree->label)
-            tree = c < tree->label ? tree->llink : tree->rlink;
+          while (true)
+            {
+              tree = trie->links;
+              while (tree && c != tree->label)
+                tree = c < tree->label ? tree->llink : tree->rlink;
+              if (tree)
+                break;
+              trie = trie->fail;
+              if (!trie)
+                break;
+              left1 = tp - trie->depth;
+              if (left1 > left)
+                break;
+            }
           if (!tree)
             break;
           trie = tree->trie;
diff --git a/tests/fgrep-longest b/tests/fgrep-longest
new file mode 100755
index 0000000..5974d11
--- /dev/null
+++ b/tests/fgrep-longest
@@ -0,0 +1,23 @@
+#! /bin/sh
+# With multiple matches, grep -Fo could print a shorter one.
+# This bug affected grep versions 2.26 through 2.27.
+#
+# Copyright (C) 2017 Free Software Foundation, Inc.
+#
+# Copying and distribution of this file, with or without modification,
+# are permitted in any medium without royalty provided the copyright
+# notice and this notice are preserved.
+
+. "${srcdir=.}/init.sh"; path_prepend_ ../src
+
+fail=0
+
+# The erroneous versions would print "c", rather than the longer match, "bce".
+printf 'abce\n' > in || framework_failure_
+printf 'abcd\nc\nbce\n' > pat || framework_failure_
+printf 'bce\n' > exp || framework_failure_
+
+LC_ALL=C grep -Fof pat in > out || fail=1
+compare exp out || fail=1
+
+Exit $fail
-- 
2.9.3
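
For anyone who wants to cross-check the expected output by hand, here is a
minimal brute-force sketch of the leftmost-longest rule that the new test
relies on. It is not the Aho-Corasick code in src/kwset.c, and the name
leftmost_longest is hypothetical (it does not exist in grep). It assumes
NUL-terminated strings and ignores empty patterns.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of grep: brute-force search for the
   leftmost match of any pattern in TEXT, preferring the longest
   pattern among those that match at that leftmost position.
   Return the match offset and store its length in *LEN, or return
   -1 if no nonempty pattern occurs in TEXT.  */
static ptrdiff_t
leftmost_longest (char const *text, char const *const *pats, size_t npats,
                  size_t *len)
{
  size_t tlen = strlen (text);
  for (size_t start = 0; start < tlen; start++)
    {
      /* Length of the longest pattern matching at START, or 0.  */
      size_t best = 0;
      for (size_t i = 0; i < npats; i++)
        {
          size_t plen = strlen (pats[i]);
          if (plen > best && plen <= tlen - start
              && memcmp (text + start, pats[i], plen) == 0)
            best = plen;
        }
      if (best)
        {
          *len = best;
          return start;
        }
    }
  return -1;
}
```

With the text "abce" and the patterns "abcd", "c" and "bce", the only match
at offset 1 is "bce", and nothing matches at offset 0, so the sketch reports
offset 1 and length 3. That agrees with the "bce" the test expects, whereas
"c" starts one byte later.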
