Example,

  a+a+a
  1 2 3

position 1 has a repetition of "a" and other transition with "a".
position 2 has a repetition of "a" and other transition with "a", too.
Then DFA was merging the two nodes, but it is wrong.

Now similar nodes in series are not merged.
From 88bad5597445650f4e1bca663a82d4e4d14c93f3 Mon Sep 17 00:00:00 2001
From: Norihiro Tanaka <nori...@kcn.ne.jp>
Date: Sun, 1 Nov 2020 16:31:38 +0900
Subject: [PATCH] dfa: remain similar nodes in series in optimization

DFA was merging similar nodes illegally, example a+a+a as a+a.
Now similar nodes in series are not merged.

Problem reported by Gonzalo Padrino in:
https://lists.gnu.org/archive/html/bug-grep/2020-10/msg00015.html

* lib/dfa.c (merge_nfa_state): Skip the follow for repetition in
optimization.
---
 lib/dfa.c |    5 ++++-
 1 files changed, 4 insertions(+), 1 deletions(-)

diff --git a/lib/dfa.c b/lib/dfa.c
index 74aafa2..6d880c1 100644
--- a/lib/dfa.c
+++ b/lib/dfa.c
@@ -2438,7 +2438,7 @@ merge_nfa_state (struct dfa *d, idx_t tindex, char *flags,
           continue;
         }
 
-      if (!(flags[sindex] & (OPT_LPAREN | OPT_RPAREN)))
+      if (sindex != tindex && !(flags[sindex] & (OPT_LPAREN | OPT_RPAREN)))
         {
           idx_t j;
 
@@ -2446,6 +2446,9 @@ merge_nfa_state (struct dfa *d, idx_t tindex, char *flags,
             {
               idx_t dindex = follows[tindex].elems[j].index;
 
+              if (dindex == tindex)
+                continue;
+
               if (follows[tindex].elems[j].constraint != iconstraint)
                 continue;
 
-- 
1.7.1

From 64333960e09591f27287eedecfd71a5f3dac8510 Mon Sep 17 00:00:00 2001
From: Norihiro Tanaka <nori...@kcn.ne.jp>
Date: Sun, 1 Nov 2020 16:24:15 +0900
Subject: [PATCH] tests: add the test for bugfix in gnulib's dfa

* tests/ere.tests: Add new test.
---
 tests/ere.tests |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/tests/ere.tests b/tests/ere.tests
index 8ab6510..7d89508 100644
--- a/tests/ere.tests
+++ b/tests/ere.tests
@@ -218,3 +218,4 @@
 0@)@)
 1@)@x
 0@\()\((a\())(b))@()(a()b)
+1@a+a+a@aa
-- 
1.7.1

Reply via email to