On 9/14/20 12:13 AM, Norihiro Tanaka wrote:

when (i >= d->follows[i].elems[j].index), it seems that
map[d->follows[i].elems[j].index] has been already set a value more than 0.

What case violates this assumption?

Thank you for looking into this. I ran into the problem with the dfa-heap-overrun test:

grep -E '(^| )*(a|b)*(c|d)*( |$)' < /dev/null

I can reproduce the problem by applying the attached patch to current dfa.c. This patch brings back the previous algorithm, except with a runtime test of the assumption. If I then run the dfa-heap-overrun test, it dumps core on my platform (Ubuntu 18.04.5 x86-64, en_US.utf8 locale) because the assumption is violated.
diff --git a/lib/dfa.c b/lib/dfa.c
index 746c7b568..5aff2f607 100644
--- a/lib/dfa.c
+++ b/lib/dfa.c
@@ -2505,11 +2505,6 @@ reorder_tokens (struct dfa *d)
                           ? xnmalloc (d->nleaves, sizeof *multibyte_prop)
                           : NULL);
 
-  for (idx_t i = 0; i < d->tindex; i++)
-    for (idx_t j = 0; j < d->follows[i].nelem; j++)
-      if (map[d->follows[i].elems[j].index] < 0)
-        map[d->follows[i].elems[j].index] = nleaves++;
-
   for (idx_t i = 0; i < d->tindex; i++)
     {
       if (map[i] < 0)
@@ -2528,7 +2523,16 @@ reorder_tokens (struct dfa *d)
         multibyte_prop[map[i]] = d->multibyte_prop[i];
 
       for (idx_t j = 0; j < d->follows[i].nelem; j++)
-        d->follows[i].elems[j].index = map[d->follows[i].elems[j].index];
+        {
+          if (map[d->follows[i].elems[j].index] == -1)
+            {
+              if (d->follows[i].elems[j].index < i)
+                abort ();
+              map[d->follows[i].elems[j].index] = nleaves++;
+            }
+
+          d->follows[i].elems[j].index = map[d->follows[i].elems[j].index];
+        }
 
       qsort (d->follows[i].elems, d->follows[i].nelem,
              sizeof *d->follows[i].elems, compare);

Reply via email to