On 9/14/20 12:13 AM, Norihiro Tanaka wrote:
when (i >= d->follows[i].elems[j].index), it seems that
map[d->follows[i].elems[j].index] has been already set a value more than 0.
What case violates this assumption?
Thank you for looking into this. I ran into the problem with the
dfa-heap-overrun test:
grep -E '(^| )*(a|b)*(c|d)*( |$)' < /dev/null
I can reproduce the problem by applying the attached patch to current dfa.c.
This patch brings back the previous algorithm, except with a runtime test of the
assumption. If I then run the dfa-heap-overrun test, it dumps core on my
platform (Ubuntu 18.04.5 x86-64, en_US.utf8 locale) because the assumption is
violated.
diff --git a/lib/dfa.c b/lib/dfa.c
index 746c7b568..5aff2f607 100644
--- a/lib/dfa.c
+++ b/lib/dfa.c
@@ -2505,11 +2505,6 @@ reorder_tokens (struct dfa *d)
? xnmalloc (d->nleaves, sizeof *multibyte_prop)
: NULL);
- for (idx_t i = 0; i < d->tindex; i++)
- for (idx_t j = 0; j < d->follows[i].nelem; j++)
- if (map[d->follows[i].elems[j].index] < 0)
- map[d->follows[i].elems[j].index] = nleaves++;
-
for (idx_t i = 0; i < d->tindex; i++)
{
if (map[i] < 0)
@@ -2528,7 +2523,16 @@ reorder_tokens (struct dfa *d)
multibyte_prop[map[i]] = d->multibyte_prop[i];
for (idx_t j = 0; j < d->follows[i].nelem; j++)
- d->follows[i].elems[j].index = map[d->follows[i].elems[j].index];
+ {
+ if (map[d->follows[i].elems[j].index] == -1)
+ {
+ if (d->follows[i].elems[j].index < i)
+ abort ();
+ map[d->follows[i].elems[j].index] = nleaves++;
+ }
+
+ d->follows[i].elems[j].index = map[d->follows[i].elems[j].index];
+ }
qsort (d->follows[i].elems, d->follows[i].nelem,
sizeof *d->follows[i].elems, compare);