If a state has neither ANYCHAR nor MBCSET and the next character is
eolbyte, the next state is -1.  The loop therefore exits, and the code
after it checks whether the position is at the end of the buffer.

However, if a state has either ANYCHAR or MBCSET, the next state may not
be -1 even when the next character is eolbyte.  So we must also check
whether the position is at the end of the buffer after such a
transition; otherwise we may run past the end of the buffer.
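
To illustrate the reasoning above outside of dfa.c, here is a minimal,
self-contained sketch.  It is not the grep code: toy_step and
toy_count_lines are hypothetical names, and the rule "a byte with the
high bit set starts a two-byte character" is an assumption made only for
this example.  It assumes, as in the patch, that a sentinel eol byte is
stored at *end, and it shows why the position must be compared with end
after a step that can consume the eol byte like any other byte.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a multibyte transition: consume one
   "character", which is two bytes long when the lead byte has its high
   bit set and one byte long otherwise.  Never advance past LIMIT.  */
static const unsigned char *
toy_step (const unsigned char *p, const unsigned char *limit)
{
  size_t len = (*p & 0x80) ? 2 : 1;
  size_t avail = (size_t) (limit - p);
  return p + (len < avail ? len : avail);
}

/* Count the eol bytes in [BEGIN, END), assuming that *END holds a
   sentinel eol byte.  The step may swallow the sentinel, so the scan
   checks the position against END after every step.  */
static size_t
toy_count_lines (const unsigned char *begin, const unsigned char *end,
                 unsigned char eol)
{
  const unsigned char *p = begin;
  size_t nlcount = 0;

  assert (begin <= end);

  for (;;)
    {
      /* The sentinel at *END is readable, so the limit is END + 1.  */
      p = toy_step (p, end + 1);

      if (p[-1] == eol)
        {
          /* The step consumed the sentinel: stop here instead of
             counting it and reading past the buffer.  */
          if (p > end)
            break;

          nlcount++;
        }
    }

  return nlcount;
}
```

Without the "p > end" test, the sentinel would be counted as an ordinary
newline and the loop would go on to read the bytes after the buffer.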
From 5064483820986ce9a58633be878910c5764070da Mon Sep 17 00:00:00 2001
From: Norihiro Tanaka <nori...@kcn.ne.jp>
Date: Mon, 29 Sep 2014 08:53:56 +0900
Subject: [PATCH] dfa: check end of an input buffer after a transition in
 non-UTF8 multibyte locales

* src/dfa.c (dfaexec_main): Check end of an input buffer after a
transition in non-UTF8 multibyte locales.
---
 src/dfa.c | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

diff --git a/src/dfa.c b/src/dfa.c
index 4f45fff..b654a54 100644
--- a/src/dfa.c
+++ b/src/dfa.c
@@ -3351,6 +3351,21 @@ dfaexec_main (struct dfa *d, char const *begin, char *end,
               /* Can match with a multibyte character (and multi character
                  collating element).  Transition table might be updated.  */
               s = transit_state (d, s, &p, (unsigned char *) end);
+
+              if (p[-1] == eol)
+                {
+                  if ((char *) p > end)
+                    {
+                      p = NULL;
+                      goto done;
+                    }
+
+                  nlcount++;
+
+                  if (!allow_nl)
+                    s = 0;
+                }
+
               mbp = p;
               trans = d->trans;
             }
-- 
2.1.1
