On 08/27/2018 02:00 AM, Markus Armbruster wrote:
The lexer fails to end a valid token when the lookahead character is
beyond '\x7F'.  For instance, input

     true\xC2\xA2

produces the tokens

     JSON_ERROR     true\xC2
     JSON_ERROR     \xA2

The first token should be

     JSON_KEYWORD   true

instead.

As long as we still get a JSON_ERROR in the end.


The culprit is

     #define TERMINAL(state) [0 ... 0x7F] = (state)

It leaves [0x80..0xFF] zero, i.e. IN_ERROR.  Has always been broken.

I wonder if that was done because it was assuming that valid input is only ASCII, and that any byte larger than 0x7f is invalid except within the context of a string. But whatever the reason for the original bug, your fix makes sense.

Fix it to initialize the complete array.

Worth testsuite coverage?


Signed-off-by: Markus Armbruster <arm...@redhat.com>
---
  qobject/json-lexer.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)


Reviewed-by: Eric Blake <ebl...@redhat.com>

diff --git a/qobject/json-lexer.c b/qobject/json-lexer.c
index e1745a3d95..4867839f66 100644
--- a/qobject/json-lexer.c
+++ b/qobject/json-lexer.c
@@ -123,7 +123,7 @@ enum json_lexer_state {
  QEMU_BUILD_BUG_ON((int)JSON_MIN <= (int)IN_START_INTERP);
  QEMU_BUILD_BUG_ON(IN_START_INTERP != IN_START + 1);
-#define TERMINAL(state) [0 ... 0x7F] = (state)
+#define TERMINAL(state) [0 ... 0xFF] = (state)
/* Return whether TERMINAL is a terminal state and the transition to it
     from OLD_STATE required lookahead.  This happens whenever the table


--
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org

Reply via email to