The lexer fails to end a valid token when the lookahead character is beyond '\x7F'. For instance, input
true\xC2\xA2 produces the tokens JSON_ERROR true\xC2 JSON_ERROR \xA2 The first token should be JSON_KEYWORD true instead. The culprit is #define TERMINAL(state) [0 ... 0x7F] = (state) It leaves [0x80..0xFF] zero, i.e. IN_ERROR. Has always been broken. Fix it to initialize the complete array. Signed-off-by: Markus Armbruster <arm...@redhat.com> --- qobject/json-lexer.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/qobject/json-lexer.c b/qobject/json-lexer.c index e1745a3d95..4867839f66 100644 --- a/qobject/json-lexer.c +++ b/qobject/json-lexer.c @@ -123,7 +123,7 @@ enum json_lexer_state { QEMU_BUILD_BUG_ON((int)JSON_MIN <= (int)IN_START_INTERP); QEMU_BUILD_BUG_ON(IN_START_INTERP != IN_START + 1); -#define TERMINAL(state) [0 ... 0x7F] = (state) +#define TERMINAL(state) [0 ... 0xFF] = (state) /* Return whether TERMINAL is a terminal state and the transition to it from OLD_STATE required lookahead. This happens whenever the table -- 2.17.1