Eric Blake <ebl...@redhat.com> writes:

> On 08/27/2018 02:00 AM, Markus Armbruster wrote:
>> The lexer fails to end a valid token when the lookahead character is
>> beyond '\x7F'.  For instance, input
>>
>>     true\xC2\xA2
>>
>> produces the tokens
>>
>>     JSON_ERROR    true\xC2
>>     JSON_ERROR    \xA2
>>
>> The first token should be
>>
>>     JSON_KEYWORD  true
>>
>> instead.
>
> As long as we still get a JSON_ERROR in the end.
We do: one for \xC2, and one for \xA2.  PATCH 4 will lose the second
one.

>> The culprit is
>>
>>     #define TERMINAL(state) [0 ... 0x7F] = (state)
>>
>> It leaves [0x80..0xFF] zero, i.e. IN_ERROR.  Has always been broken.
>
> I wonder if that was done because it was assuming that valid input is
> only ASCII, and that any byte larger than 0x7f is invalid except
> within the context of a string.

Plausible thinko.

> But whatever the reason for the original bug, your fix makes sense.
>
>> Fix it to initialize the complete array.
>
> Worth testsuite coverage?

Since lookahead bytes > 0x7F are always a parse error, all the bug can
do is swallow a TERMINAL() token right before a parse error.  The
TERMINAL() tokens are JSON_INTEGER, JSON_FLOAT, JSON_KEYWORD,
JSON_SKIP, JSON_INTERP.  Fairly harmless.  In particular, JSON objects
get through even when followed by a byte > 0x7F.

Of course, test coverage wouldn't hurt regardless.

>> Signed-off-by: Markus Armbruster <arm...@redhat.com>
>> ---
>>  qobject/json-lexer.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>
> Reviewed-by: Eric Blake <ebl...@redhat.com>

Thanks!