Eric Blake <ebl...@redhat.com> writes:

> On 08/20/2018 06:39 AM, Markus Armbruster wrote:
>
>> In review of v1, we discussed whether to try matching non-integer
>> numbers with redundant leading zero.  Doing that tightly in the lexer
>> requires duplicating six states.  A simpler alternative is to have the
>> lexer eat "digit salad" after redundant leading zero: 0[0-9.eE+-]+.
>> Your suggestion for hexadecimal numbers is digit salad with different
>> digits: [0-9a-fA-FxX].  Another option is their union: [0-9a-fA-FxX+-].
>> Even more radical would be eating anything but whitespace and structural
>> characters: [^][}{:, \t\n\r].  That idea pushed to the limit results in
>> a two-stage lexer: first stage finds token strings, where a token string
>> is a structural character or a sequence of non-structural,
>> non-whitespace characters, second stage rejects invalid token strings.
>>
>> Hmm, we could try to recover from lexical errors more smartly in
>> general: instead of ending the JSON error token after the first
>> offending character, end it before the first whitespace or structural
>> character following the offending character.
>>
>> I can try that, but I'd prefer to try it in a follow-up patch.
>
> Indeed, that sounds like a valid approach.  So, for this patch, I'm
> fine with just accepting ['0' ... '9'], then seeing if the later
> smarter-lexing change makes back-to-back non-structural tokens give
> saner error messages in general.
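For illustration, the two-stage idea quoted above can be sketched like this (a toy in Python, not the actual C lexer; token patterns are simplified and strings are omitted, so treat every name here as hypothetical):

```python
import re

# Stage 1 splits input into "token strings": a structural character, or a
# maximal run of non-structural, non-whitespace characters.  Stage 2 then
# classifies each token string whole, so an invalid run like "01.2e+3"
# becomes one error token instead of stopping at the first bad character.

STRUCTURAL = set('[]{}:,')
WHITESPACE = set(' \t\n\r')

# Simplified valid-token pattern: JSON numbers and keywords only.
VALID_TOKEN = re.compile(
    r'-?(0|[1-9][0-9]*)(\.[0-9]+)?([eE][-+]?[0-9]+)?$'
    r'|true$|false$|null$'
)

def token_strings(s):
    """Stage 1: yield structural characters and maximal non-space runs."""
    i = 0
    while i < len(s):
        c = s[i]
        if c in WHITESPACE:
            i += 1
        elif c in STRUCTURAL:
            yield c
            i += 1
        else:
            j = i
            while j < len(s) and s[j] not in STRUCTURAL and s[j] not in WHITESPACE:
                j += 1
            yield s[i:j]
            i = j

def lex(s):
    """Stage 2: classify each token string, flagging invalid ones whole."""
    for tok in token_strings(s):
        if tok in STRUCTURAL:
            yield ('structural', tok)
        elif VALID_TOKEN.match(tok):
            yield ('token', tok)
        else:
            yield ('error', tok)   # whole digit salad in one error token

print(list(lex('[01.2e+3, 1]')))
```

With input `[01.2e+3, 1]`, the redundant-leading-zero number comes back as a single `('error', '01.2e+3')` rather than an error at the second character followed by spurious tokens for the rest.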
I think I'll drop this patch for now.  It's not useful enough to apply
now, only to revert it once we have the more general error recovery
improvement.