On Thu, 20 May 2010 10:50:41 -0500
Anthony Liguori <anth...@codemonkey.ws> wrote:

> On 05/20/2010 10:16 AM, Paolo Bonzini wrote:
> > On 05/20/2010 03:44 PM, Luiz Capitulino wrote:
> >>   I think there's another issue in the handling of strings.
> >>
> >>   The spec says that valid unescaped chars are in the following range:
> >>
> >>      unescaped = %x20-21 / %x23-5B / %x5D-10FFFF
> 
> That's a spec bug IMHO.  Tab is %x09.  Surely you can include tabs in 
> strings.  Any parser that didn't accept that would be broken.

 Honestly, I had the impression a tab should be encoded as the escape
sequence %x5C %x74 (\t), but if you're right, wouldn't the same be true for
the other control characters as well?
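
 To make the escaped form concrete, here is a minimal sketch of how an
encoder could emit the control characters below %x20, going by the
escapes the JSON spec defines (\b, \f, \n, \r, \t, and \u00XX for the
rest). The helper name json_escape_byte is hypothetical and not taken
from our tree:

```c
#include <stdio.h>
#include <stddef.h>

/*
 * Hypothetical helper: write the JSON escape sequence for byte 'ch'
 * into 'buf' (7 bytes is enough) and return its length.
 * Returns 0 when the byte may appear unescaped in a string.
 */
static int json_escape_byte(unsigned char ch, char *buf, size_t size)
{
    switch (ch) {
    case '"':  return snprintf(buf, size, "\\\"");
    case '\\': return snprintf(buf, size, "\\\\");
    case '\b': return snprintf(buf, size, "\\b");
    case '\f': return snprintf(buf, size, "\\f");
    case '\n': return snprintf(buf, size, "\\n");
    case '\r': return snprintf(buf, size, "\\r");
    case '\t': return snprintf(buf, size, "\\t");  /* %x5C %x74 */
    default:
        if (ch < 0x20) {
            /* Remaining control characters have no short escape. */
            return snprintf(buf, size, "\\u%04x", (unsigned)ch);
        }
        return 0;
    }
}
```

 This only shows the encoding side; whether the lexer should also accept
a raw %x09 is the open question here.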

> >>
> >>   But we do:
> >>
> >>      [IN_DQ_STRING] = {
> >>          [1 ... 0xFF] = IN_DQ_STRING,
> >>          ['\\'] = IN_DQ_STRING_ESCAPE,
> >>          ['"'] = IN_DONE_STRING,
> >>      },
> >>
> >>   Shouldn't we cover 0x20 .. 0xFF instead?
> >
> > If it's the lexer, isn't it just being liberal in what it accepts?
> 
> I believe the parser correctly rejects invalid UTF-8 sequences.

 Will check.
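
 For reference, a minimal standalone sketch of the stricter table I
proposed above, with the string state only covering 0x20 .. 0xFF. The
enum and the dq_next() helper are assumptions made for illustration and
do not match the real lexer's layout. The range designators are a GNU C
extension, as in the existing code:

```c
#include <stdint.h>

/* Hypothetical state names, mirroring the ones quoted above. */
enum dq_state {
    ERROR = 0,              /* zero, so unlisted bytes fall through to it */
    IN_DQ_STRING,
    IN_DQ_STRING_ESCAPE,
    IN_DONE_STRING,
};

/* Transition row for IN_DQ_STRING, indexed by the input byte. */
static const uint8_t in_dq_string[256] = {
    [0x20 ... 0xFF] = IN_DQ_STRING,   /* control chars 0x00-0x1F rejected */
    ['\\'] = IN_DQ_STRING_ESCAPE,     /* later entries override the range */
    ['"'] = IN_DONE_STRING,
};

/* Look up the next state for one input byte inside a "..." string. */
static enum dq_state dq_next(unsigned char ch)
{
    return (enum dq_state)in_dq_string[ch];
}
```

 With this row a raw tab (0x09) goes to the error state, while bytes
0x80 .. 0xFF are still passed through, so UTF-8 validation would remain
the parser's job.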
