Re: [Qemu-devel] [PATCH 07/11] qapi: qapi.py: allow the "'" character be escaped

Markus Armbruster Thu, 26 Jul 2012 04:22:29 -0700

Peter Maydell <peter.mayd...@linaro.org> writes:

> On 25 July 2012 20:18, Luiz Capitulino <lcapitul...@redhat.com> wrote:
>> Peter Maydell <peter.mayd...@linaro.org> wrote:
>>> On 25 July 2012 17:54, Luiz Capitulino <lcapitul...@redhat.com> wrote:
>>> > --- a/scripts/qapi.py
>>> > +++ b/scripts/qapi.py
>>> > @@ -21,7 +21,9 @@ def tokenize(data):
>>> >          elif data[0] == "'":
>>> >              data = data[1:]
>>> >              string = ''
>>> > -            while data[0] != "'":
>>> > +            while True:
>>> > +                if data[0] == "'" and string[len(string)-1] != "\\":
>>> > +                    break
>>> >                  string += data[0]
>>> >                  data = data[1:]
>>> >              data = data[1:]
>>>
>>> Won't this cause us to look at string[-1] if
>>> the input data has two ' characters in a row?
>>
>> Non escaped? If you meant '' that's a zero length string and should work, but
>> if you meant 'foo '' bar' that's illegal, because ' characters
>> should be escaped.
>
> I meant the zero-length string case, yes. We come in with data = "''",
> strip the first ' and set string to empty. Then the first time
> through the while loop data[0] is "'" but len(string) is 0, so we
> evaluate string[-1], which I think will throw an exception.
>
> ...and yep, a quick test of a nobbled qapi-schema.json confirms:
> $ python /home/pm215/src/qemu/qemu/scripts/qapi-types.py -h -o "." < /home/pm215/src/qemu/qemu/qapi-schema.json
> Traceback (most recent call last):
>   File "/home/pm215/src/qemu/qemu/scripts/qapi-types.py", line 260, in <module>
>     exprs = parse_schema(sys.stdin)
>   File "/home/pm215/src/qemu/qemu/scripts/qapi.py", line 78, in parse_schema
>     expr_eval = evaluate(expr)
>   File "/home/pm215/src/qemu/qemu/scripts/qapi.py", line 64, in evaluate
>     return parse(map(lambda x: x, tokenize(string)))[0]
>   File "/home/pm215/src/qemu/qemu/scripts/qapi.py", line 25, in tokenize
>     if data[0] == "'" and string[len(string)-1] != "\\":
> IndexError: string index out of range
>
> Try this (very lightly tested but seems to work):
> (feel free to do something nicer than raising an exception on
> the syntax error, and sorry I'm feeling too lazy to make this
> an actual patch email)
>
> Signed-off-by: Peter Maydell <peter.mayd...@linaro.org>
>
> --- a/scripts/qapi.py
> +++ b/scripts/qapi.py
> @@ -21,10 +21,16 @@ def tokenize(data):
>          elif data[0] == "'":
>              data = data[1:]
>              string = ''
> -            while data[0] != "'":
> -                string += data[0]
> -                data = data[1:]
> -            data = data[1:]
> +            while True:
> +                pos = data.find("'")
> +                if pos == -1:
> +                    raise Exception("Mismatched quotes")
> +                string += data[0:pos]
> +                data = data[pos+1:]
> +                if len(string) == 0 or string[-1] != "\\":
> +                    # found a ' and it wasn't escaped
> +                    break
> +                string = string[0:-1] + "'"
>              yield string
>
>  def parse(tokens):
>
> (if anybody wants to be able to use '\\' to escape escapes then
> this approach is a bit stuffed, of course.)
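
To make that last caveat concrete, here is a minimal standalone sketch of
the find()-based loop from the patch above. The function name
scan_with_find and the (string, rest) return value are hypothetical,
introduced only for illustration; the loop body follows the quoted patch.
The input is assumed to start just after the opening quote.

```python
def scan_with_find(data):
    """Hypothetical wrapper around the find()-based loop quoted above.

    data starts just after the opening quote.  Returns (string, rest).
    """
    string = ''
    while True:
        pos = data.find("'")
        if pos == -1:
            raise Exception("Mismatched quotes")
        string += data[0:pos]
        data = data[pos+1:]
        if len(string) == 0 or string[-1] != "\\":
            # found a ' and it wasn't preceded by a backslash
            break
        # replace the trailing backslash with the quote it escaped
        string = string[0:-1] + "'"
    return string, data


# An escaped quote works as intended:
print(scan_with_find("it\\'s' rest"))   # ("it's", ' rest')

# Input characters: a \ \ ' space b '
# If \\ were meant as an escaped backslash, the string would end at the
# first quote.  The loop only looks at the last character, so it treats
# that quote as escaped and keeps going:
print(scan_with_find("a\\\\' b'"))      # ("a\\' b", '')
```

The second call returns the characters a \ ' space b, rather than stopping
after the backslash pair, which is the limitation mentioned above.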


For what it's worth, the orthodox way to lexically analyze strings is a
finite automaton.  Utterly untested sketch:

diff --git a/scripts/qapi.py b/scripts/qapi.py
index 8082af3..a745e92 100644
--- a/scripts/qapi.py
+++ b/scripts/qapi.py
@@ -21,8 +21,17 @@ def tokenize(data):
         elif data[0] == "'":
             data = data[1:]
             string = ''
-            while data[0] != "'":
-                string += data[0]
+            esc = False
+            while True:
+                if esc:
+                    string += data[0]
+                    esc = False
+                elif data[0] == "\\":
+                    esc = True
+                elif data[0] == "'":
+                    break
+                else:
+                    string += data[0]
                 data = data[1:]
             data = data[1:]
             yield string

This doesn't handle a missing closing quote gracefully (data[0] raises
IndexError once the input runs out); you may want to add that.
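
For illustration, the same automaton as a standalone function, with the
missing-quote case reported explicitly. This is a sketch, not a tested
patch: the name scan_string, the (string, rest) return value, and the
choice of ValueError are assumptions made here, not anything in qapi.py.
The input is assumed to start just after the opening quote.

```python
def scan_string(data):
    """Scan a single-quoted string body with a two-state automaton.

    data starts just after the opening quote.  Returns (string, rest),
    where rest is whatever follows the closing quote.
    """
    string = ''
    esc = False
    for pos, ch in enumerate(data):
        if esc:
            # previous character was a backslash: take this one literally
            string += ch
            esc = False
        elif ch == "\\":
            esc = True
        elif ch == "'":
            # unescaped quote ends the string
            return string, data[pos + 1:]
        else:
            string += ch
    # ran out of input without seeing an unescaped closing quote
    raise ValueError("missing closing quote")


print(scan_string("'"))              # ('', '')
print(scan_string("it\\'s' rest"))   # ("it's", ' rest')
print(scan_string("a\\\\' b'"))      # ('a\\', " b'")
```

Because the escape state is consumed one character at a time, \\ yields a
single backslash and the quote after it still terminates the string.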

>> PS: Peter, I get claustrophobic when reading emails from you :)
>
> I can add more blank lines if that helps? :-)
>
> -- PMM
