On Jul 2, 3:56 pm, Neil Cerutti <[EMAIL PROTECTED]> wrote:
> On 2007-07-02, Laurent Pointal <[EMAIL PROTECTED]> wrote:
> > Neil Cerutti wrote:
> >> How can I make the Python more idiomatic Python?
> >
> > Have you taken a look at pyparsing ?
>
> Yes, I have it. PyParsing has, well, so many convenience features
> they seem to shout down whatever the core features are, and I
> don't know quite how to get started as a result.
>
> Hardest of all was modifying a working PyParsing program.
>
> As a result, I've found writing my own recursive descent parsers
> much easier.
>
> I'm probably wrong, though. ;)
>
> --
> Neil Cerutti

from pyparsing import *

"""
Neil -

Ok, here is the step-by-step, beginning with your posted BNF.  (Based on
your test cases, I think the '{}'s are really supposed to be '()'s.)

; <WAE> ::=
;   <num>
;   | { + <WAE> <WAE> }
;   | { - <WAE> <WAE> }
;   | {with {<id> <WAE>} <WAE>}
;   | <id>

The most basic building blocks in pyparsing are Literal and Word.
With these, you compose "compound" elements using And and MatchFirst,
which are bound to the operators '+' and '|' (on occasion, Or is
required, bound to operator '^', but not for this simple parser).
Since you have a recursive grammar, you will also need Forward.
Whitespace is skipped implicitly.

Only slightly more advanced is the Group class, which will impart
hierarchy and structure to the results - otherwise, everything just
comes out as one flat list of tokens.  You may be able to remove the
Groups in the final parser, depending on your results after steps 1
and 2 in the "left for the student" part below, but they are here to
help show the structure of the parsed tokens.

As convenience functions go, I think the most common are oneOf and
delimitedList.  oneOf might be useful here if you want to express id
as a single-char variable; otherwise, just use Word(alphas).

At this point you should be able to write a parser for this WAE
grammar.
Like the following 9-liner:
"""

LPAR = Literal("(").suppress()
RPAR = Literal(")").suppress()

wae = Forward()   # placeholder, defined below, so the grammar can recurse
num = Word(nums)
id = oneOf( list(alphas) )
addwae = Group( LPAR + "+" + wae + wae + RPAR )
subwae = Group( LPAR + "-" + wae + wae + RPAR )
withwae = Group( LPAR + "with" + LPAR + id + wae + RPAR + wae + RPAR )

wae << (addwae | subwae | withwae | num | id)

tests = """\
3
(+ 3 4)
(with (x (+ 5 5)) (+ x x))""".splitlines()

# print each test string, then its parsed tokens as nested lists
for t in tests:
    print(t)
    waeTree = wae.parseString(t)
    print(waeTree.asList())

print("""
If you extract and run this script, here are the results:
3
['3']
(+ 3 4)
[['+', '3', '4']]
(with (x (+ 5 5)) (+ x x))
[['with', 'x', ['+', '5', '5'], ['+', 'x', 'x']]]

Left as an exercise for the student:
1. Define classes NumWAE, IdWAE, AddWAE, SubWAE, and WithWAE whose
   __init__ methods take a ParseResults object named tokens (which you
   can treat as a list of tokens), and each with a calc() method to
   evaluate them accordingly.
2. Hook each class to the appropriate WAE expression using
   setParseAction.  Hint: here is one done for you:
   num.setParseAction(NumWAE)
3. Modify the test loop to insert an evaluation of the parsed tree.

Extra credit: why is id last in the set of alternatives defined for the
wae expression?

-- Paul
""")
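
For anyone who wants to check the evaluation step before writing the parse-action classes, here is a rough sketch that skips pyparsing and simply walks the nested lists that asList() printed above. It is not the class-based design the exercise asks for. The function name eval_wae and the env dictionary are made up for illustration, and it assumes that (with (x e1) e2) means "evaluate e2 with x bound to the value of e1".

```python
# Hypothetical helper, not part of pyparsing: evaluates the nested lists
# that asList() printed for the three test strings.
def eval_wae(node, env=None):
    """Evaluate one WAE node: a number string, an id string, or a list."""
    env = env or {}
    if isinstance(node, list):
        op = node[0]
        if op == "+":
            return eval_wae(node[1], env) + eval_wae(node[2], env)
        if op == "-":
            return eval_wae(node[1], env) - eval_wae(node[2], env)
        if op == "with":
            # Layout assumed from the output above: ['with', name, bound, body]
            _, name, bound, body = node
            inner = dict(env)                  # copy so the outer scope is untouched
            inner[name] = eval_wae(bound, env)
            return eval_wae(body, inner)
        raise ValueError("unknown operator: %r" % (op,))
    if node.isdigit():
        return int(node)
    return env[node]                           # KeyError if the id is unbound


# The three parse results shown above, copied in as plain lists.
trees = [
    ['3'],
    [['+', '3', '4']],
    [['with', 'x', ['+', '5', '5'], ['+', 'x', 'x']]],
]
results = [eval_wae(tree[0]) for tree in trees]
print(results)
```

Each parse result is a one-element list, hence the tree[0]. The same three branches would most likely end up as the calc() methods of AddWAE, SubWAE, and WithWAE once the classes are hooked in with setParseAction.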