On Jul 26, 3:27 pm, Neil Cerutti <[EMAIL PROTECTED]> wrote:
>
> Hopefully I'll have time to help you a bit more later, or Paul
> MaGuire will swoop down in his pyparsing powered super-suit. ;)
>

There's no need to fear...!
Neil was dead on, and your parser is almost exactly right. Congratulations for delving into the arcane Dict class, not an easy element for first-time pyparsers!

Forward() would have been okay if your macros were defined before they are referenced. However, since you define them after the fact, you'll have to wait until all the text is parsed into a tree before starting the macro substitution. Your grammar needed only one small change (I've shown the minimal mod needed to make this work, plus an alternative grammar that might be a bit neater-looking).

To perform some work after the tree is built, you attach a parse action to the top-level doc element. This parse action's job is to begin with the "Start" element, and recursively replace words in all caps with their corresponding substitution. As you surmised, the Dict class automagically builds the lookup dictionary for you during the parsing phase. After the parse action runs, the resulting res["Start"] element gives the desired results.

(This looks vaguely YAML-ish, am I right?)
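To make the "parse everything first, substitute afterwards" idea concrete, here is a minimal sketch in plain Python, with no pyparsing involved. The helper names parse_defs and expand are hypothetical, and the plain dict only stands in for the lookup that Dict builds during parsing; it assumes one ":LABEL: words | words" definition per line.

```python
def parse_defs(text):
    """Build {label: [[word, ...], ...]} from ':LABEL: a b | c' lines.

    Hypothetical stand-in for the lookup that pyparsing's Dict builds.
    """
    lookup = {}
    for raw in text.strip().splitlines():
        # ':SECOND: x | y' splits into '', 'SECOND', ' x | y'
        _, label, body = raw.split(":", 2)
        lookup[label] = [alt.split() for alt in body.split("|")]
    return lookup


def expand(item, lookup):
    """Recursively replace all-caps words with their definitions."""
    if isinstance(item, list):
        return [expand(sub, lookup) for sub in item]
    if item.isupper():
        # The lookup is complete by now, so it does not matter that
        # the macro was defined after the line that references it.
        return expand(lookup[item], lookup)
    return item


sample = """
:Start: first SECOND third
:SECOND: second1 | THIRD
:THIRD: third1 third2
"""

lookup = parse_defs(sample)
result = expand(lookup["Start"], lookup)
print(result)
```

Because expansion starts only after every definition has been collected, SECOND and THIRD can be referenced before they are defined, which is the same reason the parse action is attached to the top-level element rather than to each line.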
-- Paul

Here is your working code:

from pyparsing import Word, Optional, OneOrMore, Group, alphas, \
    alphanums, Suppress, Dict, Combine, delimitedList, traceParseAction, \
    ParseResults
import string

def allIn( seq, members ):
    """Tests that all elements of seq are in members"""
    for a in seq:
        if a not in members:
            return False
    return True

def allUpper( seq ):
    """Tests that all strings in seq are uppercase"""
    return allIn( seq, string.ascii_uppercase )

def getItems(myArray, myDict):
    """Recursively get the items for each CAPITAL word"""
    myElements=[]
    for element in myArray:
        myWords=[]
        for word in element:
            if allUpper(word):
                items = getItems(myDict[word], myDict)
                myWords.append(items)
            else:
                myWords.append(word)
        myElements.append(myWords)
    return myElements

testData = """
:Start: first SECOND THIRD fourth FIFTH

:SECOND: second1_1 second1_2 | second2 | second3

:THIRD: third1 third2 | SIXTH

:FIFTH: fifth1 | SEVENTH

:SIXTH: sixth1_1 sixth1_2 | sixth2

:SEVENTH: EIGHTH | seventh1

:EIGHTH: eighth1 | eighth2

"""

#> original grammar - very close!
#> just needed to enclose definition of data in a Group
label = Suppress(":") + Word(alphas + "_") + Suppress(":")
words = Group(OneOrMore(Word(alphanums + "_"))) + \
    Suppress(Optional("|"))
#~ data = ~label + OneOrMore(words)
data = Group( OneOrMore(words) )
line = Group(label + data)
doc = Dict(OneOrMore(line))

#> suggested alternative grammar
#> - note use of Combine and delimitedList
#~ COLON = Suppress(":")
#~ label = Combine( COLON + Word(alphas + "_") + COLON )
#~ entry = Word(alphanums + "_")
#~ data = delimitedList( Group(OneOrMore(entry)), delim="|" )
#~ line = Group(label + data)
#~ doc = Dict(OneOrMore(line))

# recursive reference fixer-upper
def fixupRefsRecursive(tokens, lookup):
    if isinstance(tokens, ParseResults):
        subs = [ fixupRefsRecursive(t, lookup) for t in tokens ]
        tokens = ParseResults( subs )
    else:
        if tokens.isupper():
            tokens = fixupRefsRecursive(lookup[tokens], lookup)
    return tokens

#> add this parse action to doc, which invokes recursive
#> reference fixer-upper
def fixupRefs(tokens):
    tokens["Start"] = fixupRefsRecursive( tokens["Start"], tokens )

doc.setParseAction( fixupRefs )

res = doc.parseString(testData)

# This prints out what pyparser gives us
#~ for line in res:
#~     print(line)

#> not really interested in all of res, just the fixed-up
#> "Start" entry
print(res["Start"][0].asList())
print("")

startString = res["Start"]
items = getItems([startString], res)[0]
# This prints out what we want
for line in items:
    print(line)

Prints:

['first', [['second1_1', 'second1_2'], ['second2'], ['second3']], [['third1', 'third2'], [[['sixth1_1', 'sixth1_2'], ['sixth2']]]], 'fourth', [['fifth1'], [[[[['eighth1'], ['eighth2']]], ['seventh1']]]]]

['first', [['second1_1', 'second1_2'], ['second2'], ['second3']], [['third1', 'third2'], [[['sixth1_1', 'sixth1_2'], ['sixth2']]]], 'fourth', [['fifth1'], [[[[['eighth1'], ['eighth2']]], ['seventh1']]]]]

-- 
http://mail.python.org/mailman/listinfo/python-list