"Preben Randhol" <[EMAIL PROTECTED]> wrote in message
news:[EMAIL PROTECTED]
> What I first though was if there was possible to make a filter such as:
>
> Apples (apples)
> (ducks) Ducks
> (butter) g butter
>
> The data can be put in a hash table.
>
> Or maybe there are better ways? I generally want something that is
> flexible so one can easily make a filter settings if the text file
> format changes.
>

Here is a simple filter builder using pyparsing. It runs pyparsing in two
passes: first to parse your filter patterns, and then to use the generated
grammar to scan some incoming source string. Pyparsing comes with a
similar EBNF compiler, written by Seo Sanghyeon. I'm sorry this is not
really a newbie example, but it does allow you to easily construct simple
filters, and the implementation will give you something to chew on... :)

Pyparsing won't be as fast as re's, but I cobbled this filter compiler
together in about 3/4 of an hour, and it may serve as a decent prototype
for a more full-featured package.

-- Paul

Pyparsing's home Wiki is at http://pyparsing.wikispaces.com.

-----------------
from pyparsing import *

sourceText = """
Apples 34
56 Ducks

Some more text.

0.5 g butter
"""

# one filter pattern per line; (name) or (name:type) marks a field to extract
patterns = """\
Apples (apples)
(ducks:%) Ducks
(butter:#) g butter"""

def compilePatternList(patternList, openTagChar="(", closeTagChar=")", greedy=True):
    # map a type code to the pyparsing expression that matches that type
    def compileType(s,l,t):
        return {
            "%" : Word(nums+"-",nums).setName("integer"),
            "#" : Combine(Optional("-")+Word(nums)+"."+Optional(Word(nums))).setName("float"),
            "$" : Word(alphas).setName("alphabetic word"),
            "*" : Word(printables).setName("char-group")
            }[t[0]]

    # plain words in a pattern become literals to be matched as-is
    backgroundWord = Word(alphanums).setParseAction(lambda s,l,t:Literal(t[0]))
    # the type code is optional, defaulting to "*"
    matchType = Optional(Suppress(":") + oneOf("% # $ *"),default="*").setParseAction(compileType)
    matchPattern = Combine(openTagChar +
                           Word(alphas,alphanums).setResultsName("nam") +
                           matchType.setResultsName("typ") +
                           closeTagChar)
    # a tagged field becomes its type expression, named after the tag
    matchPattern.setParseAction(lambda s,l,t: (t.typ).setResultsName(t.nam) )
    patternGrammar = OneOrMore( backgroundWord | matchPattern ).setParseAction(lambda s,l,t:And([expr for expr in t]))

    # first pass: compile each pattern line into a pyparsing expression
    patterns = []
    for p in patternList:
        print p,
        pattExpr = patternGrammar.parseString(p)[0]
        print pattExpr
        patterns.append(pattExpr)
    # Or picks the longest matching alternative, MatchFirst the first one
    altern = (greedy and Or or MatchFirst)
    return altern( patterns )

grammar = compilePatternList( patterns.split("\n") )
print grammar

# second pass: scan the source text with the generated grammar
allResults = ParseResults([])
for t,s,e in grammar.scanString(sourceText):
    print t
    allResults += t
print

print allResults.keys()
for k in allResults.keys():
    print k,allResults[k]
-----------------
Prints:
Apples (apples) {"Apples" char-group}
(ducks:%) Ducks {integer "Ducks"}
(butter:#) g butter {float "g" "butter"}
{{"Apples" char-group} ^ {integer "Ducks"} ^ {float "g" "butter"}}
['Apples', '34']
['56', 'Ducks']
['0.5', 'g', 'butter']

['butter', 'apples', 'ducks']
butter 0.5
apples 34
ducks 56
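
For comparison with the re approach mentioned above, the same filter idea
can be roughly sketched with only the standard re module. This is a
minimal sketch under stated assumptions, not a port of the pyparsing
code: the names TYPE_REGEXES, TAG, compile_pattern and apply_filters are
made up for illustration, pattern lines are assumed to be split on
whitespace, and each compiled pattern is applied to the text on its own
rather than being combined into one alternation as the pyparsing grammar
does.

```python
import re

# Hypothetical regex equivalents of the type codes used in the patterns.
TYPE_REGEXES = {
    "%": r"-?\d+",        # integer
    "#": r"-?\d+\.\d*",   # float: digits, a dot, optional digits
    "$": r"[A-Za-z]+",    # alphabetic word
    "*": r"\S+",          # any run of non-whitespace characters (default)
}

# A tagged field looks like (name) or (name:type).
TAG = re.compile(r"\((?P<name>[A-Za-z][A-Za-z0-9]*)(?::(?P<type>[%#$*]))?\)")

def compile_pattern(pattern):
    """Turn one filter line, e.g. '(ducks:%) Ducks', into a compiled regex."""
    parts = []
    for token in pattern.split():
        m = TAG.fullmatch(token)
        if m:
            # tagged field: a named group matching the requested type
            body = TYPE_REGEXES[m.group("type") or "*"]
            parts.append("(?P<%s>%s)" % (m.group("name"), body))
        else:
            # background word: must appear literally
            parts.append(re.escape(token))
    # tokens may be separated by any amount of whitespace
    return re.compile(r"\s+".join(parts))

def apply_filters(patterns, text):
    """Collect every named field found by any pattern into one dict."""
    results = {}
    for rx in map(compile_pattern, patterns):
        for m in rx.finditer(text):
            results.update(m.groupdict())
    return results

source_text = """
Apples 34
56 Ducks

Some more text.

0.5 g butter
"""

patterns = ["Apples (apples)", "(ducks:%) Ducks", "(butter:#) g butter"]
results = apply_filters(patterns, source_text)
print(results)
```

With the sample text this leaves results holding '34' for apples, '56'
for ducks and '0.5' for butter, all as strings. If a field name occurs
more than once in the text, the later match simply overwrites the earlier
one in this sketch.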