On Nov 14, 5:41 pm, "Sam Pointon" <[EMAIL PROTECTED]> wrote:
> On Nov 14, 7:56 pm, "[EMAIL PROTECTED]" <[EMAIL PROTECTED]>
> wrote:
>
> > Hi, I'm looking for something like:
> >
> > multi_split( 'a:=b+c' , [':=','+'] )
> >
> > returning:
> > ['a', ':=', 'b', '+', 'c']
> >
> > whats the python way to achieve this, preferably without regexp?
>
> pyparsing <http://pyparsing.wikispaces.com/> is quite a cool package
> for doing this sort of thing.

Thanks for mentioning pyparsing, Sam!

This is a good example of using pyparsing for basic tokenizing. It does a nice job of splitting up the tokens whether or not there is whitespace between them. For instance, if you were tokenizing with the string split() method, you would get nice results from "a := b + c", but not from "a:= b+ c", because split() only breaks on the whitespace. Using Sam Pointon's simple pyparsing expression, you can split up the arithmetic using the symbol expressions, and the whitespace between tokens is skipped.

But pyparsing can be used for more than just tokenizing. Here is a slightly longer pyparsing example, using a new pyparsing helper method called operatorPrecedence, which can shortcut the definition of operator-separated expressions with () grouping. Note how this not only tokenizes the expression, but also identifies the implicit groups based on operator precedence.

Finally, pyparsing allows you to label the parsed results. In this case, you can reference the left-hand and right-hand sides of your assignment statement using the attribute names "lhs" and "rhs". This can be really handy for complicated grammars.
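
To make the split() comparison concrete, here is a minimal sketch. It also includes a plain-string take on the multi_split() from the original question. That helper is hypothetical and is not part of pyparsing or the standard library. It assumes that longer separators should be tried first and that whitespace around tokens should be dropped.

```python
def multi_split(text, separators):
    """Split text on any of the separators, keeping the separators as tokens.

    Hypothetical helper named after the original question; it uses plain
    string scanning, with no regexp and no pyparsing.
    """
    # Ignore empty separators and try longer ones first, so that a
    # separator such as ':=' wins over a shorter overlapping one.
    ordered = sorted((s for s in separators if s), key=len, reverse=True)
    tokens = []
    current = ""
    i = 0
    while i < len(text):
        for sep in ordered:
            if text.startswith(sep, i):
                # Flush the text collected so far, minus surrounding whitespace.
                if current.strip():
                    tokens.append(current.strip())
                tokens.append(sep)
                current = ""
                i += len(sep)
                break
        else:
            # No separator starts here, so keep collecting characters.
            current += text[i]
            i += 1
    if current.strip():
        tokens.append(current.strip())
    return tokens


# str.split() depends entirely on where the whitespace happens to be.
print("a := b + c".split())   # ['a', ':=', 'b', '+', 'c']
print("a:= b+ c".split())     # ['a:=', 'b+', 'c']

# Scanning for the separators gives the same tokens either way.
print(multi_split("a:=b+c", [":=", "+"]))
print(multi_split("a:= b+ c", [":=", "+"]))
```

Both multi_split() calls should give ['a', ':=', 'b', '+', 'c']. Note that this only tokenizes; it knows nothing about grouping or operator precedence.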

-- Paul

from pyparsing import *

# operands are unsigned integers or alphabetic variable names
number = Word(nums)
variable = Word(alphas)
operand = number | variable

# operators are listed from highest to lowest precedence
arithexpr = operatorPrecedence( operand,
    [("!", 1, opAssoc.LEFT),            # factorial
     ("^", 2, opAssoc.RIGHT),           # exponentiation
     (oneOf('+ -'), 1, opAssoc.RIGHT),  # leading sign
     (oneOf('* /'), 2, opAssoc.LEFT),   # multiplication
     (oneOf('+ -'), 2, opAssoc.LEFT),]  # addition
    )

assignment = (variable.setResultsName("lhs") + ":=" +
              arithexpr.setResultsName("rhs"))

test = ["a:= b+c",
        "a := b + -c",
        "y := M*X + B",
        "e := m * c^2",]

for t in test:
    tokens = assignment.parseString(t)
    print(tokens.asList())
    # the named results are available as attributes
    print("%s <- %s" % (tokens.lhs, tokens.rhs))
    print("")

Prints:

['a', ':=', ['b', '+', 'c']]
a <- ['b', '+', 'c']

['a', ':=', ['b', '+', ['-', 'c']]]
a <- ['b', '+', ['-', 'c']]

['y', ':=', [['M', '*', 'X'], '+', 'B']]
y <- [['M', '*', 'X'], '+', 'B']

['e', ':=', ['m', '*', ['c', '^', '2']]]
e <- ['m', '*', ['c', '^', '2']]

--
http://mail.python.org/mailman/listinfo/python-list