Paul McGuire wrote:

> If you're absolutely stuck on using RE's, then others will have to step
> forward. Meanwhile, here's a pyparsing solution (get pyparsing at
> http://pyparsing.sourceforge.net):

so, let's see.  using ...

    from pyparsing import *
    import re

    data = """ ... table example from op ... """

    def test1():
        LT = Literal("<")
        GT = Literal(">")
        collapsableSpace = GT + LT
        collapsableSpace.setParseAction( replaceWith("><") )
        return collapsableSpace.transformString(data)

    def test2():
        return re.sub(r">\s+<", "><", data)

I get

    > timeit -s "import test" "test.test1()"
    100 loops, best of 3: 6.8 msec per loop

    > timeit -s "import test" "test.test2()"
    10000 loops, best of 3: 33.3 usec per loop

or in other words, five lines instead of one, and a 200x slowdown.

but alright, maybe we should precompile the expressions to get a fair
comparison.  adding

    LT = Literal("<")
    GT = Literal(">")
    collapsableSpace = GT + LT
    collapsableSpace.setParseAction( replaceWith("><") )

    def test3():
        return collapsableSpace.transformString(data)

    p = re.compile(r">\s+<")

    def test4():
        return p.sub("><", data)

to the first program, I get

    > timeit -s "import test" "test.test3()"
    100 loops, best of 3: 6.73 msec per loop

    > timeit -s "import test" "test.test4()"
    10000 loops, best of 3: 27.8 usec per loop

that's a 240x slowdown.  hmm.

</F>
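
for anyone who wants to repeat the regex half of the comparison without the command-line tool, here's a minimal sketch using the stdlib timeit module. the SAMPLE string is a made-up stand-in (the op's table isn't reproduced here), and the helper names collapse_uncompiled, collapse_compiled and best_usec are hypothetical, so the numbers it prints won't match the ones above.

```python
import re
import timeit

# Hypothetical stand-in for the OP's table, which is not reproduced here.
SAMPLE = "<table>\n  <tr>\n    <td>a</td>\n    <td>b</td>\n  </tr>\n</table>\n" * 20

# Precompiled once at import time, like the module-level pattern in test4.
PATTERN = re.compile(r">\s+<")


def collapse_uncompiled(text):
    """Collapse whitespace between tags via re.sub (pattern looked up per call)."""
    return re.sub(r">\s+<", "><", text)


def collapse_compiled(text):
    """Collapse whitespace between tags using the precompiled pattern."""
    return PATTERN.sub("><", text)


def best_usec(func, text, number=1000, repeat=3):
    """Return the best per-call time in microseconds over `repeat` runs."""
    timer = timeit.Timer(lambda: func(text))
    return min(timer.repeat(repeat=repeat, number=number)) / number * 1e6


if __name__ == "__main__":
    for func in (collapse_uncompiled, collapse_compiled):
        print("%s: %.1f usec per loop" % (func.__name__, best_usec(func, SAMPLE)))
```

taking the minimum over a few repeats mirrors the "best of 3" figure that the timeit command prints. the pyparsing side could be timed the same way by passing a function that calls transformString, assuming pyparsing is installed.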