On 7/19/07, Gordon Airporte <[EMAIL PROTECTED]> wrote:
I have some code which relies on running each line of a file through a large number of regexes which may or may not apply. For each pattern I want to match I've been writing

    gotit = mypattern.findall(line)
Try the iterator function finditer instead of findall. To see the difference, run the code below, calling either findIter or findAll (one at a time) in the for loop. You can achieve at least 4x better performance.

-----------------------------------------------------------------------------------
import re
import time

m = re.compile(r'(\d+/\d+/\d+)')
line = "Today's date is 21/07/2007 then yesterday's 20/07/2007"

def findIter(line):
    # finditer returns an iterator of match objects
    g = m.finditer(line)
    glist = [x.group(0) for x in g]

def findAll(line):
    # findall returns a list of the matched strings
    glist = m.findall(line)

start = time.time()
for i in range(1000000):
    #findIter(line)
    findAll(line)
end = time.time()

print(end - start)
-----------------------------------------------------------------------------------
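If you want a steadier comparison than wrapping a loop in time.time(), the standard timeit module can run the repetitions for you. The following is only a sketch, not the code from the post: the names DATE_RE, LINE, extract_with_findall, extract_with_finditer and time_both are made up for illustration, and the repeat count is an arbitrary assumption. Both helpers return their results so you can check that they produce the same matches before comparing timings.

```python
import re
import timeit

# Same pattern and sample line as in the post above.
DATE_RE = re.compile(r'(\d+/\d+/\d+)')
LINE = "Today's date is 21/07/2007 then yesterday's 20/07/2007"


def extract_with_findall(line):
    # findall builds the whole list of matched strings in one call.
    return DATE_RE.findall(line)


def extract_with_finditer(line):
    # finditer yields match objects lazily; consuming it gives the strings.
    return [m.group(0) for m in DATE_RE.finditer(line)]


def time_both(number=100000):
    # Returns (findall_seconds, finditer_seconds) for `number` calls each.
    t_all = timeit.timeit(lambda: extract_with_findall(LINE), number=number)
    t_iter = timeit.timeit(lambda: extract_with_finditer(LINE), number=number)
    return t_all, t_iter


if __name__ == "__main__":
    t_all, t_iter = time_both()
    print("findall:  %.3f s" % t_all)
    print("finditer: %.3f s" % t_iter)
```

Which of the two comes out ahead can depend on the interpreter version, the pattern and the input, so it is worth measuring on your own data. finditer tends to matter most when you stop after the first few matches or do not want the full result list held in memory.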