On 7/19/07, Gordon Airporte <[EMAIL PROTECTED]> wrote:

I have some code which relies on running each line of a file through a
large number of regexes which may or may not apply. For each pattern I
want to match I've been writing

gotit = mypattern.findall(line)



Try the iterator function finditer instead of findall. To see the
difference, run the code below, calling either findIter or findAll (one at a
time) in the for loop. You can achieve at least 4x better performance.

-----------------------------------------------------------------------------------
import re
import time

m = re.compile(r'(\d+/\d+/\d+)')
line = "Today's date is 21/07/2007 then yesterday's  20/07/2007"

def findIter(line):
   # finditer returns an iterator of match objects; keep a reference to it
   g = m.finditer(line)
   glist = [x.group(0) for x in g]

def findAll(line):
   glist = m.findall(line)

start = time.time()
for i in xrange(1000000):
   #findIter(line)
   findAll(line)
end = time.time()

print end-start

--------------------------------------------------------------------------------------------------------
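For a more repeatable comparison, here is a minimal sketch using the standard timeit module instead of time.time(). The names find_iter, find_all and time_call are just my own for illustration, and the iteration count is arbitrary. Relative speed can vary with the Python version and the input, so treat any number it prints as a measurement on your machine rather than a general rule.

```python
import re
import timeit

pattern = re.compile(r'(\d+/\d+/\d+)')
line = "Today's date is 21/07/2007 then yesterday's  20/07/2007"

def find_iter(text):
    # Walk the match objects one at a time and collect the full match text.
    return [match.group(0) for match in pattern.finditer(text)]

def find_all(text):
    # With a single capturing group, findall returns that group's text
    # for every match as a list of strings.
    return pattern.findall(text)

def time_call(func, number=100000):
    # Total seconds taken to call func(line) `number` times.
    return timeit.timeit(lambda: func(line), number=number)

if __name__ == "__main__":
    print("finditer: %.3f s" % time_call(find_iter))
    print("findall:  %.3f s" % time_call(find_all))
```

Both helpers return the same list for this pattern, because the single group spans the whole match, so the timing compares like with like.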
-- 
http://mail.python.org/mailman/listinfo/python-list
