John Nagle wrote:
> Henning_Thornblad wrote:
>> What can be the cause of the large difference between re.search and
>> grep?
>>
>> This script takes about 5 min to run on my computer:
>> #!/usr/bin/env python
>> import re
>>
>> row=""
>> for a in range(156000):
>>     row+="a"
>> print re.search('[^ "=]*/',row)
>>
>>
>> While doing a simple grep:
>> grep '[^ "=]*/' input                  (input contains 156.000 a in
>> one row)
>> doesn't even take a second.
>>
>> Is this a bug in python?
>>
>> Thanks...
>> Henning Thornblad
>
> You're recompiling the regular expression on each use.
> Use "re.compile" before the loop to do it once.

Now that's premature optimization :-)

Apart from the fact that re.search() is executed only once in the above
script, the re library uses a caching scheme, so even if the re.search()
call were in a loop, the overhead would be a few microseconds for the
cache lookup.

Peter
--
http://mail.python.org/mailman/listinfo/python-list
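
P.S. For anyone who wants to see the caching for themselves, here is a minimal sketch. It assumes CPython and is written in Python 3 syntax (the script quoted above is Python 2). Handing back the very same compiled object for a repeated pattern string is an implementation detail of the cache, not a documented guarantee, so treat the identity checks as an illustration only. The names PATTERN, first, second and third are mine.

```python
import re

PATTERN = '[^ "=]*/'  # the pattern from the script quoted above

# Compiling the same pattern string twice: the second call is served
# from the re module's internal cache (CPython implementation detail).
first = re.compile(PATTERN)
second = re.compile(PATTERN)
print(first is second)

# re.purge() clears that cache, so the next compile builds a new object.
re.purge()
third = re.compile(PATTERN)
print(third is first)

# The module-level re.search() goes through the same cache, so calling
# it in a loop with a string pattern does not recompile every time.
print(re.search(PATTERN, "a/b").group())
```

So precompiling with re.compile() mostly saves the cache lookup. It does nothing about the time the match itself takes on the 156000-character row.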