On Jul 4, 1:36 pm, Peter Otten <[EMAIL PROTECTED]> wrote: > Henning_Thornblad wrote: > > What can be the cause of the large difference between re.search and > > grep? > > grep uses a smarter algorithm ;) > > > > > This script takes about 5 min to run on my computer: > > #!/usr/bin/env python > > import re > > > row="" > > for a in range(156000): > > row+="a" > > print re.search('[^ "=]*/',row) > > > While doing a simple grep: > > grep '[^ "=]*/' input (input contains 156.000 a in > > one row) > > doesn't even take a second. > > > Is this a bug in python? > > You could call this a performance bug, but it's not common enough in real > code to get the necessary brain cycles from the core developers. > So you can either write a patch yourself or use a workaround. > > re.search('[^ "=]*/', row) if "/" in row else None > > might be good enough. > > Peter
It is not a smarter algorithm that is used in grep. Python RE's have more capabilities than grep RE's which need a slower, more complex algorithm. You could argue that if the costly RE features are not used then maybe simpler, faster algorithms should be automatically swapped in but .... - Paddy. -- http://mail.python.org/mailman/listinfo/python-list