Kent Johnson wrote:
> Here is a program that scans a string for test chars, either using a single
> regex search or by individually searching for the test chars. The test data
> set doesn't include any of the test chars so it is a worst case (neither scan
> terminates early):
>
> # FindAny.py
> import re, string
>
> data = string.letters * 2500
>
> testchars = string.digits + string.whitespace
> testRe = re.compile('[' + testchars + ']')
>
> def findRe():
>     return testRe.search(data) is not None
>
> def findScan():
>     for c in testchars:
>         if c in data:
>             return True
>     return False
>
>
> and here are the results of timing calls to findRe() and findScan():
>
> F:\Tutor>python -m timeit -s "from FindAny import findRe, findScan" "findRe()"
> 100 loops, best of 3: 2.29 msec per loop
>
> F:\Tutor>python -m timeit -s "from FindAny import findRe, findScan" "findScan()"
> 100 loops, best of 3: 2.04 msec per loop
>
> Surprised the heck out of me!
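For anyone who wants to repeat the quoted comparison on Python 3, here is a rough port. It is only a sketch: the file name, the snake_case function names and the best_msec() helper are made up for illustration, string.ascii_letters stands in for string.letters (which Python 3 no longer has), and the timings it prints will not necessarily match the numbers quoted above.

```python
# find_any_py3.py (hypothetical name): Python 3 port of the quoted FindAny.py
import re
import string
import timeit

# string.letters does not exist in Python 3; ascii_letters is the nearest match.
data = string.ascii_letters * 2500

testchars = string.digits + string.whitespace
test_re = re.compile('[' + testchars + ']')

def find_re():
    # One regex search over the whole string.
    return test_re.search(data) is not None

def find_scan():
    # One substring search per test char.
    for c in testchars:
        if c in data:
            return True
    return False

def best_msec(func, number=100, repeat=3):
    # Illustrative helper: best per-call time in msec, similar in spirit
    # to the "best of 3" figure printed by "python -m timeit".
    return min(timeit.repeat(func, number=number, repeat=repeat)) / number * 1000.0

if __name__ == '__main__':
    print('find_re:   %.3f msec per loop' % best_msec(find_re))
    print('find_scan: %.3f msec per loop' % best_msec(find_scan))
```

Running the file directly prints one timing line per function; the same functions can also be timed with "python -m timeit" exactly as in the quoted session.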
On the other hand, if the number of chars you are searching for is large (and
unicode?) the regex solution wins:

# FindAny.py (new version)
# From a recipe by Martin v. Löwis, who claims that this regex match is
# "time independent of the number of characters in the class"
# http://tinyurl.com/7jqgt

import re, string, sys

data = string.digits * 2500
data = data.decode('ascii')

uppers = []
for i in range(sys.maxunicode):
    c = unichr(i)
    if c.isupper():
        uppers.append(c)
uppers = u"".join(uppers)
uppers_re = re.compile(u'[' + uppers + u']')

def findRe():
    return uppers_re.search(data) is not None

def findScan():
    for c in uppers:
        if c in data:
            return True
    return False

F:\Tutor>python -m timeit -s "from FindAny import findRe, findScan" "findRe()"
1000 loops, best of 3: 442 usec per loop

F:\Tutor>python -m timeit -s "from FindAny import findRe, findScan" "findScan()"
10 loops, best of 3: 36 msec per loop

Now the search solution takes 80 times as long as the regex!

I'm-a-junkie-for-timings-ly-yrs,
Kent
_______________________________________________
Tutor maillist  -  Tutor@python.org
http://mail.python.org/mailman/listinfo/tutor
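A footnote for Python 3 readers: the large-class version above relies on unichr() and str.decode(), neither of which exists in Python 3. The sketch below shows one way the same character class might be built there, with a quick check that the regex and the per-char scan agree. The function names and the text parameter are illustrative, not part of the original script, and no timings are claimed for it.

```python
import re
import sys

# Python 3 strings are already Unicode, so no decode() step is needed.
data = '0123456789' * 2500

# Every code point that str.isupper() accepts, joined into one string.
uppers = ''.join(c for c in map(chr, range(sys.maxunicode + 1)) if c.isupper())
uppers_re = re.compile('[' + uppers + ']')

def find_re(text):
    # Single regex search against the whole uppercase class.
    return uppers_re.search(text) is not None

def find_scan(text):
    # One substring search per uppercase character.
    for c in uppers:
        if c in text:
            return True
    return False

if __name__ == '__main__':
    # Both approaches should give the same answer on any input.
    for sample in (data, data + '\u00c4'):
        print(find_re(sample), find_scan(sample))
```

Taking the text as a parameter, rather than reading the module-level data, makes it easy to try both a no-match input and one that does contain an uppercase character.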