On 2/28/2023 2:40 PM, David Raymond wrote:
With a slight tweak to the simple loop code using .find() it becomes a third
faster than the RE version though.
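def using_simple_loop2(key, text):
    matches = []
    keyLen = len(key)
    start = 0
    # str.find() returns -1 once no further occurrence exists
    while (foundSpot := text.find(key, start)) > -1:
        # resume just past this match, so matches never overlap
        start = foundSpot + keyLen
        matches.append((foundSpot, start))
    return matches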
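
As a quick sanity check, here is a minimal sketch showing what the function returns and that it agrees with a regex-based version. The body of using_re_finditer is an assumption (the RE version is not quoted in this message); re.escape is used so the key is matched literally. The sample string is made up for illustration.

```python
import re


def using_simple_loop2(key, text):
    matches = []
    keyLen = len(key)
    start = 0
    while (foundSpot := text.find(key, start)) > -1:
        start = foundSpot + keyLen
        matches.append((foundSpot, start))
    return matches


def using_re_finditer(key, text):
    # Assumed shape of the RE version; re.escape keeps the key literal.
    return [m.span() for m in re.finditer(re.escape(key), text)]


sample = "in the inn, within"  # hypothetical sample text
print(using_simple_loop2("in", sample))  # [(0, 2), (7, 9), (16, 18)]
print(using_simple_loop2("in", sample) == using_re_finditer("in", sample))
```

Both versions report non-overlapping (start, end) spans, so their results should be directly comparable.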
using_simple_loop: [0.1732664997689426, 0.1601669997908175,
0.15792609984055161, 0.1573973000049591, 0.15759290009737015]
using_re_finditer: [0.003412699792534113, 0.0032823001965880394,
0.0033694999292492867, 0.003354900050908327, 0.0033336998894810677]
using_simple_loop2: [0.00256159994751215, 0.0025471001863479614,
0.0025424999184906483, 0.0025831996463239193, 0.0025555999018251896]
On my system the difference is much bigger than that, close to a factor of six with a long key:
KEY = '''it doesn't matter, but in other cases it will.'''
using_simple_loop2: [0.0004955999902449548, 0.0004844000213779509,
0.0004862999776378274, 0.0004800999886356294, 0.0004792999825440347]
using_re_finditer: [0.002840900036972016, 0.0028330000350251794,
0.002701299963518977, 0.0028105000383220613, 0.0029977999511174858]
Shorter keys show the smallest difference (about 1.4x here):
KEY = 'in'
using_simple_loop2: [0.001983499969355762, 0.0019614999764598906,
0.0019617999787442386, 0.002027600014116615, 0.0020669000223279]
using_re_finditer: [0.002787900040857494, 0.0027620999608188868,
0.0027723999810405076, 0.002776700013782829, 0.002946800028439611]
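
For anyone who wants to reproduce this kind of comparison, a rough sketch of a timing harness follows. It is not the harness that produced the numbers above: TEXT, the repeat and number values, and the body of using_re_finditer are all assumptions, chosen only so that each function yields five timings as in the lists shown.

```python
import re
import timeit

KEY = "in"
# Hypothetical sample text; the text actually timed above is not shown.
TEXT = "it doesn't matter, but in other cases it will. " * 1000


def using_simple_loop2(key, text):
    matches = []
    keyLen = len(key)
    start = 0
    while (foundSpot := text.find(key, start)) > -1:
        start = foundSpot + keyLen
        matches.append((foundSpot, start))
    return matches


def using_re_finditer(key, text):
    # Assumed shape of the RE version; re.escape keeps the key literal.
    return [m.span() for m in re.finditer(re.escape(key), text)]


def time_function(func, repeat=5, number=10):
    # timeit.repeat returns one total time per repetition
    return timeit.repeat(lambda: func(KEY, TEXT), repeat=repeat, number=number)


results = {
    func.__name__: time_function(func)
    for func in (using_simple_loop2, using_re_finditer)
}
for name, times in results.items():
    print(f"{name}: {times}")
```

Absolute timings will depend on the machine, the Python version, and the text, so only the ratio between the two functions is worth comparing.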
Brilliant!
Python 3.10.9
Windows 10 AMD64 (build 10.0.19044) SP0