> I wrote my previous message before reading this. Thank you for the test you
> ran -- it answers the question of performance. You show that re.finditer is
> 30x faster, so that certainly recommends that over a simple loop, which
> introduces looping overhead.
>> def using_simple_loop(key, text):
>>     matches = []
>>     for i in range(len(text)):
>>         if text[i:].startswith(key):
>>             matches.append((i, i + len(key)))
>>     return matches
>>
>> using_simple_loop: [0.13952950000020792, 0.13063130000000456,
>> 0.12803450000001249, 0.13186180000002423, 0.13084610000032626]
>> using_re_finditer: [0.003861400000005233, 0.004061900000124297,
>> 0.003478999999970256, 0.003413100000216218, 0.0037320000001273]

With a slight tweak to the simple loop code using .find() it becomes a third
faster than the RE version though.

def using_simple_loop2(key, text):
    matches = []
    keyLen = len(key)
    start = 0
    while (foundSpot := text.find(key, start)) > -1:
        start = foundSpot + keyLen
        matches.append((foundSpot, start))
    return matches

using_simple_loop: [0.1732664997689426, 0.1601669997908175,
0.15792609984055161, 0.1573973000049591, 0.15759290009737015]
using_re_finditer: [0.003412699792534113, 0.0032823001965880394,
0.0033694999292492867, 0.003354900050908327, 0.0033336998894810677]
using_simple_loop2: [0.00256159994751215, 0.0025471001863479614,
0.0025424999184906483, 0.0025831996463239193, 0.0025555999018251896]
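For anyone who wants to rerun the comparison, here is a rough sketch of a
harness. The body of using_re_finditer was not quoted in this thread, so the
version below is my assumption of what it looked like. The run_benchmark
helper and the sample key/text are also made up for illustration, so absolute
timings will differ from the figures above.

```python
import re
import timeit


def using_simple_loop(key, text):
    # Checks every offset, so overlapping matches are reported too.
    matches = []
    for i in range(len(text)):
        if text[i:].startswith(key):
            matches.append((i, i + len(key)))
    return matches


def using_re_finditer(key, text):
    # Assumed implementation: the original was not quoted in this thread.
    # re.escape() makes the key match literally rather than as a pattern.
    return [m.span() for m in re.finditer(re.escape(key), text)]


def using_simple_loop2(key, text):
    # Jumps past each hit with str.find(), so matches never overlap.
    # Assumes a non-empty key; an empty key would never advance `start`.
    matches = []
    key_len = len(key)
    start = 0
    while (found := text.find(key, start)) > -1:
        start = found + key_len
        matches.append((found, start))
    return matches


def run_benchmark(key, text, repeat=5, number=1):
    # Hypothetical helper: times each function `repeat` times.
    results = {}
    for func in (using_simple_loop, using_re_finditer, using_simple_loop2):
        results[func.__name__] = timeit.repeat(
            lambda f=func: f(key, text), repeat=repeat, number=number
        )
    return results


if __name__ == "__main__":
    # Made-up sample data, not the data used for the timings in this thread.
    sample_key = "needle"
    sample_text = ("hay " * 50 + sample_key) * 200
    for name, timings in run_benchmark(sample_key, sample_text).items():
        print(f"{name}: {timings}")
```

One thing to keep in mind when comparing the three: they only return the same
result when the matches do not overlap. With key "aa" and text "aaaa", the
original loop reports three spans, while the .find() loop and re.finditer
each report two.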