David,

Your results are a reminder that a lot depends on other factors. There are multiple versions and implementations of Python out there, some written in C and others built on different underpinnings. Each may or may not have sections of pure Python code carefully replaced with libraries of compiled code. So your results will vary.
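A minimal sketch of how one might record which interpreter produced a given timing, so that numbers from different setups are not compared blindly. The helper names describe_interpreter and time_find and the sample text are made up for illustration; only the standard platform and timeit modules are used.

```python
import platform
import timeit


def describe_interpreter():
    """Return a short label: implementation name plus version string."""
    return f"{platform.python_implementation()} {platform.python_version()}"


def time_find(text, key, repeats=5, number=1000):
    """Time text.find(key); returns one total (in seconds) per repeat."""
    return timeit.repeat(lambda: text.find(key), repeat=repeats, number=number)


if __name__ == "__main__":
    # Hypothetical sample: the key only appears at the very end.
    sample = "abc" * 10_000 + "needle"
    print(describe_interpreter())
    print(min(time_find(sample, "needle")))
```

Reporting the interpreter label next to each timing list makes it easier to tell whether two sets of numbers are even comparable.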
Just as an example, assume you derive a type of your own as a subclass of str and override the find method by writing it in pure Python using loops, maybe adding a few bells and whistles. If you ran your improved algorithm on this variant of str, might it not be quite a bit slower? Imagine how much slower if that override also implemented caching, logging and an option to ignore case, none of which are really needed here.

This type of thing can happen in many other scenarios. A shared module may be slow and later get updated, but not everyone installs the update, so performance stats can vary wildly.

Some people advocate functional programming tactics, in various languages, partly because the more general loops are SLOW. But that is largely because some of the functional stuff is a compiled function that hides the loops inside a faster environment than the interpreter.

-----Original Message-----
From: Python-list <python-list-bounces+avi.e.gross=gmail....@python.org> On Behalf Of David Raymond
Sent: Tuesday, February 28, 2023 2:40 PM
To: python-list@python.org
Subject: RE: How to escape strings for re.finditer?

> I wrote my previous message before reading this. Thank you for the test you ran -- it answers the question of performance. You show that re.finditer is 30x faster, so that certainly recommends that over a simple loop, which introduces looping overhead.

>> def using_simple_loop(key, text):
>>     matches = []
>>     for i in range(len(text)):
>>         if text[i:].startswith(key):
>>             matches.append((i, i + len(key)))
>>     return matches
>>
>> using_simple_loop: [0.13952950000020792, 0.13063130000000456, 0.12803450000001249, 0.13186180000002423, 0.13084610000032626]
>> using_re_finditer: [0.003861400000005233, 0.004061900000124297, 0.003478999999970256, 0.003413100000216218, 0.0037320000001273]

With a slight tweak to the simple loop code using .find() it becomes a third faster than the RE version though.

def using_simple_loop2(key, text):
    matches = []
    keyLen = len(key)
    start = 0
    while (foundSpot := text.find(key, start)) > -1:
        start = foundSpot + keyLen
        matches.append((foundSpot, start))
    return matches

using_simple_loop: [0.1732664997689426, 0.1601669997908175, 0.15792609984055161, 0.1573973000049591, 0.15759290009737015]
using_re_finditer: [0.003412699792534113, 0.0032823001965880394, 0.0033694999292492867, 0.003354900050908327, 0.0033336998894810677]
using_simple_loop2: [0.00256159994751215, 0.0025471001863479614, 0.0025424999184906483, 0.0025831996463239193, 0.0025555999018251896]

--
https://mail.python.org/mailman/listinfo/python-list
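
For anyone who wants to try the str-subclass thought experiment from the top of this message, here is a rough sketch. SlowStr is a hypothetical class invented for illustration, and its find() is deliberately simplified (it assumes non-negative start and end values, unlike the real str.find). find_all follows the same repeated-.find() approach as using_simple_loop2 and assumes a non-empty key. No timings are claimed here; they will depend on your interpreter and machine.

```python
import timeit


class SlowStr(str):
    """Hypothetical str subclass whose find() is redone in pure Python."""

    def find(self, sub, start=0, end=None):
        # Simplified on purpose: assumes non-negative start/end.
        if end is None:
            end = len(self)
        last = end - len(sub)  # last index where sub could still fit
        i = start
        while i <= last:
            if self[i:i + len(sub)] == sub:
                return i
            i += 1
        return -1


def find_all(key, text):
    """Collect (start, end) spans via repeated .find(); key must be non-empty."""
    matches = []
    key_len = len(key)
    start = 0
    while (found := text.find(key, start)) > -1:
        start = found + key_len  # resume just past the match
        matches.append((found, start))
    return matches


if __name__ == "__main__":
    raw = "spam and eggs, " * 2000
    for label, text in (("str", raw), ("SlowStr", SlowStr(raw))):
        seconds = min(
            timeit.repeat(lambda: find_all("eggs", text), repeat=3, number=5)
        )
        print(f"{label}: {seconds:.4f}")
```

The calling code is identical in both runs; only the type of the text object changes, which is the point: the same algorithm can perform very differently depending on what sits underneath the method call.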