On Friday, August 19, 2016 at 10:09:19 PM UTC+8, Steve D'Aprano wrote:
> On Fri, 19 Aug 2016 09:14 pm, iMath wrote:
>
> > for
> > regex.search(string[, pos[, endpos]])
> > The optional parameter endpos is the index into the string beyond which
> > the RE engine will not go, while this led me to believe the RE engine
> > will still search on till the endpos position even after it returned the
> > matched object, is this right?
>
> No.
>
> Once the RE engine finds a match, it stops. You can test this for yourself
> with a small timing test, using the "timeit" module.
>
> from timeit import Timer
> huge_string = 'aaabc' + 'a'*1000000 + 'dea'
> re1 = r'ab.a'
> re2 = r'ad.a'
>
> # set up some code to time.
> setup = 'import re; from __main__ import huge_string, re1, re2'
> t1 = Timer('re.search(re1, huge_string)', setup)
> t2 = Timer('re.search(re2, huge_string)', setup)
>
> # Now run the timers.
> best = min(t1.repeat(number=1000))/1000
> print("Time to locate regex at the start of huge string:", best)
> best = min(t2.repeat(number=1000))/1000
> print("Time to locate regex at the end of the huge string:", best)
>
> When I run that on my computer, it prints:
>
> Time to locate regex at the start of huge string: 4.9710273742675785e-06
> Time to locate regex at the end of the huge string: 0.0038938069343566893
>
> So it takes about 5 microseconds to find the regex at the beginning of the
> string. To find the regex at the end of the string takes about 3894
> microseconds.
>
> The "endpos" parameter tells the RE engine to stop at that position if the
> regex isn't found before it. It won't go beyond that point.
>
> --
> Steve
> “Cheer up,” they said, “things could be worse.” So I cheered up, and sure
> enough, things got worse.
Thanks for clarifying.
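
For anyone finding this thread later, here is a minimal sketch of how pos and endpos bound the search. The pattern and the sample string are made up for illustration. Note that pos and endpos are only accepted by the search() method of a compiled pattern, not by the module-level re.search().

```python
import re

# Hypothetical sample data: one match near the start, one at the very end.
pattern = re.compile(r"ab.a")
text = "aaabca" + "x" * 20 + "abya"

# No bounds: the first match is returned and the engine stops there.
first = pattern.search(text)
print(first.span())                    # (2, 6)

# endpos=5 treats the string as if it were only 5 characters long,
# so the match that needs index 5 can no longer be found.
print(pattern.search(text, 0, 5))      # None

# pos=6 starts scanning after the first match, so the later one is found.
later = pattern.search(text, 6)
print(later.span())                    # (26, 30)

# Cutting one character off the end hides the later match as well.
print(pattern.search(text, 6, 29))     # None
```

So endpos is only an upper limit on where the engine may look. It does not make the engine keep scanning after a match has already been returned.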