Matthew Barnett added the comment:

It appears that in your tests Python 3.2 is faster with Unicode than with
bytestrings, and that unpatched Python 3.4 is a lot slower.

I get somewhat different results (Windows XP Pro, 32-bit):

C:\Python32\python.exe -m timeit -s "import re; f = re.compile(b'abc').search; x = b'x'*100000" "f(x)"
1000 loops, best of 3: 449 usec per loop

C:\Python32\python.exe -m timeit -s "import re; f = re.compile('abc').search; x = 'x'*100000" "f(x)"
1000 loops, best of 3: 506 usec per loop

C:\Python32\python.exe -m timeit -s "import re; f = re.compile('abc').search; x = '\u20ac'*100000" "f(x)"
1000 loops, best of 3: 506 usec per loop


C:\Python34\python.exe -m timeit -s "import re; f = re.compile(b'abc').search; x = b'x'*100000" "f(x)"
1000 loops, best of 3: 227 usec per loop

C:\Python34\python.exe -m timeit -s "import re; f = re.compile('abc').search; x = 'x'*100000" "f(x)"
1000 loops, best of 3: 339 usec per loop

C:\Python34\python.exe -m timeit -s "import re; f = re.compile('abc').search; x = '\u20ac'*100000" "f(x)"
1000 loops, best of 3: 504 usec per loop
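
For anyone who wants to repeat the three measurements above without retyping
the command lines, here is a rough sketch using the timeit module directly.
The helper name best_usec and the cases list are my own, not part of any
patch, and the lambda adds a little call overhead that the command-line form
does not have, so absolute numbers will differ slightly:

```python
import re
import timeit


def best_usec(search, subject, number=1000, repeat=3):
    """Best per-call time in microseconds, similar to 'best of 3' from -m timeit."""
    timings = timeit.repeat(lambda: search(subject), number=number, repeat=repeat)
    return min(timings) / number * 1e6


# The same three subjects as in the command lines above.
cases = [
    ("bytes, b'x'", re.compile(b"abc").search, b"x" * 100000),
    ("str, 'x'", re.compile("abc").search, "x" * 100000),
    ("str, U+20AC", re.compile("abc").search, "\u20ac" * 100000),
]

if __name__ == "__main__":
    for label, search, subject in cases:
        print("%-12s %8.1f usec per loop" % (label, best_usec(search, subject)))
```

Running the same file under each interpreter gives one row per subject type.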

For comparison, the regex module does not duplicate whole sections of code.
Instead, except in some tight loops, it keeps a pointer to one of 3 functions
(for UCS1, UCS2 and UCS4) that fetches the codepoint. Doing that might be too
much of a change for re.
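
To illustrate the idea only (this is a Python sketch, not the regex module's
actual C code), the matcher below picks one of three getter functions once,
based on the width of the subject, and then runs a single generic loop. All
names here (get_ucs1, get_ucs2, get_ucs4, char_width, find_literal) are
hypothetical, and the subject is encoded to a fixed-width buffer purely to
mimic the UCS1/UCS2/UCS4 layouts:

```python
import struct


def get_ucs1(buf, i):
    """Codepoint at index i of a 1-byte-per-character buffer."""
    return buf[i]


def get_ucs2(buf, i):
    """Codepoint at index i of a 2-bytes-per-character (little-endian) buffer."""
    return struct.unpack_from("<H", buf, 2 * i)[0]


def get_ucs4(buf, i):
    """Codepoint at index i of a 4-bytes-per-character (little-endian) buffer."""
    return struct.unpack_from("<I", buf, 4 * i)[0]


GETTERS = {1: get_ucs1, 2: get_ucs2, 4: get_ucs4}
ENCODINGS = {1: "latin-1", 2: "utf-16-le", 4: "utf-32-le"}


def char_width(text):
    """Smallest of 1, 2 or 4 bytes that can hold every codepoint in text."""
    highest = max(map(ord, text), default=0)
    return 1 if highest < 0x100 else 2 if highest < 0x10000 else 4


def find_literal(text, literal):
    """Index of the first occurrence of literal in text, or -1."""
    width = char_width(text)
    buf = text.encode(ENCODINGS[width], "surrogatepass")
    char_at = GETTERS[width]  # chosen once; the loop below is width-agnostic
    wanted = [ord(c) for c in literal]
    for start in range(len(text) - len(wanted) + 1):
        if all(char_at(buf, start + k) == w for k, w in enumerate(wanted)):
            return start
    return -1
```

The matching loop is written once and only the getter varies, which is the
trade-off described above: less duplicated code at the cost of an indirect
call per character outside the hand-tuned tight loops.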

With regex, however, the speed appears to be a lot more consistent:

C:\Python32\python.exe -m timeit -s "import regex; f = regex.compile(b'abc').search; x = b'x'*100000" "f(x)"
10000 loops, best of 3: 113 usec per loop

C:\Python32\python.exe -m timeit -s "import regex; f = regex.compile('abc').search; x = 'x'*100000" "f(x)"
10000 loops, best of 3: 113 usec per loop

C:\Python32\python.exe -m timeit -s "import regex; f = regex.compile('abc').search; x = '\u20ac'*100000" "f(x)"
10000 loops, best of 3: 113 usec per loop


C:\Python34\python.exe -m timeit -s "import regex; f = regex.compile(b'abc').search; x = b'x'*100000" "f(x)"
10000 loops, best of 3: 113 usec per loop

C:\Python34\python.exe -m timeit -s "import regex; f = regex.compile('abc').search; x = 'x'*100000" "f(x)"
10000 loops, best of 3: 113 usec per loop

C:\Python34\python.exe -m timeit -s "import regex; f = regex.compile('abc').search; x = '\u20ac'*100000" "f(x)"
10000 loops, best of 3: 113 usec per loop

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue18685>
_______________________________________