Serhiy Storchaka <storchaka+cpyt...@gmail.com> added the comment:

It seems to me that the regular expressions used in the lib2to3 version are more efficient but more complex.
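As a rough sanity check of the "more efficient but more complex" trade-off, the sketch below compares the two binary-literal patterns timed in this comment and checks that they accept the same strings. The sample inputs and the helper names (`accepts`, `disagreements`) are hypothetical and only for illustration; a handful of samples is not a proof of equivalence.

```python
import re

# Both patterns are copied from the timings in this comment.
SIMPLE = re.compile(r"0[bB](?:_?[01])+")             # one group iteration per digit
UNROLLED = re.compile(r"0[bB]_?[01]+(?:_[01]+)*")    # one group iteration per digit run

# Hypothetical sample inputs, chosen for illustration only.
SAMPLES = [
    "0b1", "0B_1", "0b_0101_0101",   # expected to be accepted by both
    "0b1__0", "0b1_", "0b", "0b_",   # misplaced or dangling underscores
    "0b2", "b01",                    # wrong digit, missing prefix
]


def accepts(pattern, text):
    """Return True if the whole string matches the pattern."""
    return pattern.fullmatch(text) is not None


def disagreements(samples=SAMPLES):
    """Return the samples on which the two patterns give different answers."""
    return [s for s in samples if accepts(SIMPLE, s) != accepts(UNROLLED, s)]


print(disagreements())
```

The second form is longer to read, but it lets the engine consume each run of digits in a single greedy step, which is a plausible reason for the difference in the timings.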

$ ./python -m timeit -s 'import re; p = re.compile(r"0[bB](?:_?[01])+"); s = "0b"+"_0101"*16' 'p.match(s)'
100000 loops, best of 5: 2.45 usec per loop
$ ./python -m timeit -s 'import re; p = re.compile(r"0[bB]_?[01]+(?:_[01]+)*"); s = "0b"+"_0101"*16' 'p.match(s)'
200000 loops, best of 5: 1.08 usec per loop

$ ./python -m timeit -s 'import re; p = re.compile(r"0[xX](?:_?[0-9a-fA-F])+[lL]?"); s = "0x_0123_4567_89ab_cdef"' 'p.match(s)'
500000 loops, best of 5: 815 nsec per loop
$ ./python -m timeit -s 'import re; p = re.compile(r"0[xX]_?[\da-fA-F]+(?:_[\da-fA-F]+)*[lL]?"); s = "0x_0123_4567_89ab_cdef"' 'p.match(s)'
500000 loops, best of 5: 542 nsec per loop

Since the performance of lib2to3 is important, it is better to keep the current regexes. But using \d in Python 3 is a bug; it should be replaced with [0-9]. This also speeds up the regex:

$ ./python -m timeit -s 'import re; p = re.compile(r"0[xX]_?[0-9a-fA-F]+(?:_[0-9a-fA-F]+)*[lL]?"); s = "0x_0123_4567_89ab_cdef"' 'p.match(s)'
500000 loops, best of 5: 471 nsec per loop

----------
nosy: +serhiy.storchaka

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue33338>
_______________________________________
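
The remark that \d is a bug in Python 3 can be illustrated with a small sketch. In a str pattern, \d matches any Unicode decimal digit, not only 0-9, so the \d version of the hex pattern accepts text that is not a valid Python literal. The two patterns are copied from the comment; the sample input and the helper name `fully_matches` are hypothetical and chosen only for illustration.

```python
import re

# Hex-literal patterns copied from the comment: one uses \d, the other [0-9].
WITH_BACKSLASH_D = re.compile(r"0[xX]_?[\da-fA-F]+(?:_[\da-fA-F]+)*[lL]?")
WITH_ASCII_RANGE = re.compile(r"0[xX]_?[0-9a-fA-F]+(?:_[0-9a-fA-F]+)*[lL]?")

# Hypothetical input: "0x" followed by two ARABIC-INDIC DIGIT characters.
NON_ASCII = "0x\u0661\u0662"
ASCII_HEX = "0x_0123_4567_89ab_cdef"


def fully_matches(pattern, text):
    """Return True if the whole string matches the pattern."""
    return pattern.fullmatch(text) is not None


for name, pattern in [("\\d", WITH_BACKSLASH_D), ("[0-9]", WITH_ASCII_RANGE)]:
    print(name, fully_matches(pattern, ASCII_HEX), fully_matches(pattern, NON_ASCII))
```

Both patterns accept the ASCII sample; only the \d version also accepts the non-ASCII one. Compiling with the re.ASCII flag would be another way to restrict \d, but spelling out [0-9] keeps the pattern correct regardless of flags.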