Serhiy Storchaka added the comment:

> 1) Do you know if anybody maintains a patched version of the Python code 
> anywhere?  I could put a package up on github/PyPI, if not.

Sorry, perhaps I misunderstood you. There are unofficial mirrors of CPython on 
Bitbucket [1] and GitHub [2]. They don't contain unofficial patches, but 
perhaps there are private clones with additional patches. Of course, different 
Linux distributions can ship Python with their own patches. And you can maintain 
a private fork of CPython with your patches for your own or your company's needs.

But if you need only optimized regular expressions, I suggest you look at 
the regex module [3]. It is more powerful and has better Unicode support.

Results of the same microbenchmarks for regex:

$ ./python -m timeit -s "import regex as re; p = re.compile('\n'); s = ('a'*100 + '\n')*1000" -- "p.split(s)"
1000 loops, best of 3: 544 usec per loop
$ ./python -m timeit -s "import regex as re; p = re.compile('(\n)'); s = ('a'*100 + '\n')*1000" -- "p.split(s)"
1000 loops, best of 3: 661 usec per loop
$ ./python -m timeit -s "import regex as re; p = re.compile('\n\r'); s = ('a'*100 + '\n\r')*1000" -- "p.split(s)"
1000 loops, best of 3: 521 usec per loop
$ ./python -m timeit -s "import regex as re; p = re.compile('(\n\r)'); s = ('a'*100 + '\n\r')*1000" -- "p.split(s)"
1000 loops, best of 3: 743 usec per loop

regex is slightly slower than the optimized re in these cases, but much faster 
than the non-optimized re when splitting with a capturing group.
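
If you want to repeat the comparison on your own build, the four runs above 
can be driven from one script. This is only a rough sketch: the bench() helper 
and the CASES table are names made up for illustration, the loop count is 
arbitrary, and it assumes regex is installed from PyPI (it falls back to the 
stdlib re alone if not). Absolute numbers will differ from the ones quoted 
above.

```python
import re
import timeit

# Same inputs as the timeit commands above: 1000 lines of 100 'a' characters,
# split on the separator with and without a capturing group.
CASES = [
    ("\n", "\n"),
    ("(\n)", "\n"),
    ("\n\r", "\n\r"),
    ("(\n\r)", "\n\r"),
]


def bench(engine, number=100):
    """Return {pattern: best time per split() call in usec} for one engine.

    `engine` is any module exposing compile(), e.g. re or regex.
    (Hypothetical helper, not part of either module.)
    """
    results = {}
    for pattern, sep in CASES:
        p = engine.compile(pattern)
        s = ("a" * 100 + sep) * 1000
        best = min(timeit.repeat(lambda: p.split(s), number=number, repeat=3))
        results[pattern] = best / number * 1e6
    return results


engines = {"re": re}
try:
    import regex  # third-party module; assumed to be installed from PyPI
    engines["regex"] = regex
except ImportError:
    pass

for name, engine in engines.items():
    for pattern, usec in bench(engine).items():
        print("%-6s %-10r %8.1f usec per loop" % (name, pattern, usec))
```

Note that with a capturing group split() also returns the separators, so the 
result list is about twice as long in those cases.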

> 2) Do you know if anybody has done a good writeup on the behavior of the 
> instruction stream to the C engine?  I could try to do some work on this and 
> put it with the package, if not, or point to it if so.

Sorry, I don't understand you. Do you mean documenting the codes of a compiled 
re pattern? This is an implementation detail and will change in the future.

[1] https://bitbucket.org/mirror/cpython
[2] https://github.com/python/cpython
[3] https://pypi.python.org/pypi/regex

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue24426>
_______________________________________