Fredrik Lundh added the comment: Well, I'm not sure 81k qualifies as "medium sized", really. If you look at the size distribution for typical RE:s (which are usually handwritten, not machine generated), that's one or two orders of magnitude larger than "medium".
(And even if this were guaranteed to work on all Python builds, my guess is that performance would be pretty bad compared to using a minimal RE and checking potential matches against a set. The "|" operator is mostly O(N), not O(1).)

As for fixing this, the "byte code" used by the RE engine uses a word size equal to the Unicode character size (sizeof(Py_UNICODE)) for the given platform. I don't think it would be that hard to set it to 32 bits also on platforms using 16-bit Unicode characters (if anyone would like to experiment, just set SRE_CODE to "unsigned long" in sre.h and see what happens when you run the test suite).

__________________________________
Tracker <[EMAIL PROTECTED]>
<http://bugs.python.org/issue1160>
__________________________________
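For anyone who wants to try the "minimal RE plus set" idea mentioned in the comment, here is a rough sketch. The keyword list, the `\w+` candidate pattern, and the helper names `find_keywords` and `find_keywords_alternation` are made up for illustration and are not taken from the issue; the real candidate pattern depends on what the generated alternation was matching.

```python
import re

# Hypothetical keyword list standing in for the large generated alternation.
KEYWORDS = frozenset(["alpha", "beta", "gamma", "delta"])

# Minimal pattern: find every candidate word, whatever it is.
WORD_RE = re.compile(r"\w+")


def find_keywords(text, keywords=KEYWORDS):
    """Return the keywords found in text, in order of appearance.

    The RE only locates candidates; the set lookup decides whether
    a candidate is a keyword, and that lookup is O(1) on average.
    """
    return [m.group() for m in WORD_RE.finditer(text) if m.group() in keywords]


def find_keywords_alternation(text, keywords=KEYWORDS):
    """Same result using one big alternation, for comparison only.

    The engine tries the alternatives one after another, so the cost
    per candidate position grows with the number of keywords.
    """
    pattern = re.compile(
        r"\b(?:%s)\b" % "|".join(re.escape(k) for k in sorted(keywords))
    )
    return pattern.findall(text)


print(find_keywords("alpha x beta gammas"))
```

With only four keywords the two functions behave the same; the difference should only become visible once the keyword set grows to the sizes discussed in this issue.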