Jeffrey C. Jacobs <[EMAIL PROTECTED]> added the comment: Update 16 Sep 2008:
Based on the work for issue #3825, I would like to simply update the item list as follows:

1) Atomic Grouping / Possessive Quantifiers (See also Issue #433030) [Complete]
2) Match group names as attributes (e.g. match.foo) [Complete save issues outlined above]
3) Match group indexing (e.g. match['foo'], match[3])
4) Perl-style back-references (e.g. compile(r'(a)\g{-1}')), and possibly adding the r'\k' escape sequence for keywords.
5) Parenthesis-Aware Python Comment (e.g. r'(?P#...)') [Complete]
6) Expose support for Template expressions (expressions without repeat operators), adding test cases and documentation for existing code.
7) Larger compiled Regexp cache (256 vs. 100) and reduced thrashing risk. [Complete]
8) Character Classes (e.g. r'[:alphanum:]')
9) Proposed Engine redesigns and cleanups (core item only contains cleanups and comments to the current design but does not modify the design).
   9-1) Single-loop Engine redesign that runs 8% slower than current. [Complete]
      9-1-1) 3-loop Engine redesign that runs 10% slower than current. [Complete]
   9-2) Matthew Bernett's Engine redesign as per issue #3825
10) Have all C-Python shared constants stored in 1 place (sre_constants.py) and generated by that into C constants (sre_constants.h). [Complete AFAICT]
11) Scan Perl 5.10.0 for other potential additions that could be implemented for Python.
12) Documentation suggestions by Jim J. Jewett [Complete]
13) Add grouptuples method to the Match object (i.e. match.grouptuples() returns (<index>, <name or None>, <value>) ) suitable for iteration.
14) UNICODE match group names, as per PEP-3131.
15) Add __doc__ strings and other Python niceties to the Pattern_Type, Match_Type and Scanner_Type (experimental).
16) Implement any remaining TODOs and FIXMEs in the Regexp modules.
   16-1) Allow for the disassociation of a source string from a Match_Type, assuming this will still leave the object in a "reasonable" state.
17) Variable-length [Positive and Negative] Look-behind assertions, as described and implemented in Issue #3825.

---

Now, we have a combination of Items 1, 9-2 and 17 available in issue #3825, so for now, refer to that issue for the 01+09-02+17 combined solution. Eventually, I hope to merge the work between this and that issue.

I sadly admit I have made no progress on this since June, because I am managing some 30 lines of development, some of which have complex diamond branching, e.g.:

01 is the child of Issue2636
09 is the child of Issue2636
10 is the child of Issue2636
09-01 is the child of 09
09-01-01 is the child of 09-01
01+09 is the child of 01 and 09
01+10 is the child of 01 and 10
09+10 is the child of 09 and 10
01+09-01 is the child of 01 and 09-01
01+09-01-01 is the child of 01 and 09-01-01
09-01+10 is the child of 09-01 and 10
09-01-01+10 is the child of 09-01-01 and 10

Which all seems rather simple until you wrap your head around:

01+09+10 is the child of 01, 09, 10, 01+09, 01+10 AND 09+10!

Keep in mind the reason for all this complex numbering is that many issues cannot be implemented in a vacuum: if you want Atomic Grouping, that's one implementation; if you want Shared Constants, that's a different implementation; but if you want BOTH Atomic Grouping and Shared Constants, that is a wholly different implementation, because each implementation affects the other. Thus, I end up with a plethora of branches and a nightmare when it comes to merging, which is why I've been so slow in making progress.

Bazaar seems to be very confused when it comes to a merge in 6 parts between, for example, 01, 09, 10, 01+09, 01+10 and 09+10, as above.
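
The parent relationships listed above can be sketched in Python. This is only an illustration, not part of any actual branch tooling: the names `branch_name` and `parents_of` are hypothetical, and the rule that a combined branch is the child of every smaller combination of its top-level items is an assumption read off the 01+09+10 example. Sub-items such as 09-01 are not modeled.

```python
from itertools import combinations


def branch_name(items):
    """Join item numbers into a branch name such as '01+09+10'."""
    return "+".join(sorted(items))


def parents_of(items, root="Issue2636"):
    """Return the branches that a combined branch would merge from.

    Assumption: a combination of top-level items is the child of every
    non-empty proper subset of those items, and a single item is the
    child of the root branch.
    """
    items = sorted(items)
    if len(items) == 1:
        return [root]
    parents = []
    # Enumerate subsets by size: single items first, then pairs, and so on.
    for size in range(1, len(items)):
        for subset in combinations(items, size):
            parents.append(branch_name(subset))
    return parents


print(parents_of(["01"]))
print(parents_of(["01", "09"]))
print(parents_of(["01", "09", "10"]))
```

Under this assumption three items already give six parents, and a fourth item would give fourteen, which suggests why the merges become hard to manage.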
Bazaar gets confused when it sees changes that were already applied in a previous merge being applied again. Instead of simply realizing that the change in one branch since the last merge is EXACTLY the same as the change in the other since the last merge, so that effectively there is nothing to do, it starts treating code that did NOT change since the last merge as if it had changed. It then tries to roll back the 01+09+10-specific changes rather than doing nothing, and generates a conflict.

Oh, that I could only have a version control system that understood the kind of complex branching that I require!

Anyway, that's the state of things; this is me, signing out!

----------
title: Regexp 2.6 (modifications to current re 2.2.2) -> Regexp 2.7 (modifications to current re 2.2.2)
versions: +Python 2.7 -Python 2.6

_______________________________________
Python tracker <[EMAIL PROTECTED]>
<http://bugs.python.org/issue2636>
_______________________________________