[issue2636] Regexp 2.7 (modifications to current re 2.2.2)

Jeffrey C. Jacobs Wed, 24 Sep 2008 08:09:43 -0700

Jeffrey C. Jacobs <[EMAIL PROTECTED]> added the comment:

Thanks for weighing in, Matthew!


Yeah, I do get some flak for item 2 because originally item 3 wasn't
supposed to cover named groups, but on investigation it made sense that
it should.  I still prefer 2 overall, but the nice thing about them
being separate items is that we can accept 2 or 3 or both or neither.
For the most part, development for the first phase of 2 is complete,
though there is still IMHO the issue of UNICODE name groups (vis-à-vis
item 14) and the name collision problem, which I propose fixing with an
Attribute / re.A flag.  So I think we may end up supporting 3 by default
and 2 via a flag, or supporting both 3 and 2 with 2 as is, with name
collisions hidden (i.e. if you have r'(?P<string>...)' as your capture
group, typing m.string will still give you the original comparison
string, as per the current Python documentation) but with
collision-checking via the Attribute flag, so that
r'(?A)(?P<string>...)' would not compile because string is a reserved word.
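
To make the collision concrete, here is a minimal sketch against the stock
re module.  The helper colliding_group_names is hypothetical and only
illustrates the kind of check an Attribute flag might perform; it is not
part of re or of any patch on this issue.

```python
import re

# A group named "string" shares its name with the documented Match.string
# attribute, which holds the text that was passed to match().
m = re.match(r'(?P<string>\w+) (?P<rest>\w+)', 'hello world')
subject = m.string            # the whole subject text
captured = m.group('string')  # the text captured by the named group


def colliding_group_names(pattern):
    """Hypothetical check: list group names that shadow Match attributes."""
    compiled = re.compile(pattern)
    match_attrs = set(dir(re.match('', '')))
    return sorted(name for name in compiled.groupindex if name in match_attrs)


print(subject, '|', captured)
print(colliding_group_names(r'(?P<string>a)(?P<word>b)'))
```

With attribute-style group access, m.string would be ambiguous between
these two values, which is the case the proposed flag is meant to reject.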

Your interpretation of 4 matches mine, though, and I would definitely
suggest using Perl's \g<-n> notation for relative back-references.
Further, I was thinking of adding, if not as part of 4 then as part of
the catch-all item 11, support for Perl's (?<name>...) as a synonym for
Python's (?P<name>...) and Perl's \k<name> for Python's (?P=name)
notation.  The evolution of Perl's named groups is actually interesting.
Years ago, Guido had a conversation with Larry Wall about using the
(?P...) capture sequence for Python-specific Regular Expression blocks.
So Python went ahead and implemented named capture groups.  Years later,
the Perl folks thought named capture groups were a neat idea and adopted
them in the (?<...>...) form, because Python had restricted the (?P...)
notation to itself, so they couldn't use ours even if they wanted to.
Now, though, with Perl adopting (?<...>...), I think it inevitable that
Java and even C++ may see this as the de facto standard.  So I 100%
agree, we should consider supporting (?<name>...) in the parser.
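
As a rough sketch of the synonym idea, the two Perl spellings can be
rewritten into the Python ones before compiling.  The function
perl_names_to_python is hypothetical and deliberately naive: it ignores
escaped parentheses and character classes, which a real parser change
would have to handle.

```python
import re


def perl_names_to_python(pattern):
    """Naively rewrite (?<name>...) and \\k<name> into Python's spellings."""
    # (?<name>  becomes  (?P<name>.  The lookbehinds (?<= and (?<! are left
    # alone because "=" and "!" cannot start a group name.
    pattern = re.sub(r'\(\?<([A-Za-z_]\w*)>', r'(?P<\1>', pattern)
    # \k<name>  becomes  (?P=name)
    pattern = re.sub(r'\\k<([A-Za-z_]\w*)>', r'(?P=\1)', pattern)
    return pattern


translated = perl_names_to_python(r'(?<word>\w+) \k<word>')
print(translated)
print(re.search(translated, 'it is is fine').group('word'))
```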

Oh, and as I suggested in Issue 3825, I have these new item proposals:

Item 18: Add a re.REVERSE, re.R (?r) flag for reversing the direction of
the String Evaluation against a given Regular Expression pattern. See
issue 516762, as implemented in Issue 3825.

Item 19: Make various in-line flags positionally dependent, for example
(?i) makes the pattern before this case-sensitive but after it
case-insensitive. See Issue 433024, as implemented in Issue 3825.

Item 20: Allow the negation of in-line flags to cancel their effect in
conditionally flagged expressions, for example (?-i). See Issue 433027,
as implemented in Issue 3825.

Item 21: Allow for scoped flagged expressions, i.e. (?i:...), where the
flag(s) is applied to the expression within the parentheses. See Issue
433028, as implemented in Issue 3825.

Item 22: Zero-width regular expression split: when splitting via a
regular expression that matches at zero length, this should return a
result equivalent to splitting at each character boundary, with a null
string at the beginning and end representing the space before the first
and after the last character. See issue 3262.

Item 23: Character class ranges over case-insensitive matches, i.e. does
"(?i)[9-A]" contain '_', whose ord is greater than the ord of 'A' and
less than the ord of 'a'? See issue 5311.
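
For illustration, items 20 to 23 can be probed with a short sketch
against the stock re module.  The scoped forms (?i:...) and (?-i:...) are
accepted by re in Python 3.6 and later; the helper names
split_on_every_match and matches_9_to_A are hypothetical and exist only
for this example.

```python
import re

# Items 20 and 21: scoped and negated in-line flags.
scoped = re.compile(r'(?i:hello) World')        # only "hello" ignores case
negated = re.compile(r'(?i)hello (?-i:World)')  # "World" stays case-sensitive


# Item 22: split at every match, zero-width matches included.
def split_on_every_match(pattern, text):
    pieces = []
    last = 0
    for m in re.finditer(pattern, text):
        pieces.append(text[last:m.start()])
        last = m.end()
    pieces.append(text[last:])
    return pieces


# Item 23: '_' lies between 'A' and 'a' by code point, so the question is
# whether a case-insensitive [9-A] should cover it.
def matches_9_to_A(ch):
    return re.fullmatch(r'(?i)[9-A]', ch) is not None


print(bool(scoped.fullmatch('HELLO World')), bool(scoped.fullmatch('HELLO world')))
print(bool(negated.fullmatch('HELLO World')), bool(negated.fullmatch('hello WORLD')))
print(split_on_every_match(r'', 'abc'))
print(ord('9'), ord('A'), ord('_'), ord('a'))
print(matches_9_to_A('a'), matches_9_to_A('_'))
```

Splitting 'abc' on the empty pattern this way yields the null string at
both ends that item 22 describes.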

And I shall create a bazaar repository for your current development line
with the unfortunately unwieldy name of
lp:~timehorse/python/issue2636-01+09-02+17+18+19+20+21 as that would,
AFAICT, cover all the items you've fixed in your latest patch.

Anyway, great work, Matthew, and I look forward to working with you on
Regexp 2.7!

_______________________________________
Python tracker <[EMAIL PROTECTED]>
<http://bugs.python.org/issue2636>
_______________________________________