Re: re module non-greedy matches broken

2005-04-06 Thread lothar
well done. i had not noticed the lookahead operators.

Re: re module non-greedy matches broken

2005-04-05 Thread John Ridley
--- lothar <[EMAIL PROTECTED]> wrote: > a non-greedy match is implicitly defined in the documentation to be > one such > that there is no proper substring in the return which could also > match the regex. > If I understand this correctly, what you are asking is for re to look for, or rather, ant

Re: re module non-greedy matches broken

2005-04-05 Thread Fredrik Lundh
"lothar" wrote: >a non-greedy match is implicitly defined in the documentation to be one such > that there is no proper substring in the return which could also match the > regex. no, that's not what it says. this is what it says: Adding "?" after the qualifier makes it perform the match in

Re: re module non-greedy matches broken

2005-04-05 Thread André Malo
* lothar wrote: As already said by Georg, regexes are the wrong tool for such tasks, but anyway... > give an re to find every innermost "table" element: ]*)?>[^<]*(?:<(?!/table>|table(?:\s[^>]*)?>)[^<]*)* > give an re to find every "pre" element directly followed by an "a" > element: ]*)?>[^<]
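The pattern quoted in this message is cut off in the archive (its opening part and its tail are missing), so the following is only a sketch in the same spirit, not the poster's exact regex. It assumes the lost prefix was an opening "table" tag with optional attributes and the lost suffix a closing "table" tag; the sample document is hypothetical as well. The negative lookahead lets the match step over any tag except another opening or closing "table" tag, so only tables with no nested table can match.

```python
import re

# Assumed reconstruction: the opening-tag prefix and the closing-tag
# suffix are guesses, since the archived pattern is truncated.
INNERMOST_TABLE = re.compile(
    r"<table(?:\s[^>]*)?>"                          # opening tag, optional attributes
    r"[^<]*"                                        # text up to the next tag
    r"(?:<(?!/table>|table(?:\s[^>]*)?>)[^<]*)*"    # any tag except <table ...> or </table>
    r"</table>"                                     # closing tag
)

# Hypothetical sample: one table nested inside another.
doc = "<table><tr><td><table><tr><td>n</td></tr></table></td></tr></table>"

innermost = INNERMOST_TABLE.findall(doc)
print(innermost)
```

As Georg Brandl notes elsewhere in the thread, a real HTML parser is the more robust choice; the sketch only illustrates what the lookahead buys.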

Re: re module non-greedy matches broken

2005-04-05 Thread Georg Brandl
lothar wrote: > give an re to find every innermost "table" element: > > innertabdoc = """ > > > >n > > > > > > > > > >y z > > > > > > > > > > > > > """ REs are Regular Expressions, not parsers. There are problems for which there is no RE

Re: re module non-greedy matches broken

2005-04-05 Thread André Malo
* lothar wrote: > a non-greedy match - as implicitly defined in the documentation - is a > match in which there is no proper substring in the return which could also > match the regex. Your argumentation is starting at the wrong place. The documentation doesn't define the behaviour, it tries to de

Re: re module non-greedy matches broken

2005-04-05 Thread lothar
a non-greedy match is implicitly defined in the documentation to be one such that there is no proper substring in the return which could also match the regex. the documentation implies the module will return a non-greedy match. the module does not return a non-greedy match. "Fredrik Lundh" <[EM

Re: re module non-greedy matches broken

2005-04-05 Thread lothar
give an re to find every innermost "table" element: innertabdoc = """ n y z """ give an re to find every "pre" element directly followed by an "a" element: preadoc = """ a r n l y r f g z m b u c v u """ "John Ridley" <[EMAIL

Re: re module non-greedy matches broken

2005-04-05 Thread lothar
a non-greedy match - as implicitly defined in the documentation - is a match in which there is no proper substring in the return which could also match the regex. you are skirting the issue as to why a matcher should not be able to return a non-greedy match. there is no theoretical reason why it

Re: re module non-greedy matches broken

2005-04-05 Thread Fredrik Lundh
"lothar" wrote: > with respect to the documentation, the module is broken. nope. > the module does not necessarily deliver a "minimal length" match for a > non-greedy pattern. it isn't supposed to: a regular expression describes a *set* of matching strings, and the engine is free to return any
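If a minimal-length match really is what is wanted, it has to be selected explicitly, because the module will not do it. One hedged way to sketch this: wrap the pattern in a lookahead so that a candidate is collected at every start position, then keep the shortest. The helper name shortest_match and the sample string are hypothetical; nothing like this exists in the re module.

```python
import re

def shortest_match(pattern, text):
    """Return the shortest candidate found by trying every start position.

    Hypothetical helper, not part of the re module. It only compares the
    one match the engine yields per start position, so it is a sketch,
    not a general minimal-length matcher.
    """
    # A zero-width lookahead lets finditer visit overlapping start positions.
    probe = re.compile(r"(?=(%s))" % pattern)
    candidates = [m.group(1) for m in probe.finditer(text)]
    return min(candidates, key=len) if candidates else None

result = shortest_match(r"V.*?W", "VxVyW")
print(result)
```

The cost is one match attempt per character of the input, which is one practical reason an engine does not do this by default.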

Re: re module non-greedy matches broken

2005-04-04 Thread André Malo
* "lothar" <[EMAIL PROTECTED]> wrote: > no - in the non-greedy regex > <1st-pat>*? > > <1st-pat>, and are arbitrarily complex patterns. The "not" is the problem. Regex patterns are expressed positive by definition (meaning, you can say, what you expect, but not what you don't expect). In oth

Re: re module non-greedy matches broken

2005-04-04 Thread John Ridley
--- lothar <[EMAIL PROTECTED]> wrote: > no - in the non-greedy regex > <1st-pat>*? > > <1st-pat>, and are arbitrarily complex > patterns. Could you post some real-world examples of the problems you are trying to deal with, please? Trying to come up with general solutions for arbitrarily comp

Re: re module non-greedy matches broken

2005-04-04 Thread Christopher Weimann
On 04/04/2005-04:20PM, lothar wrote: > > how then, do i specify a non-greedy regex > <1st-pat>*? > > that is, such that non-greedy part *? > excludes a match of <1st-pat> > jet% cat vwre2.py #! /usr/bin/env python import re vwre = re.compile("V[^V]W") vwlre = re.compile("V[^V]WL") if __nam

Re: re module non-greedy matches broken

2005-04-04 Thread lothar
no - in the non-greedy regex <1st-pat>*? <1st-pat>, and are arbitrarily complex patterns. with character classes and negative character classes you do not need non-greediness anyway. "John Ridley" <[EMAIL PROTECTED]> wrote in message news:[EMAIL PROTECTED] > > --- lothar <[EMAIL PROTECTED]>

Re: re module non-greedy matches broken

2005-04-04 Thread lothar
with respect to the documentation, the module is broken. the module does not necessarily deliver a "minimal length" match for a non-greedy pattern. "Fredrik Lundh" <[EMAIL PROTECTED]> wrote in message news:[EMAIL PROTECTED] > "lothar" wrote: > > > this is a bug and it needs to be fixed. > > it's

Re: re module non-greedy matches broken

2005-04-04 Thread John Ridley
--- lothar <[EMAIL PROTECTED]> wrote: > how then, do i specify a non-greedy regex > <1st-pat>*? > > that is, such that non-greedy part *? > excludes a match of <1st-pat> > > in other words, how do i write regexes for my examples? Not sure if I completely understand your explanation, but does

Re: re module non-greedy matches broken

2005-04-04 Thread Swaroop C H
On Apr 4, 2005 10:06 PM, Terry Reedy <[EMAIL PROTECTED]> wrote: > > what book or books on regexes > A standard is Mastering Regular Expressions, 2nd ed, by xxx (sorry, forget) Mastering Regular Expressions, by Jeffrey Friedl See http://www.regex.info/ Regards, -- Swaroop C H Blog: http://www.swa

Re: re module non-greedy matches broken

2005-04-04 Thread Terry Reedy
> what book or books on regexes A standard is Mastering Regular Expressions, 2nd ed, by xxx (sorry, forget) TJR

Re: re module non-greedy matches broken

2005-04-04 Thread lothar
how then, do i specify a non-greedy regex <1st-pat>*? that is, such that non-greedy part *? excludes a match of <1st-pat> in other words, how do i write regexes for my examples? what book or books on regexes or with a good section on regexes would you recommend? Hopcroft and Ullman? "André M

Re: re module non-greedy matches broken

2005-04-04 Thread Fredrik Lundh
"lothar" wrote: > this is a bug and it needs to be fixed. it's not a bug, and it's not going to be "fixed". search, findall, finditer, sub, etc. all scan the target string from left to right, and process the first location (or all locations) where the pattern matches.
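A small sketch of the left-to-right scan described here, using a made-up sample string: the non-greedy qualifier only limits how far the match extends from its starting point, it does not move the starting point. A negated character class is one way to stop the match from running over a second "V".

```python
import re

# Hypothetical sample text, not taken from the thread.
text = "VxVyW"

# The scan stops at the leftmost position where a match succeeds, so the
# match starts at the first "V" even though a shorter one starts later.
leftmost = re.search(r"V.*?W", text).group(0)

# Excluding "V" from the repeated part makes the attempt at the first "V"
# fail, so the scan moves on to the second one.
shortest = re.search(r"V[^V]*?W", text).group(0)

print(leftmost, shortest)
```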

Re: re module non-greedy matches broken

2005-04-03 Thread André Malo
* "lothar" <[EMAIL PROTECTED]> wrote: > this response is nothing but a description of the behavior i reported. Then you have not read my response carefully enough. > as to whether this behaviour was intended, one would have to ask the module > writer about that. No, I've responded with a view o

Re: re module non-greedy matches broken

2005-04-03 Thread lothar
this response is nothing but a description of the behavior i reported. as to whether this behaviour was intended, one would have to ask the module writer about that. because of the statement in the documentation, which places no qualification on how the scan for the shortest possible match is to b

Re: re module non-greedy matches broken

2005-04-03 Thread André Malo
* lothar wrote: > re: > 4.2.1 Regular Expression Syntax > http://docs.python.org/lib/re-syntax.html > > *?, +?, ?? > Adding "?" after the qualifier makes it perform the match in non-greedy > or > minimal fashion; as few characters as possible will be matched. > > the regular expression mod
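For reference, a minimal illustration of the qualifiers quoted from the documentation, with hypothetical sample strings. In each case the match is anchored at the same starting point; "?" only changes how much the qualifier consumes from there.

```python
import re

text = "<a><b>"  # hypothetical sample

greedy = re.match(r"<.*>", text).group(0)    # runs to the last ">"
lazy = re.match(r"<.*?>", text).group(0)     # stops at the first ">"

plus_lazy = re.match(r"a+?", "aaa").group(0)  # one "a" is enough for +?
opt_lazy = re.match(r"a??", "aaa").group(0)   # ?? prefers to match nothing

print(repr(greedy), repr(lazy), repr(plus_lazy), repr(opt_lazy))
```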