Peter> And which, at least implicitly, defines "greedy" by in section
Peter> 6.3 titled "Greedy versus Non-Greedy". It's not perfect, but
Peter> then nobody in this thread has offered anything even remotely
Peter> resembling perfect documentation for regular expressions
Peter> yet. <wink>

In the re syntax page:

    http://www.python.org/dev/doc/devel/lib/re-syntax.html

the *, + and ? operators are described as greedy, and the *?, +? and ??
operators as their non-greedy versions:

    *?, +?, ??
        The "*", "+", and "?" qualifiers are all greedy; they match as
        much text as possible. Sometimes this behaviour isn't desired;
        if the RE <.*> is matched against '<H1>title</H1>', it will
        match the entire string, and not just '<H1>'. Adding "?" after
        the qualifier makes it perform the match in non-greedy or
        minimal fashion; as few characters as possible will be
        matched. Using .*? in the previous expression will match only
        '<H1>'.

{m,n}? is also described as a non-greedy version of {m,n}, and A|B is
described as never being greedy (if A matches, B is never tried).

Perhaps there's no explicit definition of the word "greedy" in the
context of regular expressions, but I think that after reading that page
most people will at least have an intuitive notion of the meaning. If
it's still unclear, a little experimentation should suffice:

    >>> import re
    >>> re.match("(a+)", "aaaaa").group(1)
    'aaaaa'
    >>> re.match("(a+?)", "aaaaa").group(1)
    'a'

In short, I think the re docs are fine as-is w.r.t. the greedy concept.

I also added a definition to the Python Glossary for good measure:

    http://www.python.org/moin/PythonGlossary

Feel free to amend/enhance/correct as you see fit. (Feel free to flesh
out any definitions for that matter, especially those with "???" as the
definition.)

Skip

-- 
http://mail.python.org/mailman/listinfo/python-list
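
P.S. For anyone who wants to check the other cases from the re syntax page, here is a minimal sketch along the same lines as the experiment above. The variable names are hypothetical, chosen only for illustration, and the a{2,4} and a|ab patterns are my own examples, not ones taken from the docs.

```python
import re

# The tag example quoted from the re syntax page.
html = "<H1>title</H1>"
greedy_tag = re.match(r"<.*>", html).group(0)    # '<H1>title</H1>'
minimal_tag = re.match(r"<.*?>", html).group(0)  # '<H1>'

# {m,n} takes as many repetitions as it can; {m,n}? takes as few.
greedy_run = re.match(r"a{2,4}", "aaaaa").group(0)    # 'aaaa'
minimal_run = re.match(r"a{2,4}?", "aaaaa").group(0)  # 'aa'

# A|B: once A matches, B is never tried, even though B would match more.
first_alt = re.match(r"a|ab", "ab").group(0)  # 'a'

print(greedy_tag, minimal_tag, greedy_run, minimal_run, first_alt)
```

The alternation case is the one that tends to surprise people: the leftmost alternative that succeeds wins, regardless of length.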