On Aug 12, 12:21 pm, [EMAIL PROTECTED] wrote:
>
> I cannot understand why 'c' constitutes a group here without being
> surrounded by "(" ,")" ?
>
> >>> import re
> >>> m = re.match("([abc])+", "abc")
> >>> m.groups()
> ('c',)
>
It sounds from the other replies that this is just the way re's work - if
a group is represented multiple times in the matched text, only the last
matching text is returned for that group.

This sounds similar to a behavior in pyparsing, in using a results name
for the parsed results. Here is an annotated session using pyparsing to
extract this data. The explicit OneOrMore and Group classes and oneOf
method give you a little more control over the collection and structure
of the results.

-- Paul


Setup to use pyparsing, and define input string.

>>> from pyparsing import *
>>> data = "abc"

Use a simple pyparsing expression - matches and returns each separate
character. Each inner match can be returned as element [0], [1], or [2]
of the parsed results.

>>> print OneOrMore( oneOf("a b c") ).parseString(data)
['a', 'b', 'c']

Add use of Group - each single-character match is wrapped in a subgroup.

>>> print OneOrMore( Group(oneOf("a b c")) ).parseString(data)
[['a'], ['b'], ['c']]

Instead of Group, set a results name on the entire pattern.

>>> pattern = OneOrMore( oneOf("a b c") ).setResultsName("char")
>>> print pattern.parseString(data)['char']
['a', 'b', 'c']

Set results name on the inner expression - this behavior seems most like
the regular expression behavior described in the original post.

>>> pattern = OneOrMore( oneOf("a b c").setResultsName("char") )
>>> print pattern.parseString(data)['char']
c

Adjust results name to retain all of the matched characters for the given
results name.

>>> pattern = OneOrMore( oneOf("a b c").setResultsName("char",listAllMatches=True) )
>>> print pattern.parseString(data)['char']
['a', 'b', 'c']
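
For comparison, here is a minimal sketch of the same idea using only the standard re module. The variable names (text, last_only, whole_run, each_char) are just illustrative and are not from the original post. It shows the last-repetition behavior from the question, plus two common ways to keep all of the characters: moving the "+" inside the parentheses, or using re.findall.

```python
import re

text = "abc"

# A capturing group under a quantifier keeps only its final repetition,
# which is the behavior seen in the original post.
last_only = re.match(r"([abc])+", text).groups()

# Moving the "+" inside the parentheses makes one group span the whole run.
whole_run = re.match(r"([abc]+)", text).group(1)

# findall returns every non-overlapping single-character match as a list.
each_char = re.findall(r"[abc]", text)

print(last_only)   # ('c',)
print(whole_run)   # abc
print(each_char)   # ['a', 'b', 'c']
```

Note that re.findall scans the whole string rather than anchoring at the start the way re.match does, so for other inputs it may pick up characters that the original pattern would not have matched.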