On Sunday, July 6, 2014 3:26:44 PM UTC-4, Ian wrote:
> On Sun, Jul 6, 2014 at 12:57 PM, <rxjw...@gmail.com> wrote:
> > I write the following code:
> >
> > .......
> > import re
> >
> > line = "abcdb"
> >
> > matchObj = re.match( 'a[bcd]*b', line)
> >
> > if matchObj:
> >     print "matchObj.group() : ", matchObj.group()
> >     print "matchObj.group(0) : ", matchObj.group()
> >     print "matchObj.group(1) : ", matchObj.group(1)
> >     print "matchObj.group(2) : ", matchObj.group(2)
> > else:
> >     print "No match!!"
> > .........
> >
> > In which I have used its match pattern, but the result is not 'abcb'
>
> You're never going to get a match of 'abcb' on that string, because
> 'abcb' is not found anywhere in that string.
>
> There are two possible matches for the given pattern over that string:
> 'abcdb' and 'ab'. The first one matches the [bcd]* three times, and
> the second one matches it zero times. Because the matching is greedy,
> you get the result that matches three times. It cannot match one, two
> or four times because then there would be no 'b' following the [bcd]*
> portion as required by the pattern.
>
> > Only matchObj.group(0): abcdb
> >
> > displays. All other group(s) have no content.
>
> Calling match.group(0) is equivalent to calling match.group without
> arguments. In that case it returns the matched string of the entire
> regular expression. match.group(1) and match.group(2) will return the
> value of the first and second matching group respectively, but the
> pattern does not have any matching groups. If you want a matching
> group, then enclose the part that you want it to match in parentheses.
> For example, if you change the pattern to:
>
> matchObj = re.match('a([bcd]*)b', line)
>
> then the value of matchObj.group(1) will be 'bcd'

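For reference, a minimal sketch of the capture-group point quoted above. The variable names `plain` and `grouped` are only illustrative, and `print()` is called with a single argument so the snippet should behave the same on Python 2 and Python 3.

```python
import re

line = "abcdb"

# No parentheses in the pattern, so it defines no capturing groups.
plain = re.match('a[bcd]*b', line)
# Parentheses around [bcd]* turn that part into group 1.
grouped = re.match('a([bcd]*)b', line)

print(plain.group(0))    # abcdb
print(plain.groups())    # ()
print(grouped.group(1))  # bcd

# Asking for a group the pattern does not define raises IndexError.
try:
    plain.group(1)
except IndexError as exc:
    print("IndexError: %s" % exc)
```

So with the original pattern, group(1) and group(2) are not empty strings; the calls fail because those groups do not exist.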
Because I am new to Python, I may not have described the question clearly.
Could you read the original example on the web:

https://docs.python.org/2/howto/regex.html

It says that the match is 'abcb'. Could you explain it to me? Thanks again.

The passage in question:

    A step-by-step example will make this more obvious. Let's consider the
    expression a[bcd]*b. This matches the letter 'a', zero or more letters
    from the class [bcd], and finally ends with a 'b'. Now imagine matching
    this RE against the string abcbd.

    Step  Matched  Explanation
    1     a        The a in the RE matches.
    2     abcbd    The engine matches [bcd]*, going as far as it can,
                   which is to the end of the string.
    3     Failure  The engine tries to match b, but the current position
                   is at the end of the string, so it fails.
    4     abcb     Back up, so that [bcd]* matches one less character.
    5     Failure  Try b again, but the current position is at the last
                   character, which is a 'd'.
    6     abc      Back up again, so that [bcd]* is only matching bc.
    7     abcb     Try b again. This time the character at the current
                   position is 'b', so it succeeds.
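Note that the HOWTO walks through the string abcbd, while the posted code uses abcdb (the last two letters are swapped). A small sketch that runs the same pattern against both strings; the names `howto` and `posted` are only illustrative:

```python
import re

pattern = 'a[bcd]*b'

# String from the HOWTO walkthrough: the 'd' comes last.
howto = re.match(pattern, 'abcbd')
# String from the posted code: the 'd' comes before the final 'b'.
posted = re.match(pattern, 'abcdb')

print(howto.group())   # abcb
print(posted.group())  # abcdb
```

In both cases [bcd]* first grabs everything to the end of the string and then backs up until a 'b' can follow it. For abcbd that happens after giving back two characters, which leaves 'abcb'; for abcdb it happens after giving back one, which leaves the whole string.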