On Sunday, 8 May 2016 at 18:16:56 UTC+2, Peter Otten wrote:
> Sergio Spina wrote:
>
> > In the following ipython session:
> >
> >> Python 3.5.1+ (default, Feb 24 2016, 11:28:57)
> >> Type "copyright", "credits" or "license" for more information.
> >>
> >> IPython 2.3.0 -- An enhanced Interactive Python.
> >>
> >> In [1]: import re
> >>
> >> In [2]: patt = r"""     # the match pattern is:
> >>    ...: .+              # one or more characters
> >>    ...: [ ]             # followed by a space
> >>    ...: (?=[@#D]:)      # that is followed by one of the
> >>    ...:                 # chars "@#D" and a colon ":"
> >>    ...: """
> >>
> >> In [3]: pattern = re.compile(patt, re.VERBOSE)
> >>
> >> In [4]: m = pattern.match("Jun@i Bun#i @:Janji")
> >>
> >> In [5]: m.group()
> >> Out[5]: 'Jun@i Bun#i '
> >>
> >> In [6]: m = pattern.match("Jun@i Bun#i @:Janji D:Banji")
> >>
> >> In [7]: m.group()
> >> Out[7]: 'Jun@i Bun#i @:Janji '
> >>
> >> In [8]: m = pattern.match("Jun@i Bun#i @:Janji D:Banji #:Junji")
> >>
> >> In [9]: m.group()
> >> Out[9]: 'Jun@i Bun#i @:Janji D:Banji '
> >
> > Why does the regex engine stop the search at the last piece of the string?
> > Why not at the first match of the group "@:"?
> > What would be a regex pattern that gives the following result?
> >
> >> In [1]: m = pattern.match("Jun@i Bun#i @:Janji D:Banji #:Junji")
> >>
> >> In [2]: m.group()
> >> Out[2]: 'Jun@i Bun#i '
>
> Compare:
>
> >>> re.compile("a+").match("aaaa").group()
> 'aaaa'
> >>> re.compile("a+?").match("aaaa").group()
> 'a'
>
> By default pattern matching is "greedy" -- the ".+" part of your regex
> matches as many characters as possible. Adding a ? like in ".+?" triggers
> non-greedy matching.

> In [2]: patt = r"""     # the match pattern is:
>    ...: .+              # one or more characters
>    ...: [ ]             # followed by a space
>    ...: (?=[@#D]:)      # ONLY IF it is followed by one of the  <<< please note
>    ...:                 # chars "@#D" and a colon ":"
>    ...: """

From the Python documentation:

> (?=...)
> Matches if ... matches next, but doesn't consume any of the string.
> This is called a lookahead assertion. For example,
> Isaac (?=Asimov) will match 'Isaac ' only if it's followed by 'Asimov'.

I know about greedy and non-greedy matching, but the problem remains.

--
https://mail.python.org/mailman/listinfo/python-list
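
For reference, a minimal sketch (not part of the original session) that checks both points in a standalone script: the lookahead example from the documentation, and the session's pattern with a greedy ".+" versus a non-greedy ".+?". The names isaac, greedy, lazy and text are illustrative only, and the compact patterns are assumed to be equivalent to the re.VERBOSE pattern quoted above with whitespace and comments removed.

```python
import re

# Documentation example: the lookahead tests what follows
# without consuming it, so 'Asimov' is not part of the match.
isaac = re.compile(r"Isaac (?=Asimov)")
print(repr(isaac.match("Isaac Asimov").group()))  # 'Isaac '
print(isaac.match("Isaac Newton"))                # None

text = "Jun@i Bun#i @:Janji D:Banji #:Junji"

# Greedy: ".+" grabs as much as it can, then backtracks to the LAST
# space that is followed by one of "@#D" and a colon.
greedy = re.compile(r".+[ ](?=[@#D]:)")

# Non-greedy: ".+?" grows one character at a time and stops at the
# FIRST space that is followed by one of "@#D" and a colon.
lazy = re.compile(r".+?[ ](?=[@#D]:)")

greedy_match = greedy.match(text).group()
lazy_match = lazy.match(text).group()

print(repr(greedy_match))  # 'Jun@i Bun#i @:Janji D:Banji '
print(repr(lazy_match))    # 'Jun@i Bun#i '
```

In both cases the lookahead condition is the same; only the quantifier decides which qualifying space ends the match. If the non-greedy form behaves differently in another setup, comparing against this script may help isolate where the difference comes from.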