Tim Chase wrote:
> > In [1]: import re
> >
> > In [2]: aba_re = re.compile('aba')
> >
> > In [3]: aba_re.findall('abababa')
> > Out[3]: ['aba', 'aba']
> >
> > The return is two matches, whereas, I expected three. Why does this
> > regular expression work this way?

It's just the way regexes work: findall() returns non-overlapping
matches, and each search resumes where the previous match ended. You
may disagree, but it's more intuitive that iterated pattern searching
be non-overlapping by default. See also:

>>> 'abababa'.count('aba')
2

> Well, if you don't need the actual results, just their
> count, you can use
>
>   how_many = len(re.findall('(?=aba)', 'abababa'))
>
> which will return 3. However, each result is empty:
>
>   >>> print re.findall('(?=aba)', 'abababa')
>   ['', '', '']
>
> You'd have to do some chicanery to get the actual pieces:

(snip)

Actually, you can just define a group inside the lookahead assertion:

>>> re.findall('(?=(aba))', 'abababa')
['aba', 'aba', 'aba']

--Ben
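
For anyone who also wants to know where each overlapping match starts,
the group-inside-a-lookahead trick can be wrapped in a small helper.
This is only a sketch, written for Python 3; the name
overlapping_matches is made up for illustration and is not part of the
re module. It assumes the pattern is passed as a plain string.

```python
import re

def overlapping_matches(pattern, text):
    """Return (start, matched_text) pairs, including overlapping matches.

    Hypothetical helper: wraps the pattern in a capturing group inside a
    zero-width lookahead, so the regex engine consumes no characters and
    retries at every position in the text.
    """
    lookahead = re.compile('(?=(' + pattern + '))')
    # group(1) is the outer group added above, i.e. the text that the
    # original pattern would have matched starting at m.start().
    return [(m.start(), m.group(1)) for m in lookahead.finditer(text)]

# Default behaviour: the search resumes after the end of each match.
non_overlapping = re.findall('aba', 'abababa')

# Lookahead version: every starting position is tried.
overlapping = overlapping_matches('aba', 'abababa')

print(non_overlapping)  # ['aba', 'aba']
print(overlapping)      # [(0, 'aba'), (2, 'aba'), (4, 'aba')]
```

Using finditer() rather than findall() keeps the match objects around,
which is what makes the start offsets available.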