Tim Chase wrote:
> > In [1]: import re
> >
> > In [2]: aba_re = re.compile('aba')
> >
> > In [3]: aba_re.findall('abababa')
> > Out[3]: ['aba', 'aba']
> >
> > The return is two matches, whereas I expected three. Why does this
> > regular expression work this way?

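That's how iterated regex searching works: each match consumes the
characters it covers, and the next search starts where the previous
match ended.  Here the first 'aba' spans indices 0-2, so the scan
resumes at index 3 and never sees the 'aba' that starts at index 2.
You may disagree, but non-overlapping is the more intuitive default.
str.count behaves the same way: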

  >>> 'abababa'.count('aba')
  2
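
If it helps to see where the third match goes, here is a rough sketch
of the scan.  The helper name non_overlapping_starts is my own, and the
loop only imitates the left-to-right behaviour; it is not how the re
module is implemented internally:

```python
import re

def non_overlapping_starts(pattern, text):
    # Hypothetical helper: imitates the left-to-right, non-overlapping scan.
    regex = re.compile(pattern)
    starts = []
    pos = 0
    while pos <= len(text):
        m = regex.search(text, pos)
        if m is None:
            break
        starts.append(m.start())
        # Resume where this match ended; step one extra character after
        # an empty match so the loop always makes progress.
        pos = m.end() if m.end() > m.start() else m.end() + 1
    return starts

print(non_overlapping_starts('aba', 'abababa'))  # [0, 4]
```

The 'aba' starting at index 2 is never tried, because by then the scan
has already moved on to index 3.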

> Well, if you don't need the actual results, just their
> count, you can use
>
> how_many = len(re.findall('(?=aba)', 'abababa'))
>
> which will return 3.  However, each result is empty:
>
>       >>> print(re.findall('(?=aba)', 'abababa'))
>       ['', '', '']
>
> You'd have to do some chicanery to get the actual pieces:
(snip)

Actually, you can just define a group inside the lookahead assertion:

  >>> re.findall('(?=(aba))', 'abababa')
  ['aba', 'aba', 'aba']
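
If you also want to know where each overlapping match starts, the same
trick should work with re.finditer.  A small sketch (the variable names
are just for illustration):

```python
import re

text = 'abababa'

# The lookahead itself matches the empty string, so the scan only
# advances one character at a time; group 1 captures the text seen
# ahead of each position where the lookahead succeeds.
overlapping = [(m.start(), m.group(1)) for m in re.finditer(r'(?=(aba))', text)]

print(overlapping)  # [(0, 'aba'), (2, 'aba'), (4, 'aba')]
```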

--Ben

-- 
http://mail.python.org/mailman/listinfo/python-list
