Mystilleef wrote:

> Thanks for your response. I was going by the definition in
> the manual.

"non-overlapping" in that context means that if you e.g. search for "(ba)+"
in the string "bababa", you get one match ("bababa"), not three or six.
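a quick way to check this yourself, as a minimal sketch using only the standard re module (the variable names are just for illustration, and the "aa" case is an extra example, not from the manual):

```python
import re

# one greedy match covers the whole string, so a single match is reported
matches = [m.group() for m in re.finditer(r"(ba)+", "bababa")]

# matches never share characters: "aa" fits at offsets 0, 1 and 2 of "aaaa",
# but only the matches starting at 0 and 2 are returned
starts = [m.start() for m in re.finditer(r"aa", "aaaa")]
```

here matches holds the single string "bababa", and starts holds the offsets 0 and 2.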

in your case, it sounds like you want a search for "ba" to return only one match.

> I know I can filter the list containing found matches myself, but that
> is somewhat expensive for a list containing thousands of matches.

if the order doesn't matter, you don't have to build a list:

>>> import re
>>> text = "cat catched catnip cat catatonic cat cat cat kat"
>>> set(m.group() for m in re.finditer(r"cat\w*", text))
set(['catatonic', 'catnip', 'catched', 'cat'])

if you need to preserve the order, you could use a combination of a
list and a set (or a dictionary):

>>> s = set(); w = []
>>> for m in re.finditer(r"cat\w*", text):
...     m = m.group()  # the matched string
...     if m not in s:
...         s.add(m); w.append(m)
...
>>> w
['cat', 'catched', 'catnip', 'catatonic']
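if you need this in more than one place, the same idea can be wrapped in a generator, so no intermediate list of all matches is ever built. this is only a sketch: unique_matches is a made-up helper name, not something from the re module, and the dict variant relies on dictionaries keeping insertion order, which is guaranteed from Python 3.7 on.

```python
import re

def unique_matches(pattern, text):
    """Yield each distinct match of pattern in text, in first-seen order."""
    seen = set()
    for m in re.finditer(pattern, text):
        word = m.group()
        if word not in seen:
            seen.add(word)
            yield word

text = "cat catched catnip cat catatonic cat cat cat kat"
words = list(unique_matches(r"cat\w*", text))

# alternative: let an insertion-ordered dict drop the duplicates
words_via_dict = list(dict.fromkeys(m.group() for m in re.finditer(r"cat\w*", text)))
```

both variants do one set or dict lookup per match, so the cost stays roughly linear in the number of matches.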


