Another question about regular expressions.

How can I extract all the duplicated words in a given text using only regular expression methods? To make the question concrete, if the text is

------------------
Now is better than never.
Although never is often better than *right* now.
------------------

the duplicates are:

------------------------
better is now than never
------------------------

Ordinary Python code can solve the problem, for instance

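# ------------------
import re

# \w+ matches each run of word characters, i.e. each word
regexp = r"\w+"

c = re.compile(regexp, re.IGNORECASE)

text = """
Now is better than never.
Although never is often better than *right* now."""

# lower-case every word so that "Now" and "now" compare equal
z = [s.lower() for s in c.findall(text)]

# keep the words that occur more than once; set() drops the repeats
for d in set(s for s in z if z.count(s) > 1):
    print(d, end=" ")
# ------------------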

but I'm looking for a "plain" re solution.
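
To show the kind of thing I have in mind, here is a rough sketch that pushes the duplicate test into the pattern itself, using a backreference inside a lookahead. It assumes that "duplicate" means "the same word occurs again later in the text, ignoring case"; the names dup_re and duplicates are only illustrative. It still needs a set() afterwards, so it may not be as "plain" as it could be.

```python
import re

# Assumption: a duplicate is a word that occurs again later in the text,
# compared case-insensitively.
# (\w+) captures one whole word; the lookahead (?=.*\b\1\b) succeeds only
# if that same word appears again further on. re.DOTALL lets ".*" cross
# line breaks, re.IGNORECASE makes the backreference \1 ignore case.
dup_re = re.compile(r"\b(\w+)\b(?=.*\b\1\b)", re.IGNORECASE | re.DOTALL)

text = """
Now is better than never.
Although never is often better than *right* now."""

# A word occurring three or more times would be matched more than once,
# so set() is still used to collapse the repeats.
duplicates = set(w.lower() for w in dup_re.findall(text))
print(sorted(duplicates))
```

The lookahead rescans the rest of the text for every word, so this is probably only reasonable for short inputs.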



--
http://mail.python.org/mailman/listinfo/python-list
