On Tue, 10 Jun 2008 21:20:14 -0700 (PDT), cirfu <[EMAIL PROTECTED]> wrote: > pat = re.compile("(\w* *)*") > this matches all sentences. > if fed the string "are you crazy? i am" it will return "are you > crazy". > > i want to find a in a big string a sentence containing Zlatan > Ibrahimovic and some other text. > ie return the first sentence containing the name Zlatan Ibrahimovic. > > patzln = re.compile("(\w* *)* zlatan ibrahimovic (\w* *)*") > should do this according to regexcoach but it seems to send my > computer into 100%CPU-power and not closable.
Some regular expressions are exponentially expensive to evaluate. This web page http://www.cs.rice.edu/~scrosby/hash/ drew my attention to the (contrived) example "(x+x+)+y", whose behavior (under Python 2.4.3) is this: seconds to scan ... ------------------- # x's xxx...xy xxx...x ----- -------- ------- 10 0.0 0.0 11 0.0 0.0 12 0.0 0.0 13 0.0 0.01 14 0.0 0.0 15 0.0 0.01 16 0.0 0.03 17 0.0 0.04 18 0.0 0.09 19 0.0 0.16 20 0.0 0.33 21 0.0 0.65 22 0.0 1.3 23 0.0 2.58 24 0.0 5.18 25 0.0 10.27 26 0.0 20.41 I'm no regular-expression guru, but your "(\w* *)*..." example does bear a remarkable resemblance to "(x+x+)+y". -- To email me, substitute nowhere->spamcop, invalid->net. -- http://mail.python.org/mailman/listinfo/python-list