Mahesh wrote:

I needed to get to the POST body and while I was trying out various
regular expressions, one of them caused Python to hang. The Python
process was taking up 100% of the CPU. I couldn't even see the "Max
recursion depth exceeded" message. Is this a bug?

no, it's just a very stupid way to implement a trivial operation.

import re

s = \
"""POST /TradeManagement-RT3/ReportController.Servlet HTTP/1.1
/snip>

#pattern_str = "^POST.*\\r\\n\\r((\\n)|(\\n[^\r]*))"
#pattern_str = "^POST.*\\n((\\n)|(\\n[^\r]*&))"
pattern_str = "^POST(.*\\n*)+\\n\\n" # <--- Offending pattern

the .* is a variable-length match. so is the \n* that follows it. and
then you're putting both inside a repeated capturing group, and
applying the result to a moderately large string. the poor engine has
to check zillions of combinations before finding something that works.
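
as a rough sketch of a pattern that avoids the problem (python 3 syntax; the sample request and the names head_re and body_of are made up for illustration, since the original message was snipped): each header line is matched as "one or more non-newline characters plus a line ending", so the engine has only one way to carve the text into lines.

```python
import re

# Hypothetical sample request; the real message was snipped in the post.
sample = (
    "POST /example HTTP/1.1\r\n"
    "Host: localhost\r\n"
    "Content-Length: 7\r\n"
    "\r\n"
    "a=1&b=2"
)

# Request line, then zero or more non-empty header lines, then the blank
# line. No quantifier is nested inside another one over the same
# characters, so there is nothing ambiguous to backtrack over.
head_re = re.compile(r"POST[^\r\n]*\r?\n(?:[^\r\n]+\r?\n)*\r?\n")

def body_of(message):
    """Return the text after the first blank line, or None if no match."""
    m = head_re.match(message)
    return message[m.end():] if m else None

post_body = body_of(sample)
```

this is only meant to show the shape of a well-behaved pattern; for this particular job a plain string operation is still simpler.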

if you want to split on "\r\n\r\n", use split:

   header, body = message.split("\r\n\r\n", 1)
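
if the separator might be missing, or the body might contain blank lines of its own, something along these lines may be easier to live with. it's a sketch in python 3 syntax; split_message is a hypothetical helper name, not anything from the original post.

```python
def split_message(message):
    """Split a raw request into (header, body) at the first blank line."""
    # partition() cuts at the first occurrence only, and returns an empty
    # separator instead of raising when "\r\n\r\n" is not present.
    header, sep, body = message.partition("\r\n\r\n")
    if not sep:
        raise ValueError("no blank line between header and body")
    return header, body

# Hypothetical sample; the body deliberately contains a blank line.
sample = "POST /x HTTP/1.1\r\nHost: a\r\n\r\nline1\r\n\r\nline2"
header, body = split_message(sample)
```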

for more robust code, consider using the rfc822 module:

import StringIO, rfc822
f = StringIO.StringIO(message)
request = f.readline()
header = rfc822.Message(f)
body = f.read()
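
StringIO and rfc822 are python 2 modules, and rfc822 was removed in python 3. a rough python 3 equivalent, assuming the headers are simple enough for the standard email package to parse, might look like this (the sample request and the parse_request name are made up for illustration):

```python
import email
import io

def parse_request(message):
    """Return (request_line, headers, body) for a raw request string."""
    f = io.StringIO(message)
    # The first line is the request line, not a header.
    request = f.readline().rstrip("\r\n")
    # The email parser reads the header lines and keeps the rest as payload.
    msg = email.message_from_file(f)
    return request, msg, msg.get_payload()

# Hypothetical sample request.
sample = (
    "POST /example HTTP/1.1\r\n"
    "Host: localhost\r\n"
    "Content-Length: 7\r\n"
    "\r\n"
    "a=1&b=2"
)
request, headers, body = parse_request(sample)
```

headers can then be looked up by name, for example headers["Host"].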


</F>

--
http://mail.python.org/mailman/listinfo/python-list
