On 2020-12-05 23:42:11 +0100, sjeik_ap...@hotmail.com wrote:
> Timeout: no idea. But check out re.compile and re.iterfind as they might
> speed things up.

I doubt that compiling regular expressions helps the OP much. Compiled
regular expressions are cached, but more importantly, if a match takes
long enough that specifying a timeout is useful, the time is almost
certainly not spent compiling, but matching - most likely backtracking
from lots of promising but ultimately unsuccessful partial matches.

> regex = r'data-stid="section-room-list"[\s\S]*?>\s*([\s\S]*?)\s*' \
>         r'(?:class\s*=\s*"\s*sticky-book-now\s*"|</ul>\s*</section>|id\s*=\s*"Location")'
> rooms_blocks_to_be_replace = re.findall(regex, html_template)

This part: \s*([\s\S]*?)\s*' looks dangerous from a performance point of
view. If that can be rewritten with less potential for backtracking, it
might help.

Generally, it should be possible to implement a timeout for any
operation by either scheduling an alarm with signal.alarm or by
executing the operation in a separate thread and killing the thread if
it takes too long.

        hp

-- 
   _  | Peter J. Holzer    | Story must make more sense than reality.
|_|_) |                    |
| |   | h...@hjp.at        |    -- Charles Stross, "Creative writing
__/   | http://www.hjp.at/ |       challenge!"
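
A minimal sketch of the signal.alarm approach mentioned above. It assumes
a Unix system, that the call runs in the main thread, and a CPython
version whose regex engine checks for pending signals while matching. The
names RegexTimeout and findall_with_timeout are made up for this
illustration and are not part of the re module.

```python
import re
import signal


class RegexTimeout(Exception):
    """Raised when the regex call exceeds its time budget (hypothetical name)."""


def _on_alarm(signum, frame):
    # Runs in the main thread when SIGALRM arrives; the exception
    # propagates out of whatever call is currently in progress.
    raise RegexTimeout()


def findall_with_timeout(pattern, text, seconds=1):
    """Run re.findall, aborting via SIGALRM after `seconds` whole seconds.

    Unix only, main thread only (restrictions of the signal module).
    """
    old_handler = signal.signal(signal.SIGALRM, _on_alarm)
    signal.alarm(seconds)  # schedule the alarm
    try:
        return re.findall(pattern, text)
    finally:
        signal.alarm(0)  # cancel a pending alarm, if any
        signal.signal(signal.SIGALRM, old_handler)  # restore previous handler
```

Usage would look something like this:

    try:
        rooms = findall_with_timeout(regex, html_template, seconds=5)
    except RegexTimeout:
        rooms = []

Note that Python itself offers no way to forcibly kill a thread, so the
second variant would in practice probably mean running the match in a
separate process (for example via multiprocessing) and terminating that.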
-- https://mail.python.org/mailman/listinfo/python-list