New submission from J.B. Langston <jblangs...@datastax.com>:

The following code will cause Python's regex engine to hang apparently 
indefinitely: 

import re

message = (
    "Flushed to "
    "[BigTableReader(path='/data/cassandra/data/log/logEntry_202202-e68971800b2711ecaf770d5fa3f5ae87/md-112-big-Data.db')]"
    " (1 sstables, 8,650MiB), biggest 8,650MiB, smallest 8,650MiB"
)
regex = re.compile(
    r"Flushed to \[(?P<sstables>[^]]+)+\] \((?P<sstable_count>[^ ]+) sstables, "
    r"(?P<total_size>[^)]+)\), biggest (?P<biggest_size>[^,]+), "
    r"smallest (?P<smallest_size>[^ ]+)( \((?P<duration>\d+)ms\))?"
)
regex.match(message)

This may be a case of exponential backtracking similar to #35915 or #30973.
Both of these issues have been closed as Won't Fix, and I suspect my issue is
similar. The use of commas for decimal points in the input string was not
anticipated but happened due to localization of the logs that the message came
from. The regex works properly when the decimal point is a period.
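
As a possible rewrite (a sketch, not a verified fix for every input): the
group (?P<sstables>[^]]+)+ nests one + inside another, so when a later part of
the pattern fails the engine can retry many different splits of the bracketed
text. Dropping the outer + leaves only one way to consume that text. The names
FIXED, comma_message and period_message below are illustrative only.

```python
import re

# Same pattern as in the report, except that the sstables group is
# [^]]+ instead of ([^]]+)+, which removes the nested quantifier.
FIXED = re.compile(
    r"Flushed to \[(?P<sstables>[^]]+)\] \((?P<sstable_count>[^ ]+) sstables, "
    r"(?P<total_size>[^)]+)\), biggest (?P<biggest_size>[^,]+), "
    r"smallest (?P<smallest_size>[^ ]+)( \((?P<duration>\d+)ms\))?"
)

comma_message = (
    "Flushed to "
    "[BigTableReader(path='/data/cassandra/data/log/logEntry_202202-e68971800b2711ecaf770d5fa3f5ae87/md-112-big-Data.db')]"
    " (1 sstables, 8,650MiB), biggest 8,650MiB, smallest 8,650MiB"
)
# Variant of the same message with periods as decimal points.
period_message = comma_message.replace("8,650MiB", "8.650MiB")

# The comma variant still does not match (biggest_size stops at the first
# comma), but the failure is reported promptly instead of hanging.
comma_result = FIXED.match(comma_message)

# The period variant matches as before.
period_result = FIXED.match(period_message)
```

The comma-localized message is still rejected by this pattern, so the size
groups would also need to accept commas if such lines should be parsed.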

I will try to rewrite my regex to address this specific issue, but it's hard to
anticipate every possible input and craft a bulletproof regex, so a regex like
this can be used for a denial-of-service attack (intentional or not). In this
case the regex was used in an automated import process and caused the process
to back up for many hours before someone noticed. Maybe a solution could be to
add a timeout option to the regex engine so it will give up and raise an
exception if the regex executes for longer than the configured timeout.
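
Until something like that exists in the re module, one workaround is to run
the match in a child interpreter and kill it after a deadline. The sketch
below assumes that approach; match_with_timeout and _CHILD are hypothetical
names, not part of any library, and only the named groups are returned because
a match object cannot cross the process boundary.

```python
import json
import subprocess
import sys

# Source of the child program: read [pattern, text] as JSON from stdin,
# run re.match, and write the named groups (or null) as JSON to stdout.
_CHILD = """
import json, re, sys
pattern, text = json.load(sys.stdin)
m = re.match(pattern, text)
json.dump(m.groupdict() if m else None, sys.stdout)
"""


def match_with_timeout(pattern, text, timeout):
    """Hypothetical helper: return the named groups of re.match(pattern, text),
    or None if there is no match. Raise TimeoutError if the child interpreter
    has not finished within `timeout` seconds."""
    try:
        done = subprocess.run(
            [sys.executable, "-c", _CHILD],
            input=json.dumps([pattern, text]),
            capture_output=True,
            text=True,
            timeout=timeout,  # subprocess.run kills the child on expiry
            check=True,
        )
    except subprocess.TimeoutExpired:
        raise TimeoutError(f"regex did not finish within {timeout} s") from None
    return json.loads(done.stdout)
```

Starting an interpreter per match is expensive, so in an import job like the
one described this would probably only make sense around a whole batch of
lines, or as a fallback for lines that take suspiciously long.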

----------
components: Regular Expressions
messages: 412450
nosy: ezio.melotti, jblangston, mrabarnett
priority: normal
severity: normal
status: open
title: Regex hangs indefinitely
type: behavior
versions: Python 3.8

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue46627>
_______________________________________
