New submission from INADA Naoki <songofaca...@gmail.com>:

email.header has this pattern:
https://github.com/python/cpython/blob/85c0b8941f0c8ef3ed787c9d504712c6ad3eb5d3/Lib/email/header.py#L34-L43

    # Match encoded-word strings in the form =?charset?q?Hello_World?=
    ecre = re.compile(r'''
      =\?                   # literal =?
      (?P<charset>[^?]*?)   # non-greedy up to the next ? is the charset
      \?                    # literal ?
      (?P<encoding>[qb])    # either a "q" or a "b", case insensitive
      \?                    # literal ?
      (?P<encoded>.*?)      # non-greedy up to the next ?= is the encoded string
      \?=                   # literal ?=
      ''', re.VERBOSE | re.IGNORECASE | re.MULTILINE)

Since only 's' and 'i' have other lowercase characters, and the only letters this pattern matches case-insensitively are 'q' and 'b', this is not a real bug. But using re.ASCII would be safer.

Additionally, email.util has contained the same pattern since 10 years ago, and it is not used anywhere. It should be removed.

----------
components: Regular Expressions
messages: 303612
nosy: ezio.melotti, inada.naoki, mrabarnett
priority: normal
severity: normal
status: open
title: email.header uses re.IGNORECASE without re.ASCII
versions: Python 3.7

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue31677>
_______________________________________
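
To illustrate the point about re.IGNORECASE and re.ASCII, here is a minimal sketch. It is not the actual patch: the names PATTERN, FLAGS, ecre_unicode, ecre_ascii and matches_ignorecase are made up for this example. It shows that a str pattern 's' compiled with re.IGNORECASE also matches U+017F (LATIN SMALL LETTER LONG S) unless re.ASCII is added, while the quoted pattern behaves the same either way because its only letters are 'q' and 'b'.

```python
import re

# The pattern quoted above, kept as a string so it can be compiled twice.
PATTERN = r'''
  =\?                   # literal =?
  (?P<charset>[^?]*?)   # non-greedy up to the next ? is the charset
  \?                    # literal ?
  (?P<encoding>[qb])    # either a "q" or a "b", case insensitive
  \?                    # literal ?
  (?P<encoded>.*?)      # non-greedy up to the next ?= is the encoded string
  \?=                   # literal ?=
  '''
FLAGS = re.VERBOSE | re.IGNORECASE | re.MULTILINE

# Hypothetical names: the current flags versus the flags with re.ASCII added.
ecre_unicode = re.compile(PATTERN, FLAGS)
ecre_ascii = re.compile(PATTERN, FLAGS | re.ASCII)


def matches_ignorecase(letter, candidate, ascii_only=False):
    """Return True if `candidate` fully matches `letter` case-insensitively."""
    flags = re.IGNORECASE | (re.ASCII if ascii_only else 0)
    return re.fullmatch(letter, candidate, flags) is not None


LONG_S = '\u017f'  # LATIN SMALL LETTER LONG S

# Without re.ASCII, 's' also matches the non-ASCII long s.
print(matches_ignorecase('s', LONG_S))
# With re.ASCII, case-insensitive matching is limited to ASCII letters.
print(matches_ignorecase('s', LONG_S, ascii_only=True))

# The encoded-word pattern gives the same result with either set of flags.
sample = '=?utf-8?Q?Hello_World?='
for ecre in (ecre_unicode, ecre_ascii):
    m = ecre.search(sample)
    print(m.group('charset'), m.group('encoding'), m.group('encoded'))
```

Adding re.ASCII would not change the behaviour for encoded words such as the sample above; it only guards against surprises if the character class is ever extended to letters that have non-ASCII case variants.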