New submission from Tim Bell:

According to RFC 5322, an email address like this isn't valid:

    u...@example.com <u...@example.com>

(The display-name "u...@example.com" contains "@", which isn't in the set of atext characters used to form an atom.)

How it's handled by the email package varies by policy:

>>> import email
>>> from email.policy import default
>>> email.message_from_bytes(b'To: u...@example.com <u...@example.com>')['to']
'u...@example.com <u...@example.com>'
>>> email.message_from_bytes(b'To: u...@example.com <u...@example.com>',
...                          policy=default)['to']
'u...@example.com'
>>> email.message_from_bytes(b'To: u...@example.com <u...@example.com>',
...                          policy=default).defects
[]

The difference between the behaviour under the compat32 vs "default" policy may or may not be significant. However, if coupled with a further invalid feature, namely a space after the ">", here's what happens:

>>> email.message_from_bytes(b'To: u...@example.com <u...@example.com> ')['to']
'u...@example.com <u...@example.com> '
>>> email.message_from_bytes(b'To: u...@example.com <u...@example.com> ',
...                          policy=default)['to']
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/email/message.py", line 391, in __getitem__
    return self.get(name)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/email/message.py", line 471, in get
    return self.policy.header_fetch_parse(k, v)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/email/policy.py", line 162, in header_fetch_parse
    return self.header_factory(name, value)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/email/headerregistry.py", line 586, in __call__
    return self[name](name, value)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/email/headerregistry.py", line 197, in __new__
    cls.parse(value, kwds)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/email/headerregistry.py", line 337, in parse
    kwds['parse_tree'] = address_list = cls.value_parser(value)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/email/headerregistry.py", line 328, in value_parser
    address_list, value = parser.get_address_list(value)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/email/_header_value_parser.py", line 2368, in get_address_list
    token, value = get_invalid_mailbox(value, ',')
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/email/_header_value_parser.py", line 2166, in get_invalid_mailbox
    token, value = get_phrase(value)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/email/_header_value_parser.py", line 1770, in get_phrase
    token, value = get_word(value)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/email/_header_value_parser.py", line 1745, in get_word
    if value[0]=='"':
IndexError: string index out of range
>>> email.message_from_bytes(b'To: u...@example.com <u...@example.com> ',
...                          policy=default).defects
[]

I believe that the preferred behaviour would be to add a defect to the message object during parsing instead of throwing an exception when the invalid header value is accessed.

----------
components: email
messages: 296309
nosy: barry, r.david.murray, timb07
priority: normal
severity: normal
status: open
title: Exception parsing certain invalid email address headers
type: behavior
versions: Python 3.5, Python 3.6, Python 3.7

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue30701>
_______________________________________
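
Until the parser records a defect instead of raising, a caller can approximate the preferred behaviour described above by guarding header access. The following is only a rough sketch of that idea, not a fix for the email package: safe_get_header is a hypothetical helper name, and someone@example.com is a placeholder address standing in for the one in the report. It assumes that falling back to the stored, unparsed header value is acceptable to the caller.

```python
import email
from email.policy import default


def safe_get_header(msg, name):
    """Return (value, error) for a header without letting parse errors escape.

    Hypothetical helper, not part of the email package.
    """
    try:
        # Under the "default" policy the header is parsed on access,
        # which is where the IndexError in this report is raised.
        return msg[name], None
    except Exception as exc:  # deliberately broad: any parser failure
        # Fall back to the stored, unparsed value of the first matching header.
        for key, raw in msg.raw_items():
            if key.lower() == name.lower():
                return raw, exc
        return None, exc


# Placeholder address; the trailing space after ">" mirrors the failing case.
msg = email.message_from_bytes(
    b'To: someone@example.com <someone@example.com> ', policy=default)
value, error = safe_get_header(msg, 'to')
if error is not None:
    print('header could not be parsed:', repr(error))
print(repr(value))
```

On an interpreter where the header parses without error, error is None and value is the parsed header; where the exception is raised, value is the raw header text and error holds the exception, so the caller can log it much as it would a defect.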