New submission from Mark Sapiro:

Given an admittedly defective (the folded Content-Type: isn't indented) message part with the following headers/body
-------------------------------
Content-Disposition: inline; filename="04EBD_xxxx.xxxx_A546BB.zip"
Content-Type: application/x-rar-compressed; x-unix-mode=0600;
name="04EBD_xxxx.xxxx_A546BB.zip"
Content-Transfer-Encoding: base64

UmFyIRoHAM+QcwAADQAAAAAAAABKRXQgkC4ApAMAAEAHAAACJLrQXYFUfkgdMwkAIAAAAGEw
ZjEwZi5qcwDwrrI/DB2NDI0TzcGb3Gpb8HzsS0UlpwELvdyWnVaBQt7Sl2zbJpx1qqFCGGk6
...
-------------------------------

email.parser parses the headers as

-------------------------------
Content-Disposition: inline; filename="04EBD_xxxx.xxxx_A546BB.zip"
Content-Type: application/x-rar-compressed; x-unix-mode=0600;
-------------------------------

and the body as

-------------------------------
name="04EBD_xxxx.xxxx_A546BB.zip"
Content-Transfer-Encoding: base64

UmFyIRoHAM+QcwAADQAAAAAAAABKRXQgkC4ApAMAAEAHAAACJLrQXYFUfkgdMwkAIAAAAGEw
ZjEwZi5qcwDwrrI/DB2NDI0TzcGb3Gpb8HzsS0UlpwELvdyWnVaBQt7Sl2zbJpx1qqFCGGk6
...
-------------------------------

and shows no defects.

This is wrong. RFC 5322 section 2.1 is clear that everything up to the first empty line is headers. Even the docstring in the email/parser.py module says "The header block is terminated either by the end of the string or by a blank line."

Since the message is defective, it isn't clear what the correct result should be, but I think

Headers:
Content-Disposition: inline; filename="04EBD_xxxx.xxxx_A546BB.zip"
Content-Type: application/x-rar-compressed; x-unix-mode=0600;
Content-Transfer-Encoding: base64

Body:
UmFyIRoHAM+QcwAADQAAAAAAAABKRXQgkC4ApAMAAEAHAAACJLrQXYFUfkgdMwkAIAAAAGEw
ZjEwZi5qcwDwrrI/DB2NDI0TzcGb3Gpb8HzsS0UlpwELvdyWnVaBQt7Sl2zbJpx1qqFCGGk6
...

Defects:
name="04EBD_xxxx.xxxx_A546BB.zip"

would be more appropriate.

The problem is that the Content-Transfer-Encoding: base64 header is not in the headers, so get_payload(decode=True) doesn't decode the base64-encoded body, which makes malware recognition difficult.
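
The behaviour described above can be reproduced with a short script. What follows is only a minimal sketch: the sample message, the file name "sample.zip", the short base64 body, and the helper split_at_blank_line() are all made up for illustration and are not part of the email package or of any proposed patch. The first half parses a part whose folded Content-Type line is not indented. The second half splits the same text at the first empty line, as RFC 5322 section 2.1 describes, and sets the stray line aside as a defect.

```python
import email
import re

# Hypothetical sample modelled on the report: the continuation line
# 'name=...' starts in column 0 instead of being indented.
RAW = (
    'Content-Disposition: inline; filename="sample.zip"\n'
    'Content-Type: application/x-rar-compressed; x-unix-mode=0600;\n'
    'name="sample.zip"\n'
    'Content-Transfer-Encoding: base64\n'
    '\n'
    'aGVsbG8gd29ybGQ=\n'
)

# Parse with the stock parser, as in the report.
msg = email.message_from_string(RAW)
cte = msg['Content-Transfer-Encoding']   # header the parser did not see
payload = msg.get_payload(decode=True)   # raw bytes, not base64-decoded

# Illustrative pattern only: a header field line ("name:") or a folded
# continuation line (leading space or tab).
HEADER_LINE = re.compile(r'^[\x21-\x39\x3b-\x7e]+:|^[ \t]')


def split_at_blank_line(text):
    """Split at the first empty line and set aside lines in the header
    block that are neither a header field nor a folded continuation."""
    head, _, body = text.partition('\n\n')
    headers, defects = [], []
    for line in head.split('\n'):
        (headers if HEADER_LINE.match(line) else defects).append(line)
    return headers, body, defects


headers, body, defects = split_at_blank_line(RAW)

# Re-parse the cleaned header block; the base64 header is now visible.
fixed = email.message_from_string('\n'.join(headers) + '\n\n' + body)
decoded = fixed.get_payload(decode=True)

print('stock parser CTE:', cte)
print('stock parser payload:', payload)
print('defective lines:', defects)
print('decoded after split:', decoded)
```

Dropping the stray line loses the name parameter of Content-Type, which matches the "Defects:" entry in the result suggested above. Whether the line should instead be treated as an unindented continuation is a separate design question that this sketch does not try to answer.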
----------
components: Library (Lib)
messages: 262750
nosy: msapiro
priority: normal
severity: normal
status: open
title: email.parser stops parsing headers too soon.
type: behavior
versions: Python 3.4

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue26686>
_______________________________________