New submission from Mark Sapiro:

Given an admittedly defective message part (the continuation line of the 
folded Content-Type: header isn't indented) with the following headers and body

-------------------------------
Content-Disposition: inline; filename="04EBD_xxxx.xxxx_A546BB.zip"
Content-Type: application/x-rar-compressed; x-unix-mode=0600;
name="04EBD_xxxx.xxxx_A546BB.zip"
Content-Transfer-Encoding: base64

UmFyIRoHAM+QcwAADQAAAAAAAABKRXQgkC4ApAMAAEAHAAACJLrQXYFUfkgdMwkAIAAAAGEw
ZjEwZi5qcwDwrrI/DB2NDI0TzcGb3Gpb8HzsS0UlpwELvdyWnVaBQt7Sl2zbJpx1qqFCGGk6
...
-------------------------------

email.parser parses the headers as

-------------------------------
Content-Disposition: inline; filename="04EBD_xxxx.xxxx_A546BB.zip"
Content-Type: application/x-rar-compressed; x-unix-mode=0600;
-------------------------------

and the body as

-------------------------------
name="04EBD_xxxx.xxxx_A546BB.zip"
Content-Transfer-Encoding: base64

UmFyIRoHAM+QcwAADQAAAAAAAABKRXQgkC4ApAMAAEAHAAACJLrQXYFUfkgdMwkAIAAAAGEw
ZjEwZi5qcwDwrrI/DB2NDI0TzcGb3Gpb8HzsS0UlpwELvdyWnVaBQt7Sl2zbJpx1qqFCGGk6
...
-------------------------------

and shows no defects.

This is wrong. RFC 5322 section 2.1 is clear that everything up to the first 
empty line is headers. Even the docstring in the email/parser.py module says 
"The header block is terminated either by the end of the string or by a blank 
line."
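
For anyone who wants to reproduce this, here is a minimal sketch. It makes
some assumptions that are not in the report itself: the part is parsed as a
standalone message with email.message_from_string() and the default policy,
and only the first base64 line of the (truncated) body is included. The
variable names RAW and msg are just for illustration. The defects list is
printed without comment, since what it contains may depend on the Python
version.

```python
import email

# Defective part from the report: the continuation of Content-Type
# ("name=...") is not indented, so it is not a valid folded header line.
# Assumption: only the first base64 line of the truncated body is used.
RAW = (
    'Content-Disposition: inline; filename="04EBD_xxxx.xxxx_A546BB.zip"\n'
    'Content-Type: application/x-rar-compressed; x-unix-mode=0600;\n'
    'name="04EBD_xxxx.xxxx_A546BB.zip"\n'
    'Content-Transfer-Encoding: base64\n'
    '\n'
    'UmFyIRoHAM+QcwAADQAAAAAAAABKRXQgkC4ApAMAAEAHAAACJLrQXYFUfkgdMwkAIAAAAGEw\n'
)

msg = email.message_from_string(RAW)

# Header names the parser actually recognized.
print(msg.keys())

# The Content-Transfer-Encoding line ended up in the body, not the headers.
print(msg['Content-Transfer-Encoding'])

# First line of what the parser considers the body.
print(msg.get_payload().splitlines()[0])

# Whatever defects the parser recorded for this message.
print(msg.defects)
```

In this sketch the header list stops at Content-Type and the payload starts
with the name="..." line, which matches the split described above.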

Since the message is defective, it isn't clear what the correct result should 
be, but I think

Headers:
Content-Disposition: inline; filename="04EBD_xxxx.xxxx_A546BB.zip"
Content-Type: application/x-rar-compressed; x-unix-mode=0600;
Content-Transfer-Encoding: base64

Body:
UmFyIRoHAM+QcwAADQAAAAAAAABKRXQgkC4ApAMAAEAHAAACJLrQXYFUfkgdMwkAIAAAAGEw
ZjEwZi5qcwDwrrI/DB2NDI0TzcGb3Gpb8HzsS0UlpwELvdyWnVaBQt7Sl2zbJpx1qqFCGGk6
...

Defects:
name="04EBD_xxxx.xxxx_A546BB.zip"

would be more appropriate. The problem is that the Content-Transfer-Encoding: 
base64 header doesn't end up in the parsed headers, so 
get_payload(decode=True) doesn't decode the base64-encoded body, which makes 
malware recognition difficult.
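
To illustrate why the missing header matters, here is a small sketch
comparing against a repaired copy of the same part, where the continuation
line is indented by one space so that it is a valid folded header. The
repaired text, the names FIXED, fixed_msg, cte and data, and the use of only
the first base64 line of the truncated body are my assumptions for the sake
of the example; this is not a proposed fix for the parser.

```python
import email

# Same part as in the report, but with the continuation line indented
# so that it is a properly folded Content-Type header (assumed repair).
FIXED = (
    'Content-Disposition: inline; filename="04EBD_xxxx.xxxx_A546BB.zip"\n'
    'Content-Type: application/x-rar-compressed; x-unix-mode=0600;\n'
    ' name="04EBD_xxxx.xxxx_A546BB.zip"\n'
    'Content-Transfer-Encoding: base64\n'
    '\n'
    'UmFyIRoHAM+QcwAADQAAAAAAAABKRXQgkC4ApAMAAEAHAAACJLrQXYFUfkgdMwkAIAAAAGEw\n'
)

fixed_msg = email.message_from_string(FIXED)

# The transfer encoding is now seen as a header ...
cte = fixed_msg['Content-Transfer-Encoding']
print(cte)

# ... so decode=True base64-decodes the body instead of returning the
# raw text. The first bytes are the RAR archive signature.
data = fixed_msg.get_payload(decode=True)
print(data[:7])
```

With the header in place, the decoded payload begins with the RAR signature
bytes, which is what a content scanner would need to see.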

----------
components: Library (Lib)
messages: 262750
nosy: msapiro
priority: normal
severity: normal
status: open
title: email.parser stops parsing headers too soon.
type: behavior
versions: Python 3.4

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue26686>
_______________________________________