R. David Murray <rdmur...@bitdance.com> added the comment:

The problem is that you are starting with different inputs.  Unicode strings 
and bytes are different things, so parsing them can produce different 
results.  The fact of the matter is that email messages are defined to be 
bytes, so parsing a unicode string as if it were an email message is just 
asking for errors anyway.  The string parsing methods are really only provided 
for backward compatibility and historical reasons.
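
As a rough illustration of the two starting points (a minimal sketch, not 
taken from the issue: the message bytes and addresses are made up, and 
policy.default is my assumption about which policy is in use):

```python
import email
from email import policy

# Hypothetical 8bit message with a UTF-8 body.  In practice these bytes
# would come from a file opened in binary mode ("rb") or from a socket.
raw = (
    b"From: a@example.com\n"
    b"To: b@example.com\n"
    b"Subject: test\n"
    b"MIME-Version: 1.0\n"
    b'Content-Type: text/plain; charset="utf-8"\n'
    b"Content-Transfer-Encoding: 8bit\n"
    b"\n"
    b"caf\xc3\xa9\n"
)

# Preferred: hand the parser the bytes as received.
from_bytes = email.message_from_bytes(raw, policy=policy.default)
body_from_bytes = from_bytes.get_content()

# Legacy path: the caller has to pick an encoding before parsing, so the
# parser never sees the original bytes of the body.
text = raw.decode("utf-8")
from_str = email.message_from_string(text, policy=policy.default)
body_from_str = from_str.get_content()

# Print both so they can be compared; they are not guaranteed to match.
print(repr(body_from_bytes))
print(repr(body_from_str))
```

Reading the source in binary mode and using message_from_bytes (or 
BytesParser) avoids having to guess an encoding up front.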

I thought this was clear from the existing documentation, but clearly it isn't 
:)  I'll review a suggested doc change, but the thing to explain is not that 
parse and parsebytes might produce different results, but that parsing email 
from strings is not a good idea and will likely produce unexpected results for 
anything except the simplest non-MIME messages.

Note: the reason you got different checksums might have had to do with line 
endings, depending on how you calculated the checksums.  You should also 
consider using get_content instead of get_payload.  get_payload has a weird 
legacy API that doesn't always do what you think it will, and that might be 
another source of checksum issues.  But really, parsing a unicode 
representation of a MIME message is just likely to be buggy.
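
To make the get_payload/get_content difference concrete, here is a small 
sketch (again with a made-up message and assuming policy.default, which is 
what makes get_content available on the parsed message):

```python
import email
from email import policy

# Hypothetical message whose body is base64-encoded UTF-8 text.
raw = (
    b"From: a@example.com\n"
    b"To: b@example.com\n"
    b"Subject: test\n"
    b"MIME-Version: 1.0\n"
    b'Content-Type: text/plain; charset="utf-8"\n'
    b"Content-Transfer-Encoding: base64\n"
    b"\n"
    b"aGVsbG8gd29ybGQ=\n"
)

msg = email.message_from_bytes(raw, policy=policy.default)

# get_payload() without arguments returns the still-encoded body as a str.
encoded = msg.get_payload()

# get_payload(decode=True) undoes the transfer encoding and returns bytes.
decoded_bytes = msg.get_payload(decode=True)

# get_content() undoes the transfer encoding and the charset, returning str.
content = msg.get_content()

print(repr(encoded))
print(repr(decoded_bytes))
print(repr(content))
```

A checksum computed over the first value will not match one computed over 
either of the other two, so it matters which of these was hashed.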

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue39071>
_______________________________________