[issue1079] decode_header does not follow RFC 2047

Ralf Schlatterbeck Mon, 02 Jan 2012 08:09:59 -0800

Ralf Schlatterbeck <r...@runtux.com> added the comment:

Maybe it would be a good start to include the examples at the end of RFC 2047 
in the regression tests? These examples at least support the case that a ')' 
may immediately follow an encoded word:


encoded form                                displayed as
(=?ISO-8859-1?Q?a?= =?ISO-8859-1?Q?b?=)     (ab)

When trying this in Python 2.7:

>>> decode_header ('(=?ISO-8859-1?Q?a?= =?ISO-8859-1?Q?b?=)')
[('(', None), ('a', 'iso-8859-1'), ('=?ISO-8859-1?Q?b?=)', None)]

This fails: the second encoded word comes back undecoded, still glued to the 
closing parenthesis. So I consider this a bug.

Note that although RFC 2047 is vague about how to interpret two encoded 
words that follow each other without whitespace, these *are* seen in the wild 
and *are* interpreted correctly by the mailers I've tested: mutt, Thunderbird, 
Exchange in various versions; even Lotus Notes seems to get this right. So I 
guess Python should be "liberal in what you accept" and parse something like 
'(=?ISO-8859-1?Q?a?==?ISO-8859-1?Q?b?=)'
into
[ ('(', None)
, ('a', 'iso-8859-1')
, ('b', 'iso-8859-1')
, (')', None)
]
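
A minimal sketch of what such a liberal parser could look like, assuming
Python 3. The names ENCODED_WORD and liberal_decode_header are hypothetical
and not part of the email package; this is not how email.header is
implemented, only an illustration of matching encoded words without
requiring whitespace after the closing "?=". Unlike decode_header, it
returns already-decoded text rather than raw bytes.

```python
import base64
import binascii
import re

# Hypothetical pattern: an encoded word anywhere in the text, with no
# requirement that whitespace or end-of-string follows the closing "?=".
ENCODED_WORD = re.compile(r'=\?([^?\s]+)\?([QqBb])\?([^?\s]*)\?=')


def liberal_decode_header(header):
    """Return a list of (text, charset) pairs; charset is None for plain text."""
    parts = []
    pos = 0
    prev_encoded = False
    for match in ENCODED_WORD.finditer(header):
        literal = header[pos:match.start()]
        # RFC 2047: whitespace between two encoded words is dropped.
        if literal and not (prev_encoded and literal.isspace()):
            parts.append((literal, None))
        charset, encoding, payload = match.groups()
        if encoding.lower() == 'q':
            # header=True makes "_" decode to a space, as the Q encoding requires.
            raw = binascii.a2b_qp(payload, header=True)
        else:
            raw = base64.b64decode(payload)
        # An unknown charset raises LookupError here; a real parser would
        # need a fallback.
        parts.append((raw.decode(charset), charset.lower()))
        prev_encoded = True
        pos = match.end()
    if header[pos:]:
        parts.append((header[pos:], None))
    return parts
```

With this sketch, both '(=?ISO-8859-1?Q?a?==?ISO-8859-1?Q?b?=)' and the
RFC 2047 form with a space between the two encoded words yield the four
pairs listed above, and joining the text of each pair gives '(ab)'.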

----------
nosy: +runtux

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue1079>
_______________________________________
_______________________________________________
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
