[issue1974] email.MIMEText.MIMEText.as_string incorrectly folding long subject header

Ori Avtalion Wed, 25 Jun 2008 12:43:05 -0700

Ori Avtalion <[EMAIL PROTECTED]> added the comment:

I think there's been a little misinterpretation of the standard in the
comments above.


It's important to note that RFC 2822 basically defines folding as
"adding a CRLF before an existing whitespace in the original message". 

See http://tools.ietf.org/html/rfc2822#section-2.2.3

It does *not* allow prepending folded lines with extra characters that
were not in the original message such as '\t' or ' '.
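
To make that definition concrete, here is a minimal sketch of folding in the RFC 2822 sense: the CRLF is inserted in front of whitespace that is already in the value, so removing every CRLF that precedes whitespace gives back the original text. The names fold, unfold and max_len are made up for illustration and are not part of the email package, and the greedy choice of fold points is only an assumption about how one might pick them.

```python
import re

CRLF = '\r\n'

def fold(value, max_len=78):
    """Fold by inserting CRLF before whitespace that already exists."""
    # Candidate fold points: the start of each whitespace run (not at index 0).
    points = [m.start() for m in re.finditer(r'[ \t]+', value) if m.start() > 0]
    lines, start, best = [], 0, None
    for p in points + [len(value)]:
        # If the line would be too long by the time we reach p,
        # break at the last fold point we saw on this line.
        if p - start > max_len and best is not None:
            lines.append(value[start:best])
            start, best = best, None
        if p < len(value):
            best = p
    lines.append(value[start:])
    return CRLF.join(lines)

def unfold(folded):
    """Remove every CRLF that is immediately followed by whitespace."""
    return re.sub(r'\r\n(?=[ \t])', '', folded)

original = 'aaa\tbbb    ccc ddd'
folded = fold(original, max_len=8)
print(repr(folded))                 # 'aaa\tbbb\r\n    ccc\r\n ddd'
print(unfold(folded) == original)   # True
```

Note that the tab and the four-space run survive untouched; nothing is added except the CRLFs.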

This is exactly what _encode_chunks does in header.py:
    joiner = NL + self._continuation_ws

(Note that the email package docs and Header docstring use the word
'prepend', which reflects the error in the code.)
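
As a rough illustration of why prepending is a problem, the sketch below is a deliberately simplified stand-in for that joiner. It is not the actual header.py code: it assumes the value has already been split into chunks with the separating whitespace dropped, and it glues them back together with NL + continuation_ws. The names naive_fold and unfold are hypothetical.

```python
import re

NL = '\n'

def naive_fold(chunks, continuation_ws):
    # Same shape as `joiner = NL + self._continuation_ws`: the whitespace
    # after each line break comes from the folder, not from the message.
    joiner = NL + continuation_ws
    return joiner.join(chunks)

def unfold(folded):
    # Unfolding: drop any line break that is followed by whitespace
    # (a bare '\n' is accepted here to match the NL used above).
    return re.sub(r'\r?\n(?=[ \t])', '', folded)

original = 'first\tsecond'
chunks = original.split('\t')   # the tab is lost at the split

print(repr(unfold(naive_fold(chunks, ' '))))    # 'first second'
print(repr(unfold(naive_fold(chunks, '\t'))))   # 'first\tsecond'
```

The round trip only succeeds when continuation_ws happens to equal the whitespace that was dropped, which is a coincidence rather than a property of the algorithm.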

With a correct implementation, why would I want a choice of which type
of character to line-break on when folding?
The whole notion of controlling the value of continuation_ws seems wrong.

However, changing the default continuation_ws to ' ', as the patch
suggests, will output syntactically correct headers in the majority of
cases (due to other bugs that remove trailing whitespace and merge
consecutive whitespace into one character).


All in all, I agree with the change of the default continuation_ws due
to its lucky side-effects, but as Barry hinted, the algorithm needs some
serious work to really output valid headers.

Some examples of the good and bad behaviors:

>>> from email.Header import Header
>>> l = ['<[EMAIL PROTECTED]>' % i for i in range(8)]

>>> # this turns out fine
>>> Header(' '.join(l), continuation_ws=' ').encode()
'<[EMAIL PROTECTED]> <[EMAIL PROTECTED]> <[EMAIL PROTECTED]> <[EMAIL PROTECTED]> <[EMAIL PROTECTED]> <[EMAIL PROTECTED]>\n <[EMAIL PROTECTED]> <[EMAIL PROTECTED]>'

# This does not fold even though it should
>>> Header('\t'.join(l), continuation_ws=' ').encode()
'<[EMAIL PROTECTED]>\t<[EMAIL PROTECTED]>\t<[EMAIL PROTECTED]>\t<[EMAIL PROTECTED]>\t<[EMAIL PROTECTED]>\t<[EMAIL PROTECTED]>\t<[EMAIL PROTECTED]>\t<[EMAIL PROTECTED]>'

# And here the 4-char whitespace is collapsed into a single space
>>> Header('    '.join(l), continuation_ws=' ').encode()
'<[EMAIL PROTECTED]> <[EMAIL PROTECTED]> <[EMAIL PROTECTED]> <[EMAIL PROTECTED]> <[EMAIL PROTECTED]> <[EMAIL PROTECTED]>\n <[EMAIL PROTECTED]> <[EMAIL PROTECTED]>'

----------
nosy: +salty-horse

_______________________________________
Python tracker <[EMAIL PROTECTED]>
<http://bugs.python.org/issue1974>
_______________________________________