Martijn Pieters <m...@python.org> added the comment:

While RFC2047 clearly states that an encoder MUST NOT split multi-byte 
encodings in the middle of a character (section 5, "Each 'encoded-word' MUST 
represent an integral number of characters. A multi-octet character may not be 
split across adjacent 'encoded-word's."), it also states that to fit length 
restrictions, CRLF SPACE is used as a delimiter between encoded words (section 
2, "If it is desirable to encode more text than will fit in an 'encoded-word' 
of 75 characters, multiple 'encoded-word's (separated by CRLF SPACE) may be 
used."). In section 6.2 it states:

   When displaying a particular header field that contains multiple
   'encoded-word's, any 'linear-white-space' that separates a pair of
   adjacent 'encoded-word's is ignored.  (This is to allow the use of
   multiple 'encoded-word's to represent long strings of unencoded text,
   without having to separate 'encoded-word's where spaces occur in the
   unencoded text.)

(linear-white-space is the RFC822 term for foldable whitespace).
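
To make the two rules concrete, here is a minimal sketch of an encoder that 
respects both of them. The names encode_words and decode_words are 
hypothetical helpers written for this illustration, not part of the email 
package, and the sketch assumes a stateless charset such as UTF-8 and base64 
("b") encoding. Decoding uses the legacy email.header.decode_header, which 
drops the whitespace between adjacent encoded-words.

```python
import base64
from email.header import decode_header

MAX_EW_LEN = 75  # RFC 2047 section 2: upper bound for one encoded-word


def encode_words(text, charset="utf-8"):
    """Hypothetical helper: fold text into base64 encoded-words.

    Chunks are built one character at a time, so a multi-octet character
    is never split across two encoded-words (RFC 2047 section 5).
    """
    prefix, suffix = f"=?{charset}?b?", "?="

    def wrap(raw):
        return prefix + base64.b64encode(raw).decode("ascii") + suffix

    words, chunk = [], b""
    for ch in text:
        candidate = chunk + ch.encode(charset)
        # base64 emits 4 output characters per 3 input bytes, rounded up
        encoded_len = len(prefix) + 4 * ((len(candidate) + 2) // 3) + len(suffix)
        if chunk and encoded_len > MAX_EW_LEN:
            words.append(wrap(chunk))
            chunk = ch.encode(charset)
        else:
            chunk = candidate
    if chunk:
        words.append(wrap(chunk))
    # Adjacent encoded-words are separated by CRLF SPACE (section 2)
    return "\r\n ".join(words)


def decode_words(folded):
    """Hypothetical helper: decode a folded value for display."""
    return "".join(
        part.decode(cs or "ascii") if isinstance(part, bytes) else part
        for part, cs in decode_header(folded)
    )


text = "日本語のテキスト" * 10
folded = encode_words(text)
roundtrip = decode_words(folded)
```

Here roundtrip equals text: the CRLF SPACE separators added by the encoder 
do not reappear as spaces in the decoded value, which is the behaviour 
section 6.2 asks for.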

The parser is leaving spaces between two encoded-word tokens in place, where it 
must remove them instead. It already removes them correctly for unstructured 
headers, just not in get_bare_quoted_string, get_atom and get_dot_atom.
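
A short sketch of the difference, using a made-up message (the addresses and 
header values are illustrative only). Subject is an unstructured header, so 
the whitespace between the two encoded-words is dropped. The To header goes 
through the structured-header code paths named above, and its result depends 
on the Python version in use, so no output is claimed for it here.

```python
from email import message_from_string
from email.policy import default

# Two encoded-words separated by a fold (CRLF SPACE) in each header.
raw = (
    "Subject: =?utf-8?q?Hello?=\r\n"
    " =?utf-8?q?World?=\r\n"
    "To: =?utf-8?q?Hello?=\r\n"
    " =?utf-8?q?World?= <user@example.com>\r\n"
    "\r\n"
    "body\r\n"
)

msg = message_from_string(raw, policy=default)

# Unstructured header: the separating whitespace is ignored on decode.
subject = str(msg["Subject"])

# Structured header: on affected versions the space is kept in the
# display name; the value is printed rather than assumed.
display_name = msg["To"].addresses[0].display_name

print(repr(subject))
print(repr(display_name))
```

With a correct parser both values would read the same, since section 6.2 
does not distinguish between header types.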

Then there is Postel's law (*be liberal in what you accept from others*), and 
the email package already applies that principle to RFC2047 elsewhere; RFC2047 
also states that "An 'encoded-word' MUST NOT appear within a 'quoted-string'." 
yet email._header_value_parser's handling of quoted-string will process EW 
sections.

----------
title: email.parser / email.policy does correctly handle multiple RFC2047 
encoded-word tokens across RFC5322 folded headers -> email.parser / 
email.policy does not correctly handle multiple RFC2047 encoded-word tokens 
across RFC5322 folded headers

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue35547>
_______________________________________