Martijn Pieters <m...@python.org> added the comment:
While RFC2047 clearly states that an encoder MUST NOT split multi-byte encodings in the middle of a character (section 5, "Each 'encoded-word' MUST represent an integral number of characters. A multi-octet character may not be split across adjacent 'encoded-word's."), it also states that, to fit length restrictions, CRLF SPACE is used as a delimiter between encoded words (section 2, "If it is desirable to encode more text than will fit in an 'encoded-word' of 75 characters, multiple 'encoded-word's (separated by CRLF SPACE) may be used.").

In section 6.2 it states:

"When displaying a particular header field that contains multiple 'encoded-word's, any 'linear-white-space' that separates a pair of adjacent 'encoded-word's is ignored. (This is to allow the use of multiple 'encoded-word's to represent long strings of unencoded text, without having to separate 'encoded-word's where spaces occur in the unencoded text.)"

(linear-white-space is the RFC822 term for foldable whitespace.)

The parser is leaving spaces between two encoded-word tokens in place, where it must remove them instead. It already does so correctly for unstructured headers, just not in get_bare_quoted_string, get_atom and get_dot_atom.

Then there is Postel's law (*be liberal in what you accept from others*), and the email package already applies that principle to RFC2047 elsewhere: RFC2047 also states that "An 'encoded-word' MUST NOT appear within a 'quoted-string'.", yet email._header_value_parser's handling of quoted-string will process EW sections.
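To make the section 6.2 rule concrete, here is a minimal standalone sketch of the expected behaviour. It is not the email package's implementation: the names `_EW`, `_decode_word` and `decode_encoded_words` are hypothetical, the encoded-word pattern is simplified (no RFC2231 language suffix, no length checks), and folding in the unencoded parts is left untouched.

```python
import base64
import binascii
import re

# Simplified encoded-word pattern: =?charset?encoding?encoded-text?=
# (hypothetical helper, not taken from email._header_value_parser)
_EW = re.compile(r"=\?([^?\s]+)\?([bBqQ])\?([^?\s]*)\?=")


def _decode_word(charset, encoding, text):
    """Decode the payload of a single encoded-word to str."""
    if encoding.lower() == "b":
        raw = base64.b64decode(text)
    else:
        # header=True makes a2b_qp treat "_" as a space, as the Q encoding requires.
        raw = binascii.a2b_qp(text.encode("ascii"), header=True)
    return raw.decode(charset, errors="replace")


def decode_encoded_words(value):
    """Decode encoded-words in value, dropping whitespace between adjacent ones."""
    out = []
    pos = 0
    prev_was_ew = False
    for match in _EW.finditer(value):
        gap = value[pos:match.start()]
        # RFC2047 section 6.2: whitespace (including CRLF SPACE folds) that
        # separates two adjacent encoded-words is ignored. Any other gap is
        # ordinary text and is kept verbatim.
        if not (prev_was_ew and gap.strip(" \t\r\n") == ""):
            out.append(gap)
        out.append(_decode_word(*match.groups()))
        pos = match.end()
        prev_was_ew = True
    out.append(value[pos:])
    return "".join(out)


folded = "=?utf-8?q?Hello?=\r\n =?utf-8?q?_World?="
print(decode_encoded_words(folded))
```

The space in the decoded result comes from the `_` inside the second encoded-word, not from the CRLF SPACE fold between the two tokens, which is the distinction the parser needs to respect in get_bare_quoted_string, get_atom and get_dot_atom as well.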
----------
title: email.parser / email.policy does correctly handle multiple RFC2047 encoded-word tokens across RFC5322 folded headers -> email.parser / email.policy does not correctly handle multiple RFC2047 encoded-word tokens across RFC5322 folded headers

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue35547>
_______________________________________