On Wed, 2 Jul 2014 11:37:33 -0700 (PDT) John Hardin <jhar...@impsec.org> wrote:
> Nope. The content-transfer-encoding is only for the *transfer* part
> of the process. Once the content reaches the MUA that content can be
> further parsed by the MUA according to other encoding rules, such as
> these escape sequences for Unicode characters.

I don't think so. Any MUA that tries to convert "е" to a Unicode
character in a text/plain part with implicit US-ASCII charset and 7bit
content transfer encoding is broken. An MUA should display exactly "е"
in this situation. It's a different story for text/html parts, of
course.

> That's perfectly valid. How else would you send, for example, a
> c-cedille in spanish text via a 7-bit-clean channel?

With the appropriate charset and content-transfer-encoding, such as
ISO-8859-1, quoted-printable, and =E7.

> I would say that's more a case of those characters shouldn't be
> present if the language is en-us than an encoding issue. The presence
> of lots of those is either a sign that the text isn't English, or is
> obfuscated.

How do you reliably tell the language of the message? I would say the
presence of ꯍ in a text/plain part is either a bug in spam-generating
software or a researcher trying to send something to a colleague. :)

Regards,

David.
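
For illustration only, here is a minimal sketch of the ISO-8859-1 plus quoted-printable route mentioned above, using Python's standard `quopri` module. The helper names `encode_qp_latin1` and `decode_qp_latin1` and the sample word are assumptions made up for this example; they are not taken from any MUA or from SpamAssassin.

```python
import quopri


def encode_qp_latin1(text: str) -> bytes:
    """Encode text as ISO-8859-1 bytes, then as quoted-printable.

    The result is pure 7-bit ASCII, so it survives a 7-bit-clean channel.
    (Hypothetical helper name, for illustration only.)
    """
    raw = text.encode("iso-8859-1")   # c-cedilla becomes the single byte 0xE7
    return quopri.encodestring(raw)   # 0xE7 becomes the three ASCII chars "=E7"


def decode_qp_latin1(data: bytes) -> str:
    """Reverse the transfer encoding, then apply the declared charset."""
    return quopri.decodestring(data).decode("iso-8859-1")


if __name__ == "__main__":
    body = encode_qp_latin1("fa\u00e7ade")
    print(body)                    # the c-cedilla appears as =E7
    print(decode_qp_latin1(body))  # round-trips to the original text
```

A message body encoded this way would be sent with `Content-Type: text/plain; charset=ISO-8859-1` and `Content-Transfer-Encoding: quoted-printable`, so the receiving MUA knows how to undo both steps without guessing at escape sequences in the text itself.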