On Wed, 2 Jul 2014, John Hardin wrote:
On Wed, 2 Jul 2014, Philip Prindeville wrote:
Given that it’s text/plain with an implicit charset="us-ascii" and an
implicit Content-Transfer-Encoding of 7bit, the sequence &#x[0-9A-F]{4}
wouldn’t really parse into a 16-bit character, would it? Only a broken
MUA would make such a leap...
Nope. The content-transfer-encoding is only for the *transfer* part of the
process. Once the content reaches the MUA that content can be further parsed
by the MUA according to other encoding rules, such as these escape sequences
for Unicode characters. That's perfectly valid. How else would you send, for
example, a c-cedilla in Spanish text via a 7-bit-clean channel?
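
To make the distinction between the transfer layer and later parsing concrete, here is a minimal sketch in Python. It is an illustration only: the sample body is made up, Python's html.unescape stands in for whatever escape handling a Unicode-aware reader might apply, and the trailing semicolon in the pattern is an assumption (the pattern quoted above has none). It does not show what any particular MUA actually does with text/plain.

```python
import html
import re

# Hypothetical sample body: nothing but 7-bit ASCII characters,
# as a text/plain, us-ascii, 7bit message would carry it.
raw_body = "Gar&#x00E7;on, fa&#x00E7;ade"

# The transfer layer only ever sees ASCII bytes; this would raise
# UnicodeEncodeError if the body held anything outside us-ascii.
wire_bytes = raw_body.encode("ascii")

# A reader that chooses to interpret the escapes would display this.
# html.unescape is a stand-in here, not a claim about real MUA behavior.
rendered = html.unescape(raw_body)

# The pattern from the thread, with a trailing semicolon assumed.
NCR = re.compile(r"&#x[0-9A-F]{4};")
escape_count = len(NCR.findall(raw_body))

print(wire_bytes)    # the literal '&', '#', 'x', ... characters
print(rendered)      # the same text with the escapes decoded
print(escape_count)  # number of escape sequences found
```

The bytes on the wire stay 7-bit clean either way; whether the reader sees the literal escape or a glyph depends entirely on whether the displaying program performs that second decoding step.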
Wouldn’t that normally render as the characters ‘&’, ‘#’, ‘x’, etc. rather
than the UTF-16 or UTF-8 character with that hex value?
I'd only expect that in a very old MUA (i.e. one that does not support Unicode),
or when the raw message content is displayed at user request.
...that said, I primarily use a text-based MUA, and it did not render
Unicode glyphs for that sample...
--
John Hardin KA7OHZ http://www.impsec.org/~jhardin/
jhar...@impsec.org FALaholic #11174 pgpk -a jhar...@impsec.org
key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C AF76 D822 E6E6 B873 2E79
-----------------------------------------------------------------------
Of the twenty-two civilizations that have appeared in history,
nineteen of them collapsed when they reached the moral state the
United States is in now. -- Arnold Toynbee
-----------------------------------------------------------------------
2 days until the 238th anniversary of the Declaration of Independence