On Wed, 2 Jul 2014, John Hardin wrote:

On Wed, 2 Jul 2014, Philip Prindeville wrote:

 Given that it’s text/plain with an implicit charset=“us-ascii” and an
 implicit content-transfer-encoding of 7bit, the sequence &#x[0-9A-F]{4}
 wouldn’t really parse into a 16-bit character, would it? That would be a
 broken MUA that made such a leap...

Nope. The content-transfer-encoding only covers the *transfer* part of the process. Once the content reaches the MUA, the MUA can parse it further according to other encoding rules, such as these escape sequences for Unicode characters. That's perfectly valid. How else would you send, for example, a c-cedilla in Spanish text via a 7-bit-clean channel?
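
To make the two layers concrete, here is a minimal Python sketch. The sample body is made up for illustration, and html.unescape merely stands in for whatever entity decoding an MUA might apply; whether a given MUA does this for text/plain is a separate question. The body is pure ASCII, so it survives a 7bit transfer untouched, and the non-ASCII character only appears in the later decoding step.

```python
import html
import re

# Hypothetical message body (an assumption for illustration): pure ASCII,
# so a 7bit content-transfer-encoding carries it unchanged.
body = "gar&#x00E7;on"
assert body.isascii()

# The pattern quoted earlier in this thread matches the escape sequence.
NCR = re.compile(r"&#x[0-9A-F]{4}")
found = NCR.search(body) is not None

# Decoding step, separate from transfer: resolve numeric character
# references into the Unicode characters they name.
decoded = html.unescape(body)

print(found)
print(decoded)
```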

 Wouldn’t that normally render as the characters ‘&’, ‘#’, ‘x’, etc. rather
 than the UTF-16 or UTF-8 character with that hex value?

I'd only expect that in a very old MUA (i.e. one that does not support Unicode), or when the raw message content is displayed at the user's request.

...that said, I primarily use a text-based MUA, and it did not render Unicode glyphs for that sample...

--
 John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
 jhar...@impsec.org    FALaholic #11174     pgpk -a jhar...@impsec.org
 key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
-----------------------------------------------------------------------
  Of the twenty-two civilizations that have appeared in history,
  nineteen of them collapsed when they reached the moral state the
  United States is in now.                          -- Arnold Toynbee
-----------------------------------------------------------------------
 2 days until the 238th anniversary of the Declaration of Independence
