On Sun, 1 Oct 2017, Kerim Aydin wrote:
> Could someone look at the index for cases:
> https://faculty.washington.edu/kerim/nomic/cases/
> And tell me why Ørjan's name displays correctly in CFJ 3565,
> but not in CFJ 3470?
The first case involves a quoted message including 天火狐's CJK nick, and
so probably got sent as UTF-8.
The second case has no special characters other than my own name, and for
such messages my mailer (terminal Alpine) seems to use ISO-8859-15 (a
western European charset, the revision of ISO-8859-1 with the euro sign,
iirc) for sending.
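
A minimal Python sketch of the effect, assuming only the two charsets named above (the variable names are made up for illustration). The same name becomes different bytes in each charset, and reading either byte string with the wrong charset garbles the Ø:

```python
# "\u00d8" is the letter Ø, written as an escape so this file stays pure ASCII.
name = "\u00d8rjan"

utf8_bytes = name.encode("utf-8")         # Ø becomes two bytes: C3 98
latin9_bytes = name.encode("iso8859-15")  # Ø becomes one byte:  D8

# UTF-8 bytes read by something that assumes Windows-1252:
utf8_read_as_cp1252 = utf8_bytes.decode("cp1252")

# ISO-8859-15 bytes read by something that assumes UTF-8. The lone D8 byte
# is not valid UTF-8, so it is replaced with U+FFFD here.
latin9_read_as_utf8 = latin9_bytes.decode("utf-8", errors="replace")

print(utf8_bytes.hex(), latin9_bytes.hex())
print(ascii(utf8_read_as_cp1252), ascii(latin9_read_as_utf8))
```

The first misreading shows up as an A with tilde followed by a small tilde, the second as a replacement character, which are the two kinds of breakage one typically sees for this name.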
> Further, if you click through to the pure text version, the
> situation is reversed.
ISO-8859-15 is mostly compatible with the Windows Codepage 1252 and
ISO-8859-1 encodings, so if your pure text is served as either of those or
something similar, the ISO-8859-15 message is likely to be shown correctly
in a browser.
For the other one, I can see it correctly as pure text by forcing UTF-8.
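
To make "mostly compatible" concrete, here is a small sketch that lists the byte values where two 8-bit charsets disagree. It relies on Python's built-in codecs; `differing_bytes` is a made-up helper name, and bytes a codec leaves undefined are counted as differing:

```python
def differing_bytes(enc_a, enc_b):
    """Return the byte values (0-255) that decode differently in the two charsets."""
    diffs = []
    for value in range(0x100):
        raw = bytes([value])
        # errors="replace" turns undefined bytes into U+FFFD instead of raising.
        if raw.decode(enc_a, errors="replace") != raw.decode(enc_b, errors="replace"):
            diffs.append(value)
    return diffs

latin1_vs_latin9 = differing_bytes("iso8859-1", "iso8859-15")
cp1252_vs_latin9 = differing_bytes("cp1252", "iso8859-15")

print([hex(v) for v in latin1_vs_latin9])
print([hex(v) for v in cp1252_vs_latin9])

# The byte used for Ø in ISO-8859-15 is 0xD8.
print(0xD8 in latin1_vs_latin9, 0xD8 in cp1252_vs_latin9)
```

The byte for Ø (0xD8) is in neither list, which is consistent with the name surviving when the text is served under any of the three charsets.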
> I'm guessing one uses a UTF-8 Ø, but the other uses some form
> of extended ASCII? So one displays in html not text, and the
> other is vice versa? And given that I get these from cutting
> and pasting from email, is there any convenient way to tell
> other than seeing the mistake or always having a hex editor
> open?
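
One possible way to tell without a hex editor, sketched in Python: try a strict UTF-8 decode of the raw bytes. This is a heuristic and not a full charset detector; `guess_text_encoding` and its three labels are made up for illustration, and it cannot say which 8-bit charset was used:

```python
def guess_text_encoding(data: bytes) -> str:
    """Classify raw bytes as 'ascii', 'utf-8', or 'legacy-8bit' (heuristic)."""
    if all(byte < 0x80 for byte in data):
        return "ascii"          # no special characters at all
    try:
        data.decode("utf-8")    # strict: raises on any invalid sequence
    except UnicodeDecodeError:
        return "legacy-8bit"    # e.g. ISO-8859-1, ISO-8859-15, Windows-1252
    return "utf-8"

print(guess_text_encoding(b"plain text"))
print(guess_text_encoding("\u00d8rjan".encode("utf-8")))
print(guess_text_encoding("\u00d8rjan".encode("iso8859-15")))
```

Text in a western European 8-bit charset rarely happens to form valid UTF-8, so a successful strict decode is a reasonably strong hint that the pasted text really is UTF-8.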
Finding out my sending charset was surprisingly awkward because it's only
given in the header of the MIME part (for some reason it uses multipart
format even though there is only one part), not in a header of the email
itself. Its format in the raw mailbox file looks like
Content-Type: TEXT/PLAIN; charset=ISO-8859-15; format=flowed
Content-Transfer-Encoding: QUOTED-PRINTABLE
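
A sketch of reading that per-part charset with Python's standard `email` package instead of digging through the raw mailbox. The sample message is invented for illustration (the address, the boundary string, and the MULTIPART/MIXED type are assumptions); only the two part headers are taken from the lines above:

```python
from email import message_from_bytes

# Hypothetical one-part multipart message, shaped like the case described above.
raw_message = (
    b"From: sender@example.invalid\r\n"
    b"MIME-Version: 1.0\r\n"
    b'Content-Type: MULTIPART/MIXED; boundary="sketch-boundary"\r\n'
    b"\r\n"
    b"--sketch-boundary\r\n"
    b"Content-Type: TEXT/PLAIN; charset=ISO-8859-15; format=flowed\r\n"
    b"Content-Transfer-Encoding: QUOTED-PRINTABLE\r\n"
    b"\r\n"
    b"Greetings,\r\n"
    b"=D8rjan.\r\n"
    b"--sketch-boundary--\r\n"
)

def text_parts(raw):
    """Return (charset, decoded_text) for every non-multipart part."""
    message = message_from_bytes(raw)
    results = []
    for part in message.walk():
        if part.is_multipart():
            continue  # the container itself carries no charset
        charset = part.get_content_charset() or "us-ascii"
        # decode=True undoes the quoted-printable transfer encoding.
        payload = part.get_payload(decode=True)
        results.append((charset, payload.decode(charset, errors="replace")))
    return results

for charset, text in text_parts(raw_message):
    print(charset, ascii(text))
```

Decoding each part with its own declared charset gives ordinary Unicode text, so what gets pasted no longer depends on what the sending mailer chose.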
*Sigh*, I think these days, cutting and pasting ought to convert via
Unicode to work properly.
Greetings,
Ørjan.