On Sun, Apr 17, 2005 at 10:27:33PM -0400, J. Bruce Fields wrote:
> On Mon, Apr 18, 2005 at 12:39:57AM +0100, Stewart Jeacocke wrote:
> > On Sun, 2005-04-17 at 19:26 -0400, J. Bruce Fields wrote:
> > > > The problem is that you are not using a UTF-8 (Unicode) system locale.
> > > > Run
> > > >
> > > > # dpkg-reconfigure locales
> > > >
> > > > and select a Unicode locale (eg en_GB.UTF-8) as the default system
> > > > locale. Log out of GNOME and log back in.
> > >
> > > Yeah, OK, thanks, that seems to explain the symptoms. As a practical
> > > problem, it seems that most of the email and newsgroups I see are using
> > > iso-8859-1, so that's the only thing that seems to work as a default
> > > encoding for my terminal.
At least e-mail (and probably newsgroups too) indicates which encoding
it's using in the headers. So a mail reader should convert from the
mail's encoding to the terminal's encoding if possible (which mutt does
fine, for example with this mail in a UTF-8 terminal). That's not a
problem of the terminal.

> > I'm pretty sure that iso-8859-1 encoding is a subset of the Unicode
> > encoding. So even when the locale is set to a Unicode encoding
> > iso-8859-1 (extended ASCII) documents should still work fine (they seem
> > to here).
> >
> > If they really don't then would you attach an example file that contains
> > characters that fail to render with a Unicode locale?
>
> None of these:
>
> e with an acute accent: "é"
> e with a grave accent: "è"
> c with a cedilla: "ç"
>
> show up if I choose UTF-8 in gnome-terminal.
>
> iso-8859-1 may be a subset of unicode in the sense that all the
> characters it encodes are also in unicode, but I don't believe that the
> iso-8859-1 encoding is a subset of UTF-8. I'm far from an expert on
> this, though....

ASCII is a subset of UTF-8; iso-8859-1 is not. Characters like e with an
acute accent have the 8th bit set in iso-8859-1, which in UTF-8 means
the byte is one of several bytes that together encode a single character.

  Sjoerd
-- 
"Protozoa are small, and bacteria are small, but viruses are smaller
than the both put together."
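
To illustrate the point about the 8th bit, here is a minimal Python
sketch (not part of the original thread; the helper names describe and
is_valid_utf8 are made up for illustration). It encodes the three
characters from the example in both iso-8859-1 and UTF-8, and checks
whether the single iso-8859-1 byte is acceptable as UTF-8 on its own.

```python
# Illustrative sketch only: compare iso-8859-1 and UTF-8 encodings.
# describe() and is_valid_utf8() are hypothetical helper names.

def describe(ch):
    """Return the (iso-8859-1, UTF-8) byte encodings of one character."""
    return ch.encode("iso-8859-1"), ch.encode("utf-8")

def is_valid_utf8(data):
    """True if the byte string decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False

samples = {
    "plain e": "e",
    "e acute": "\u00e9",
    "e grave": "\u00e8",
    "c cedilla": "\u00e7",
}

for name, ch in samples.items():
    latin1, utf8 = describe(ch)
    # Bytes >= 0x80 have the 8th bit set; on their own they are not
    # a complete UTF-8 sequence.
    print(name, "latin1:", latin1.hex(), "utf8:", utf8.hex(),
          "latin1 bytes valid as UTF-8:", is_valid_utf8(latin1))
```

Only the plain ASCII "e" has the same single byte in both encodings. The
accented characters are one byte in iso-8859-1 but two bytes in UTF-8,
which would be consistent with them not showing up when iso-8859-1 text
is sent unconverted to a terminal set to UTF-8.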