Re: utf8 vs. latin1

2005-01-02 Thread Werner LEMBERG
> I thought that you had once told me that Latin1 is a subset of
> UTF-8.
Latin1 is a subset of *Unicode*, not of UTF-8. Any character code >= 0x80 is represented with two or more bytes in UTF-8.
> Or should we be using a different unicode->bytes layout scheme?
Where exactly is the proble
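The point about character codes >= 0x80 can be checked with a few lines of Python. This is only a sketch: the helper name `encodings` and the sample characters are chosen here for illustration and are not from the thread.

```python
def encodings(ch: str) -> tuple[bytes, bytes]:
    """Return the Latin-1 and UTF-8 byte sequences for one character."""
    return ch.encode("latin-1"), ch.encode("utf-8")


# "A" is U+0041 (ASCII), "\u00e9" is e-acute, "\u00ff" is y-diaeresis.
for ch in ["A", "\u00e9", "\u00ff"]:
    latin1, utf8 = encodings(ch)
    print(f"U+{ord(ch):04X}  latin1={latin1.hex()}  utf8={utf8.hex()}")
```

For code points below 0x80 the two byte sequences are identical; from 0x80 to 0xFF, Latin-1 still uses one byte while UTF-8 uses two.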

Re: utf8 vs. latin1

2005-01-01 Thread Chris Sawer
On 1 Jan 2005, at 15:22, Han-Wen Nienhuys wrote:
> I thought that you had once told me that Latin1 is a subset of UTF-8.
This is not correct. One has to be careful to distinguish between a character set (ASCII / Latin1 / Unicode) and a mapping (encoding) used to represent text written using a
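The distinction between a character set and an encoding can be illustrated with a small Python sketch. It assumes Python's `latin-1` codec (ISO-8859-1), and the helper `is_valid_utf8` is a hypothetical name introduced here. The first check shows that every Latin-1 byte value equals the Unicode code point of its character (the character-set relationship). The second shows that a Latin-1 encoded file with non-ASCII characters is generally not valid UTF-8 (the encoding relationship).

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if the byte string decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# Character set: each Latin-1 byte maps to the Unicode code point of the same value.
all_bytes = bytes(range(256))
text = all_bytes.decode("latin-1")
same_code_points = all(ord(c) == b for c, b in zip(text, all_bytes))
print(same_code_points)

# Encoding: the Latin-1 byte for e-acute (0xE9) is not a complete UTF-8 sequence.
print(is_valid_utf8("\u00e9".encode("latin-1")))
print(is_valid_utf8("\u00e9".encode("utf-8")))
```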

utf8 vs. latin1

2005-01-01 Thread Han-Wen Nienhuys
Hi Werner, I thought that you had once told me that Latin1 is a subset of UTF-8. However, when I save a file as Latin1 and as UTF-8 under Emacs, the results differ, and the Latin1 characters are saved as two bytes each. Am I missing something? Did you mean that Latin1 is a subset of Unicode? Or