> I thought that you had once told me that Latin1 is a subset of
> UTF-8.
Latin1 is a subset of *Unicode*, not of UTF-8. Any character code of
0x80 or above is represented with two or more bytes in UTF-8.
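The difference can be checked with a minimal Python sketch. This is only an illustration: the variable names are made up here, and it relies on nothing beyond Python's built-in "latin-1" and "utf-8" codecs. It encodes one Latin1 character both ways and then confirms that the Latin1 byte values coincide with the Unicode code points of the same number.

```python
# A character in the Latin1 range above ASCII: U+00E9 (e with acute accent).
ch = "\u00e9"

latin1_bytes = ch.encode("latin-1")  # one byte, numerically equal to the code point
utf8_bytes = ch.encode("utf-8")      # two bytes in UTF-8

print(latin1_bytes.hex())  # e9
print(utf8_bytes.hex())    # c3a9

# Pure ASCII (code points below 0x80) is byte-identical in both encodings.
ascii_same = "abc".encode("latin-1") == "abc".encode("utf-8")

# Every Latin1 byte value decodes to the Unicode code point with the same
# number, which is the sense in which Latin1 is a subset of Unicode.
all_latin1 = bytes(range(256)).decode("latin-1")
same_code_points = [ord(c) for c in all_latin1] == list(range(256))

print(ascii_same, same_code_points)  # True True
```

So a file containing such characters will differ byte for byte depending on whether it is saved as Latin1 or as UTF-8, even though the characters themselves are the same.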
> Or should we be using a different unicode->bytes layout scheme?
Where exactly is the proble
On 1 Jan 2005, at 15:22, Han-Wen Nienhuys wrote:
> I thought that you had once told me that Latin1 is a subset of
> UTF-8.
This is not correct. One has to be careful to distinguish between a
character set (ASCII / Latin1 / Unicode) and a mapping (encoding) used
to represent text written using a
Hi Werner,
I thought that you had once told me that Latin1 is a subset of
UTF-8. However, when I save a file as Latin1 and as UTF-8 under Emacs,
the results differ, and the Latin1 characters are saved as double
bytes. Am I missing something? Did you mean that Latin1 is a subset
of Unicode? Or