In our previous episode, Juha Manninen said:
> > Widestring (refcounted 2-byte type) , it is the ansistring type (1-byte
> > type) that gets codepage support.
> 
> UTF-16 needs codepages, too.
> I think only the 4-byte char type (is it UTF-32) would solve all encoding 
> problems. 

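codepage <> encoding. UTF-16 is a single fixed encoding of Unicode, so
there is no codepage to select for a 2-byte string type.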

> All characters of all languages fit into 2^32 space.

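character <> codepoint. What a user sees as one character can consist of
several codepoints (e.g. a base letter plus combining marks).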
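A small illustration, as a sketch in Python rather than Pascal (only because
it is easy to run anywhere; the variable names are mine): an "e" followed by
a combining acute accent displays as one character but is two codepoints, and
a 4-byte encoding does not change that.

```python
import unicodedata

# "e" followed by U+0301 COMBINING ACUTE ACCENT: displayed as one character.
decomposed = "e\u0301"

# NFC normalization happens to have a precomposed form for this pair (U+00E9),
# but not every base+mark combination has one.
composed = unicodedata.normalize("NFC", decomposed)

# Python's len() counts codepoints, not user-visible characters.
codepoints_decomposed = len(decomposed)
codepoints_composed = len(composed)

# UTF-32 stores one 4-byte unit per codepoint, so the decomposed form
# still takes two units for what the reader sees as one character.
utf32_units = len(decomposed.encode("utf-32-le")) // 4

print(codepoints_decomposed, codepoints_composed, utf32_units)
```

So indexing a UTF-32 string gives you codepoints, which is still not the
same as characters.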

Anyway, surrogates etc. are a separate problem within processing the full
Unicode spec, and the parts that UTF-32 solves are the lesser ones (and it
solves them at the expense of speed and memory).
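To make the surrogate and memory points concrete, here is a rough sketch
(again in Python for convenience; the sample strings and names are just my
picks): a codepoint outside the Basic Multilingual Plane takes two 16-bit
units in UTF-16, while plain ASCII text grows fourfold in UTF-32.

```python
# U+1D11E MUSICAL SYMBOL G CLEF lies outside the Basic Multilingual Plane.
clef = "\U0001D11E"

utf16 = clef.encode("utf-16-le")
utf32 = clef.encode("utf-32-le")

# Split the UTF-16 bytes into 16-bit code units: a high and a low surrogate.
utf16_units = [
    int.from_bytes(utf16[i:i + 2], "little") for i in range(0, len(utf16), 2)
]
print([hex(u) for u in utf16_units])  # two units forming a surrogate pair
print(len(utf32))                     # one 4-byte unit

# Memory cost for ordinary ASCII text in each encoding.
sample = "Free Pascal"
sizes = {
    name: len(sample.encode(name))
    for name in ("utf-8", "utf-16-le", "utf-32-le")
}
print(sizes)
```

UTF-32 removes the surrogate pair, but every string pays 4 bytes per
codepoint for it, and combining sequences etc. still remain to be handled.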

The document is a bit messy (since it was conceived before Delphi 2009 came
out, and later updated for it), but 

http://www.stack.nl/~marcov/unicode.pdf

has some details.
_______________________________________________
fpc-pascal maillist  -  fpc-pascal@lists.freepascal.org
http://lists.freepascal.org/mailman/listinfo/fpc-pascal
