Re: [fpc-pascal] UnicodeString and surrogate pairs

Michael Schnell Fri, 29 Apr 2016 23:25:23 -0700

On 04/29/2016 11:09 AM, Graeme Geldenhuys wrote:


> No, because UTF-8 doesn't use surrogate pairs.

Really?

I understand that "surrogate pairs" means combining a printable character (i.e. one of the nearly 2^32 UTF thingies) with another of those, so that the two are combined into a different printable thingy (e.g. "A" plus "add two dots above" to create an "Ä").
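
The "A" plus "two dots above" case can be tried out directly. Below is a minimal sketch in Python (not FPC, chosen only because it is short to run); the variable names are made up for illustration. One caveat on terminology: Unicode itself calls this a combining character sequence, and uses "surrogate pair" for two 16-bit UTF-16 code units that together encode one code point above U+FFFF.

```python
import unicodedata

# "A" followed by U+0308 COMBINING DIAERESIS: two code points, one visible glyph.
base_plus_mark = "A\u0308"

# NFC normalization composes the sequence into the single code point U+00C4.
composed = unicodedata.normalize("NFC", base_plus_mark)

print(len(base_plus_mark))    # number of code points before composition
print(len(composed))          # number of code points after composition
print(composed == "\u00C4")   # compares against the precomposed "Ä"
```

Both forms display as "Ä", and the composition happens at the code point level, independent of whether the text is later stored as UTF-8 or UTF-16.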

Now a series of 32-bit UTF thingies can be compressed either to a series of UTF-8 encoded bytes or to a series of UTF-16 encoded words. Both of these are usually much shorter (measured in bytes) than the uncompressed UTF-32 information.
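
To make the size comparison concrete, here is a small sketch (again Python, only for illustration; `encoded_sizes` is a hypothetical helper, not anything from FPC or the RTL) that encodes the same text three ways and counts the bytes. The little-endian codec variants are used so that no byte order mark is added to the counts.

```python
def encoded_sizes(text):
    """Return the byte length of text in UTF-8, UTF-16 and UTF-32 (no BOM)."""
    return {
        "utf-8": len(text.encode("utf-8")),
        "utf-16": len(text.encode("utf-16-le")),
        "utf-32": len(text.encode("utf-32-le")),
    }

# Plain ASCII, a word with a precomposed "Ä" (U+00C4), and one code point
# above U+FFFF (U+1F600), which UTF-16 stores as two 16-bit units.
for sample in ["Hello", "\u00C4rger", "\U0001F600"]:
    print(repr(sample), encoded_sizes(sample))
```

For the first two samples UTF-8 and UTF-16 come out shorter than UTF-32; for the code point above U+FFFF all three need 4 bytes, and the two 16-bit units UTF-16 uses there are what Unicode calls a surrogate pair.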


So the UTF8 vs UTF16 issue is a lower layer of encoding.

-Michael
_______________________________________________
fpc-pascal maillist  -  fpc-pascal@lists.freepascal.org
http://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-pascal
