Re: [fpc-pascal] UnicodeString and surrogate pairs

2016-04-30 Thread Graeme Geldenhuys
On 2016-04-30 11:32, Martin Schreiber wrote:
> One could say that utf-8 has surrogate pairs, surrogate triplets and
> surrogate quads.

No, don't confuse the point. As per the Unicode Standard's definition of "surrogate pairs", UTF-8 and UTF-32 don't have surrogate pairs.

Regards,
Graeme
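To make that distinction concrete, here is a small Python sketch (not from the thread; the helper names `encode_all` and `has_surrogate_units` are made up for illustration). It encodes one code point outside the Basic Multilingual Plane, U+1F600, in all three encoding forms. Only the UTF-16 form contains code units in the reserved surrogate range D800..DFFF; UTF-8 uses a lead byte plus continuation bytes, and UTF-32 stores the code point directly.

```python
# Illustrative sketch only: compare how one supplementary-plane code point
# is stored in UTF-8, UTF-16 and UTF-32. Helper names are hypothetical.

def encode_all(ch):
    """Return the encoded bytes of `ch` as hex strings (UTF-16/32 big-endian)."""
    return {
        "utf-8": ch.encode("utf-8").hex(),
        "utf-16": ch.encode("utf-16-be").hex(),
        "utf-32": ch.encode("utf-32-be").hex(),
    }

def has_surrogate_units(ch):
    """True if the UTF-16 form of `ch` contains code units in D800..DFFF."""
    data = ch.encode("utf-16-be")
    units = [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]
    return any(0xD800 <= u <= 0xDFFF for u in units)

grinning = "\U0001F600"  # a code point above U+FFFF
forms = encode_all(grinning)
for name, hexbytes in forms.items():
    print(name, hexbytes)
print("surrogates in UTF-16 form:", has_surrogate_units(grinning))
```

The UTF-8 form is four bytes long, but none of those bytes is a surrogate; "surrogate" refers specifically to the 16-bit code units that UTF-16 pairs up.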

Re: [fpc-pascal] UnicodeString and surrogate pairs

2016-04-30 Thread Martin Schreiber
On Saturday 30 April 2016 12:12:35 Graeme Geldenhuys wrote:
> Anyway, I was referring to surrogate pairs (applies to UTF-16 only)

One could say that utf-8 has surrogate pairs, surrogate triplets and surrogate quads.

Martin

Re: [fpc-pascal] UnicodeString and surrogate pairs

2016-04-30 Thread Graeme Geldenhuys
Hello Michael,

On 2016-04-29 at 11:23 you wrote:
> > No, because UTF-8 doesn't use surrogate pairs.
> Really ?

Yes.

> those to be combined to a different printable thingy (/e.g. "A" plus
> "add two dots above" to create a "Ä").

No, that is something totally different and not what I was talkin
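As a rough illustration of why the two concepts differ, the following Python sketch (not from the thread; `utf16_units` and `is_surrogate` are hypothetical helper names) builds the "A" plus "two dots above" example as a combining sequence and inspects its UTF-16 code units. The combining sequence consists of two ordinary code points and involves no surrogates, whereas a single code point above U+FFFF is stored as a surrogate pair.

```python
import unicodedata

# Illustrative sketch only; helper names are hypothetical.

def utf16_units(text):
    """Return the UTF-16 code units of `text` as a list of integers."""
    data = text.encode("utf-16-le")
    return [int.from_bytes(data[i:i + 2], "little") for i in range(0, len(data), 2)]

def is_surrogate(unit):
    """True if a 16-bit code unit lies in the surrogate range D800..DFFF."""
    return 0xD800 <= unit <= 0xDFFF

# "A" followed by U+0308 COMBINING DIAERESIS: two code points, one visible glyph.
decomposed = "A\u0308"
# NFC normalization composes the sequence into the single code point U+00C4.
precomposed = unicodedata.normalize("NFC", decomposed)

combining_units = utf16_units(decomposed)   # two ordinary BMP code units
pair_units = utf16_units("\U0001F600")      # one code point, two surrogate units

print([hex(u) for u in combining_units], any(is_surrogate(u) for u in combining_units))
print([hex(u) for u in pair_units], all(is_surrogate(u) for u in pair_units))
```

Combining sequences are a property of the character repertoire and occur in every encoding form; surrogate pairs are purely a UTF-16 storage detail.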

Re: [fpc-pascal] UnicodeString and surrogate pairs

2016-04-29 Thread Sven Barth
On 30.04.2016 08:24, "Michael Schnell" wrote:
> On 04/29/2016 11:09 AM, Graeme Geldenhuys wrote:
>> No, because UTF-8 doesn't use surrogate pairs.
>
> Really ?
>
> I understand that "surrogate pairs" is combining a printable character (i.e. one of the nearly 2^32 UTF thingies) with another

Re: [fpc-pascal] UnicodeString and surrogate pairs

2016-04-29 Thread Michael Schnell
On 04/29/2016 11:09 AM, Graeme Geldenhuys wrote:
> No, because UTF-8 doesn't use surrogate pairs.

Really ?

I understand that "surrogate pairs" is combining a printable character (i.e. one of the nearly 2^32 UTF thingies) with another of those to be combined to a different printable thingy (/e.g.

Re: [fpc-pascal] UnicodeString and surrogate pairs

2016-04-29 Thread Graeme Geldenhuys
On 2016-04-28 09:05, Michael Schnell wrote:
> Would that necessarily be an UTF-8 issue ?

No, because UTF-8 doesn't use surrogate pairs. In this instance the string is of type UnicodeString, thus UTF-16 encoded. Now I could internally assign that to a UTF8String type, but in this case I wanted to

Re: [fpc-pascal] UnicodeString and surrogate pairs

2016-04-28 Thread Michael Schnell
On 04/27/2016 04:36 PM, Graeme Geldenhuys wrote:
> Does FPC's RTL (or FCL) include a function to check for UTF-16 surrogate pairs?

Would that necessarily be an UTF-8 issue ?

-Michael

Re: [fpc-pascal] UnicodeString and surrogate pairs

2016-04-27 Thread Graeme Geldenhuys
On 2016-04-27 16:24, Marco van de Voort wrote:
> Same as Delphi, character.tcharacter.issurrogate() or
> character.issurrogate()

Ah, thank you very much.

Regards,
Graeme

Re: [fpc-pascal] UnicodeString and surrogate pairs

2016-04-27 Thread Marco van de Voort
In our previous episode, Graeme Geldenhuys said:
> Does FPC's RTL (or FCL) include a function to check for UTF-16 surrogate
> pairs? I'd be very surprised if there isn't, but I have yet to find it
> in the documentation or source code I searched.

Same as Delphi, character.tcharacter.issurrogate()

[fpc-pascal] UnicodeString and surrogate pairs

2016-04-27 Thread Graeme Geldenhuys
Hi,

Does FPC's RTL (or FCL) include a function to check for UTF-16 surrogate pairs? I'd be very surprised if there isn't, but I have yet to find it in the documentation or source code I searched.

I need to process one "character" (loosely based on what you see on the screen) at a time while calcu
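For readers who want to see the underlying check, here is a language-neutral sketch in Python (not FPC code and not from the thread; all function names are hypothetical). It walks a sequence of UTF-16 code units and merges each high/low surrogate pair into one code point, which is the kind of test a surrogate-checking routine performs. Note that this yields code points only: a "character" as seen on screen may still span several code points, for example a base letter plus combining marks, so a real implementation would likely need additional grouping.

```python
# Illustrative sketch only; all names are hypothetical.

def is_high_surrogate(unit):
    """True for the first unit of a pair (D800..DBFF)."""
    return 0xD800 <= unit <= 0xDBFF

def is_low_surrogate(unit):
    """True for the second unit of a pair (DC00..DFFF)."""
    return 0xDC00 <= unit <= 0xDFFF

def utf16_units(text):
    """Return the UTF-16 code units of `text` as a list of integers."""
    data = text.encode("utf-16-le")
    return [int.from_bytes(data[i:i + 2], "little") for i in range(0, len(data), 2)]

def code_points(units):
    """Walk UTF-16 code units, merging each valid surrogate pair into one code point."""
    result = []
    i = 0
    while i < len(units):
        unit = units[i]
        if (is_high_surrogate(unit) and i + 1 < len(units)
                and is_low_surrogate(units[i + 1])):
            # Combine the two 10-bit payloads and add the 0x10000 offset.
            result.append(0x10000 + ((unit - 0xD800) << 10) + (units[i + 1] - 0xDC00))
            i += 2
        else:
            # BMP code unit (or an unpaired surrogate, passed through unchanged).
            result.append(unit)
            i += 1
    return result

units = utf16_units("a\U0001F600b")
points = code_points(units)
print([hex(u) for u in units])
print([hex(p) for p in points])
```

The sketch passes unpaired surrogates through unchanged; whether to reject or replace them instead is a design choice that depends on how strict the caller needs to be.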