[fpc-pascal] Re: Widestrings length and character iteration

Christos Chryssochoidis Wed, 09 May 2007 01:15:10 -0700


Daniël Mantione wrote:
>
> On Mon, 7 May 2007, Christos Chryssochoidis wrote:
>
>> Daniël Mantione wrote:
>>> Not possible, a widestring is UCS-2/UTF-16.

>> I defined a widestring with 7 characters (code points), and the length()
>> function returned the value 15. Of the 7 code points of that widestring only
>> one of them was greater than $07FF (the maximum code point which can be
>> encoded in 2 bytes under UTF-8). When I changed that character with another
>> one with code not greater than $07FF, length() returned value 14... I also
>> printed the byte values of one of the widestring's widechars, and the values
>> printed indicated UTF-8 encoding.
>
> Yes, the program output is utf-8 on OS-X, because this is the native
> encoding for OS-X. However, widestrings are not utf-8. Can you show your
> code?
>
> Daniël

OK, I figured out what happened. The source file was saved in UTF-8
encoding, but I hadn't put the compiler directive {$CODEPAGE UTF8} in it.
After including this directive, almost everything worked fine: length()
returned the right number of Unicode characters, and subscripting the
widestring returned the right character. But since the widechar and
widestring encoding is, as you said, UTF-16, while my Mac OS X console
uses UTF-8, I had to wrap the individual widechars or the whole
widestring in utf8encode() before passing them to write() for the output
to be displayed correctly.
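For the archive, here is a minimal sketch of the fix described above. It
is not my original program: the program name, the variable names and the
sample literal are made up for illustration, and it assumes the file is
saved as UTF-8 and compiled with an FPC release of the kind discussed in
this thread (newer releases handle string code pages differently, so the
console part may behave otherwise there).

```pascal
program widelen;

{ Tells the compiler that the string literals in this file are UTF-8.
  Without it, the bytes of the literal were not decoded as UTF-8 and
  length() counted far more than the number of characters. }
{$codepage utf8}

var
  ws: WideString;   { hypothetical sample variable }
  i: Integer;
begin
  { Four code points, all in the Basic Multilingual Plane, so each one
    occupies exactly one UTF-16 code unit (one widechar). }
  ws := 'αβγ€';

  { Expected to be 4 when the directive is honoured. }
  WriteLn('length = ', Length(ws));

  { Subscripting yields one widechar per index; convert each one to
    UTF-8 before writing it to a UTF-8 console. }
  for i := 1 to Length(ws) do
    WriteLn(i, ': code ', Ord(ws[i]), ' -> ', UTF8Encode(WideString(ws[i])));

  { The whole string can be converted in one call as well. }
  WriteLn(UTF8Encode(ws));
end.
```

Note that length() counts UTF-16 code units, so it only equals the number
of code points as long as every character lies in the Basic Multilingual
Plane; a character outside it would take two widechars.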


Thanks for your help,

Christos

_______________________________________________
fpc-pascal maillist  -  fpc-pascal@lists.freepascal.org
http://lists.freepascal.org/mailman/listinfo/fpc-pascal
