On 21 Jul 2010, at 20:47, Luis Fernando Del Aguila Mejía wrote:

> I wrote this program.
> 
> var c,d: char;
> Begin
> c:=#$00B1;
> Writeln(byte(c));
> Writeln(c);
> 
> d:=#$0080;
> Writeln(byte(d));
> Writeln(d);
> End.
> 
> When I compile on Linux, the variable "c" stores the value #$B1, and the 
> variable "d" stores the value #$80.
> But when I compile the program on Windows, "c" stores #$B1 and "d" stores 
> #$3F.
> On Windows, it puts a question mark for the range #$0080 to #$009F.
> 
> My question is:
> Why does Linux store the second byte while Windows puts a question mark?

In the above program you are assigning widechar constants to (ansi)chars (if 
you want to use (ansi)chars with those values, use #$B1 and #$80 instead). 
Therefore the compiler inserts a type conversion from widechar to char, which 
will try to convert those UTF-16 characters to the current charset. If a 
character cannot be represented in the current charset (or if it is plain 
invalid), then it will be replaced by a question mark.
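
As a minimal sketch of the suggestion above (the program name is made up for
illustration), here is the same program with two-digit constants. These are
plain (ansi)char constants, so no widechar-to-char conversion is involved:

```pascal
program CharConstants;

var
  c, d: char;
begin
  { Two hex digits: (ansi)char constants, stored byte for byte. }
  c := #$B1;
  d := #$80;
  Writeln(byte(c));  { prints 177 }
  Writeln(byte(d));  { prints 128 }
end.
```

The numeric values should be the same on every platform. What Writeln(c)
would show for such a byte still depends on the charset of your terminal.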

This conversion is performed by the widestring manager. On Windows, the native 
Windows widestring manager is installed by default. On Linux, a real 
widestring manager makes the program dependent on the C library, so it is not 
installed by default: you have to use the cwstring unit to get one. Otherwise 
the default one in the system unit is used, which only handles ASCII 
characters and passes the rest through untouched. That is why you get #$80 on 
Linux and #$3F (a question mark) on Windows.
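
A sketch of how the original program could pull in the cwstring unit on
Unix-like systems (the program name and the ifdef guard are my additions, not
something required by the compiler). No expected output is given, because the
values printed depend on the current charset/locale and possibly on the
compiler version:

```pascal
program WideToAnsi;

{$ifdef unix}
uses
  cwstring;  { installs the C-library-based widestring manager }
{$endif}

var
  c, d: char;
begin
  { Four hex digits: widechar constants, converted to char on assignment. }
  c := #$00B1;
  d := #$0080;
  Writeln(byte(c));
  Writeln(byte(d));
end.
```

With cwstring in place, the conversion on Linux follows the current locale's
charset rather than passing non-ASCII values through unchanged.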


Jonas

_______________________________________________
fpc-pascal maillist  -  fpc-pascal@lists.freepascal.org
http://lists.freepascal.org/mailman/listinfo/fpc-pascal
