On Sun, 23 Nov 2008, listmember wrote:
On 2008-11-23 14:10, Daniël Mantione wrote:
Therefore, any other encoding is a waste of memory and does not gain you
any speed. For that reason, I don't see the compiler switch from 8-bit
processing either.
I nearly fully agree with you.
Except when a string constant needs to contain non-ASCII chars. What do
we do in these cases?
The common approach is to do nothing; no processing needs to be done. I.e.
the compiler just passes on the bytes one by one from the source file to
the object file.
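
To make the pass-through idea concrete, here is a minimal sketch in Python. It is not how FPC is implemented; extract_literal_bytes is a hypothetical helper that only mimics "copy the bytes between the quotes unchanged". The point it shows is that the literal keeps whatever encoding the source file had.

```python
# Hypothetical stand-in for a compiler that does no character processing:
# it copies the bytes of a string literal verbatim.
def extract_literal_bytes(source: bytes) -> bytes:
    """Return the raw bytes between the first pair of single quotes."""
    start = source.index(b"'") + 1
    end = source.index(b"'", start)
    return source[start:end]

# The same source line saved in two different encodings.
utf8_source = "s := 'garçon';".encode("utf-8")
latin1_source = "s := 'garçon';".encode("latin-1")

# No decoding happens; the object file simply inherits the source bytes.
literal_utf8 = extract_literal_bytes(utf8_source)
literal_latin1 = extract_literal_bytes(latin1_source)
```

Scanning for the quote byte is safe here because 0x27 never occurs inside a multi-byte UTF-8 sequence, which is one reason byte-wise processing works for a compiler.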
For an IDE, this is a little bit more complicated. E.g. searching for a ç
in a source file needs to find both the composed and the decomposed
variant, and in UTF-8 a character can take 1, 2, 3 or 4 bytes, all of
which the search has to handle. This is where UTF-16 and UTF-32
start to make sense.
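
The composed/decomposed problem can be sketched in Python as follows. This is only an illustration under the assumption that the search normalizes both sides to NFC first; contains_normalized is a made-up name, not an existing IDE function. For ç specifically, the composed form takes 2 bytes in UTF-8 and the decomposed form takes 3.

```python
import unicodedata

composed = "\u00e7"      # ç as a single code point
decomposed = "c\u0327"   # c followed by a combining cedilla

def contains_normalized(haystack: str, needle: str) -> bool:
    """Substring search that treats composed and decomposed forms alike.

    Assumption: both strings are brought to NFC before comparing.
    """
    normalized_haystack = unicodedata.normalize("NFC", haystack)
    normalized_needle = unicodedata.normalize("NFC", needle)
    return normalized_needle in normalized_haystack

# A source line that happens to use the decomposed variant.
line = "s := 'gar" + decomposed + "on';"

naive_hit = composed in line                       # plain search misses it
normalized_hit = contains_normalized(line, composed)
```

A plain substring search misses the decomposed variant, while the normalized search finds it regardless of which form the file or the user typed.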
Only when you need to process characters (rather than pass them on)
is UTF-32 a lot faster and simpler.
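
A small Python sketch of why per-character processing is simpler in UTF-32: the n-th code point sits at a fixed byte offset, whereas in UTF-8 you have to walk the string from the start. The helper names are invented for this example, and "character" here means a Unicode code point, not a user-perceived character.

```python
def utf8_len(lead: int) -> int:
    """Length in bytes of a UTF-8 sequence, judged from its lead byte."""
    if lead < 0x80:
        return 1
    if lead >> 5 == 0b110:
        return 2
    if lead >> 4 == 0b1110:
        return 3
    return 4

def char_at_utf32(buf: bytes, index: int) -> str:
    """Constant-time lookup: every code point occupies exactly 4 bytes."""
    offset = 4 * index
    return buf[offset:offset + 4].decode("utf-32-le")

def char_at_utf8(buf: bytes, index: int) -> str:
    """Linear-time lookup: skip over 'index' variable-length sequences."""
    pos = 0
    for _ in range(index):
        pos += utf8_len(buf[pos])
    return buf[pos:pos + utf8_len(buf[pos])].decode("utf-8")

text = "aç€😀"                        # 1-, 2-, 3- and 4-byte characters in UTF-8
as_utf8 = text.encode("utf-8")
as_utf32 = text.encode("utf-32-le")   # little-endian, no byte order mark
```

The trade-off mentioned earlier in the thread shows up in the sizes: the UTF-32 buffer is larger than the UTF-8 one for the same text, so the simpler indexing is paid for in memory.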
Yes. If I knew how to write this patch, I'd be working on it right now.
Unfortunately a UTF-32 string type is not on our roadmap either, so it
would have to be a user contribution.
Daniël
_______________________________________________
fpc-devel maillist - [email protected]
http://lists.freepascal.org/mailman/listinfo/fpc-devel