On 02/28/2014 01:04 PM, Marco van de Voort wrote:
> Moreover, will operations that use character access make sense at all if you don't know what the actual encoding is?

The administrative record of each "New Delphi string" contains the encoding type and the byte count per code unit. So "you" (the compiler and the RTL) do know it.
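
As a minimal sketch of what that record exposes, the following program asks the RTL for the dynamic code page and the element size of a string. It assumes Free Pascal 3.x (or a Delphi version with code-page-aware strings) and uses the RTL functions StringCodePage and StringElementSize; the program and variable names are made up for illustration.

```pascal
program ShowStringRecord;
{$mode objfpc}{$H+}

var
  u8: UTF8String;
  raw: RawByteString;
begin
  u8 := 'abc';   // ASCII only, so the source file encoding does not matter here
  raw := u8;     // RawByteString takes the string over as-is, no conversion on this assignment

  // Both values are read from the header stored in front of the string data.
  WriteLn('UTF8String:    code page ', StringCodePage(u8),
          ', bytes per code unit ', StringElementSize(u8));
  WriteLn('RawByteString: code page ', StringCodePage(raw),
          ', bytes per code unit ', StringElementSize(raw));
end.
```

The point is only that the information is available at run time; what the compiler does with it is a separate question.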

The "only" shortcoming in Delphi is that the handling is completely "static":

- If the encoding definition of the type the string is created with is not "RAW", the encoding needs to be known at compile time (i.e. the encoding type cannot be modified at run time).

- If the encoding definition of the type the string is created with is "RAW", auto-conversion from this string to a non-RAW string is not done.

Hence (including, but not only, for decent use on multiple OSes) an additional "fully dynamically encoded" type (I suggest calling this string type "Generic") is necessary.

> (not only s[] but also
> pos,delete,insert etc). The same code can seem to behave differently
> because different code-paths make the same parameter have different
> encodings.

I suppose that you are right. But it is not only the "funny" position numbers that pos(), delete(), insert() and friends use that create a problem; the String type they are defined to use does as well:

- If any statically encoded type is used for these functions, it is close to impossible to write decently fast string manipulation code (unless the strings happen to use the matching encoding type), as auto-conversion back and forth is invisibly introduced.

- If using the suggested dynamically encoded type, we will have problems when combining strings of different types in a code snippet that calls these functions.
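
To illustrate the "funny" position numbers, here is a small sketch that searches for the same text in a UTF8String and in a UnicodeString. It assumes Free Pascal 3.x and a source file saved as UTF-8; the sample text and the variable names are only for illustration. UTF8Decode is used for the conversion so that the example does not depend on an installed widestring manager.

```pascal
program PosDependsOnEncoding;
{$mode objfpc}{$H+}
{$codepage utf8}  // assumption: this source file is saved as UTF-8

var
  u8, needle8: UTF8String;
  u16, needle16: UnicodeString;
begin
  u8 := 'Grüße an alle';
  needle8 := 'an';

  // Same text, converted explicitly to UTF-16.
  u16 := UTF8Decode(u8);
  needle16 := UTF8Decode(needle8);

  // Pos counts code units: bytes for UTF-8, 16-bit words for UTF-16.
  // 'ü' and 'ß' take two bytes each in UTF-8 but one code unit in UTF-16,
  // so 'an' starts at byte 9 in u8 and at code unit 7 in u16.
  WriteLn('UTF8String:    Pos = ', Pos(needle8, u8));
  WriteLn('UnicodeString: Pos = ', Pos(needle16, u16));
end.
```

An index obtained from one of these strings is therefore meaningless for Delete() or Insert() on the other, which is exactly what makes code paths with differing encodings behave differently.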

I don't know if / how / to_what_extent compiler magic can help here (doing auto-conversion "when necessary" similar to when simply assigning strings of different encoding types).

In the end, I feel it would be very undesirable, but it might be the only "easy" solution, to go for full Delphi compatibility and handle all string encodings except UTF16 in a rather poor way. This would force Lazarus to provide a (Delphi compatible) LCL API done completely in UTF16, which contradicts everything they did in the last few years :-) .

-Michael
_______________________________________________
fpc-pascal maillist  -  fpc-pascal@lists.freepascal.org
http://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-pascal
