Paul Ishenin wrote:

Why not apply the same to AnsiString and change all to String since
Lazarus does not work with Ansi code pages anyway?

Lazarus works with strings which have 1 byte per element. If FPC later switches the default string type to UnicodeString, Lazarus will suddenly get many problems.

The choice of UTF-8 (for Delphi Ansi) strings is the first incompatibility. Shouldn't we cure it, by following the new Delphi Unicode model? Otherwise another string-type incompatibility is added to the string-encoding incompatibility :-(

If FPC starts to dictate inappropriate rules[1], I see no way around a fork into an Ansi/UTF-8 branch and a UTF-16/Unicode branch[2], mirroring the break in Delphi. This would mean that the old branch has to stick with an older compiler (the current release), while the new branch requires the new compiler.

[1] The FPC developers are currently trying to find a model that fits both needs, compatibility with Delphi *and* Lazarus. We'll have to wait for the outcome.

[2] IMO it should be possible to separate user-land strings from platform/widgetset strings. Then all components can continue to use UTF-8 internally (in talking to the widgetsets), while the user-accessible strings can be of another type. With a proper choice of that internal boundary, the number of excess conversions can be kept to a minimum, as can the required changes to the LCL code.
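
A minimal sketch of the boundary idea in [2], only to illustrate where the conversions would sit. TSketchLabel and its members are hypothetical names and not part of the LCL; UTF8Encode and UTF8Decode are the existing RTL conversion routines. It assumes a compiler that provides UnicodeString.

```pascal
program BoundarySketch;
{$mode objfpc}{$H+}

type
  { Hypothetical stand-in for a component; NOT an actual LCL class. }
  TSketchLabel = class
  private
    FText: UTF8String;  { widgetset-facing storage, always UTF-8 }
    function GetCaption: UnicodeString;
    procedure SetCaption(const AValue: UnicodeString);
  public
    { User-facing view: UTF-16, converted only when crossing the boundary. }
    property Caption: UnicodeString read GetCaption write SetCaption;
    { Widgetset-facing view: the stored UTF-8 bytes, no conversion. }
    property WidgetText: UTF8String read FText;
  end;

function TSketchLabel.GetCaption: UnicodeString;
begin
  { One explicit conversion on the way out to user code. }
  Result := UTF8Decode(FText);
end;

procedure TSketchLabel.SetCaption(const AValue: UnicodeString);
begin
  { One explicit conversion on the way in from user code. }
  FText := UTF8Encode(AValue);
end;

var
  Lbl: TSketchLabel;
  S: UnicodeString;
begin
  { Build a non-ASCII test string from code points, so the result
    does not depend on the encoding of this source file. }
  S := 'Gr';
  S := S + WideChar($00FC);
  S := S + WideChar($00DF);
  S := S + 'e';

  Lbl := TSketchLabel.Create;
  try
    Lbl.Caption := S;
    WriteLn('UTF-16 code units seen by user code: ', Length(Lbl.Caption));
    WriteLn('UTF-8 bytes seen by the widgetset:   ', Length(Lbl.WidgetText));
    if Lbl.Caption = S then
      WriteLn('round trip preserved the text');
  finally
    Lbl.Free;
  end;
end.
```

In this sketch everything behind WidgetText stays UTF-8 and is never converted, so only the accessors of the user-visible property pay for a conversion. Where exactly that line should be drawn in the real LCL is the open question.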


For example, if UTF8ToUTF16 was left to accept UTF8String I would
think it would force the parameter to have UTF-8 code page, which
would be more correct. And this is what I don't understand, how will
it break when UTF8String is left.

The compiler adds implicit codepage conversion for string arguments. I had to avoid that. The better choice would be to use the RawByteString type, but it is not defined in FPC 2.4.4, which we need to support.

IMO the use of RawByteString will not help much, except for (possibly) simpler code and fewer overloaded procedures. Avoiding implicit conversions will instead require *fixed* string types and encodings, for different tasks with different needs. E.g. a TFileName string type would make it possible to eliminate all conversions when a string is known to hold file or path names (by design). Likewise an LCLString (widget, component) type could do the same for the LCL widgetset interface. The FPC decisions about string container classes (TStrings...) will tell where to draw the line between user and widget string types.

DoDi


--
_______________________________________________
Lazarus mailing list
http://lists.lazarus.freepascal.org/mailman/listinfo/lazarus