On 05.05.2017 12:01, Michael Van Canneyt via Lazarus wrote:
> On Fri, 5 May 2017, Ondrej Pokorny via Lazarus wrote:
>> Believe me, I use it in production without any problems: I have Unicode-aware TStrings, I can read files with Unicode names, I can do everything with plain FPC trunk.
>
> I am aware of this, I do it myself. But I work on Linux, where UTF8 is the norm.
>
> So I cannot vouch for other platforms...

For now I am on Windows only, and I have to say loudly: IT WORKS GREAT :)

>> I don't need a 100% UTF-16 Delphi-Compatible RTL for that at all.
>
> This is the crux of the problem. Is this wanted/needed or do we stick to UTF8 ?
>
> We claim Delphi compatibility. So IMHO we must provide a UTF-16 Delphi compatible RTL.

I write code that is compatible with FPC and Delphi 5 - 10.2 and it works fine. So you already have a Delphi-compatible RTL. The only (well documented) difference is that the default String in FPC uses 1-byte characters, while in Unicode Delphi it uses 2-byte characters.

The only place where you need to handle the difference is where you need the size of a char (when you access a string as a buffer), which is typically low-level code:

MyStream.WriteBuffer(MyString[1], Length(MyString) * SizeOf(Char));

-> you need the extra SizeOf(Char) rather than a constant (1 for FPC, 2 for Unicode Delphi).
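
To make the point concrete, here is a minimal sketch of a complete program built around that line. WriteStringRaw is a hypothetical helper name, not something from the RTL, and the `uses Classes` clause assumes FPC or an older Delphi (newer Delphi versions may need System.Classes or a unit scope setting). The same source should write 1 byte per character where Char is 1 byte and 2 bytes per character where Char is 2 bytes.

```pascal
program CharSizeDemo;

{$IFDEF FPC}{$mode objfpc}{$H+}{$ENDIF}
{$IFNDEF FPC}{$APPTYPE CONSOLE}{$ENDIF}

uses
  Classes;

// Hypothetical helper (not part of the RTL): writes the raw character
// data of S to AStream, whatever the size of Char is for this compiler.
procedure WriteStringRaw(AStream: TStream; const S: string);
begin
  // Guard against the empty string: S[1] is not valid when Length(S) = 0.
  if S <> '' then
    AStream.WriteBuffer(S[1], Length(S) * SizeOf(Char));
end;

var
  MS: TMemoryStream;
  MyString: string;
begin
  MyString := 'abc';
  MS := TMemoryStream.Create;
  try
    WriteStringRaw(MS, MyString);
    WriteLn('SizeOf(Char)  = ', SizeOf(Char));
    // 3 when Char is 1 byte, 6 when Char is 2 bytes
    WriteLn('Bytes written = ', MS.Size);
  finally
    MS.Free;
  end;
end.
```

With a hard-coded factor of 1 or 2 instead of SizeOf(Char), one of the two compilers would write either half of the data or read past the end of the string.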

That's all. All high-level code is compatible already. Good job. I really do think it's not worth polluting the FPC RTL with UnicodeString overloads of every function, class, etc.

Better to keep one clean approach (UTF-8 RTL) and not confuse people with two approaches (UTF-8 vs. UTF-16). E.g. what do you want to call the new UnicodeString-based TStrings class? You have 2 options:
1.) Break compatibility with legacy FPC. (The new TStrings will use UnicodeString.)
2.) Break compatibility with Delphi. (TStrings will stay with the 1-byte string.)

There is no obvious solution for the problem :/

And then, if you introduce a compiler switch to change String from 1-byte to 2-byte characters... Oh no, so much mess and so many variants to take care of. Really, sometimes it's better to give people no options :) (Or have you already introduced the switch?)

Just stick with the current UTF-8 approach, which has proven itself :)

Ondrej
--
_______________________________________________
Lazarus mailing list
Lazarus@lists.lazarus-ide.org
http://lists.lazarus-ide.org/listinfo/lazarus
