[fpc-pascal] Unicode chars losing information

2021-03-07 Thread Ryan Joseph via fpc-pascal
I came across a bug which was caused by a Unicode character losing information
and narrowed it down to this. Why doesn't chars[1] print the same character
as appears in the string?

var
  chars: UnicodeString;
begin
  chars := '⌘⌥⌫⇧^';
  writeln(chars);
  writeln(chars[1]);
end.

Prints:

⌘⌥⌫⇧^
?


Regards,
Ryan Joseph

___
fpc-pascal maillist  -  fpc-pascal@lists.freepascal.org
https://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-pascal


Re: [fpc-pascal] Unicode chars losing information

2021-03-07 Thread Marco van de Voort via fpc-pascal


On 2021-03-07 at 17:21, Ryan Joseph via fpc-pascal wrote:

> I came across a bug which was caused by a Unicode character losing information
> and narrowed it down to this. Why doesn't chars[1] print the same character
> as appears in the string?
>
> var
>    chars: UnicodeString;
> begin
>    chars := '⌘⌥⌫⇧^';
>    writeln(chars);
>    writeln(chars[1]);
> end.
>
> Prints:
>
> ⌘⌥⌫⇧^
> ?


Probably it is not in the BMP and thus needs more position than one.



Re: [fpc-pascal] Unicode chars losing information

2021-03-07 Thread Ryan Joseph via fpc-pascal


> On Mar 7, 2021, at 9:31 AM, Marco van de Voort via fpc-pascal wrote:
> 
> Probably it is not in the BMP and thus needs more position than one.

Isn't chars[1] a 2-byte wide char? I'm not sure I understand "more position
than one", though.

Regards,
Ryan Joseph



Re: [fpc-pascal] Unicode chars losing information

2021-03-07 Thread Marco van de Voort via fpc-pascal


On 2021-03-07 at 17:38, Ryan Joseph via fpc-pascal wrote:

> > On Mar 7, 2021, at 9:31 AM, Marco van de Voort via fpc-pascal wrote:
> >
> > Probably it is not in the BMP and thus needs more position than one.
>
> Isn't chars[1] a 2-byte wide char? I'm not sure I understand "more position
> than one", though.


Yes, it is. And there are about 1114000 Unicode code points (U+0000 to
U+10FFFF), or about 17 times what fits in a 2-byte wide char.
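
The arithmetic behind the "about 17 times" figure can be checked with a few lines. This is only a sketch; the constant names are made up for the illustration, and the only fact assumed is that the highest Unicode code point is U+10FFFF.

```pascal
program CodePointCount;
{$mode objfpc}

const
  MaxCodePoint   = $10FFFF;  // highest code point defined by Unicode
  WideCharValues = 65536;    // distinct values of a 16-bit WideChar

begin
  WriteLn(MaxCodePoint + 1);                       // 1114112
  WriteLn((MaxCodePoint + 1) div WideCharValues);  // 17
end.
```

So a single WideChar can address exactly one of the 17 planes; anything outside the first plane (the BMP) has to be spread over two WideChars.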


https://en.wikipedia.org/wiki/Code_point

https://en.wikipedia.org/wiki/UTF-16




Re: [fpc-pascal] Unicode chars losing information

2021-03-07 Thread Ryan Joseph via fpc-pascal


> On Mar 7, 2021, at 10:11 AM, Marco van de Voort via fpc-pascal wrote:
> 
> 
> Yes it is. And there are about 1114000 unicode codepoints, or about 17 times 
> what fits in a 2-byte wide char.
> 
> https://en.wikipedia.org/wiki/Code_point
> 
> https://en.wikipedia.org/wiki/UTF-16

I thought Unicode strings "just worked", but maybe that's only true for UTF-8
and the character I want is UTF-16. What are you supposed to do then?
UnicodeString knows how to print the full string, so all the data is there, but
I can't index to get characters unless I know their size.

Regards,
Ryan Joseph



Re: [fpc-pascal] Unicode chars losing information

2021-03-07 Thread Ryan Joseph via fpc-pascal


> On Mar 7, 2021, at 10:21 AM, Ryan Joseph wrote:
> 
> I thought unicode strings "just worked" but maybe that's UTF-8 and the 
> character I want is maybe UTF-16. What are you supposed to do then? 
> UnicodeString knows how to print the full string so all the data is there but 
> I can't index to get characters unless I know their size.

Since this looks like it could be complicated, here is what I was actually
trying to do with the FreeType library. This works for ASCII but broke down
with those Unicode chars. I'm confused now because you say the characters are
more than 2 bytes, so I don't know what the actual size of an element is.

  for glyph in '⌘⌥⌫⇧^' do
    FT_Load_Char(m_face, ord(glyph), FT_LOAD_RENDER);
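
For a loop like the one above, one option is to walk the UnicodeString by code point instead of by WideChar and pass each code point on. The following is a minimal sketch, not tested against FreeType: NextCodePoint is a hypothetical helper written for this illustration, the WriteLn stands in for the FT_Load_Char call, and the string is built from numeric code units so that the result does not depend on how the compiler interprets the source file encoding. U+1F600 is included only to show a character that needs two WideChars.

```pascal
program CodePoints;
{$mode objfpc}{$H+}

// Hypothetical helper: decode the UTF-16 code unit(s) starting at index i
// into one code point and advance i past them.
function NextCodePoint(const s: UnicodeString; var i: SizeInt): Cardinal;
var
  lead, trail: Cardinal;
begin
  lead := Ord(s[i]);
  Inc(i);
  // A high surrogate followed by a low surrogate encodes one code point
  // outside the BMP.
  if (lead >= $D800) and (lead <= $DBFF) and (i <= Length(s)) then
  begin
    trail := Ord(s[i]);
    if (trail >= $DC00) and (trail <= $DFFF) then
    begin
      Inc(i);
      Exit($10000 + ((lead - $D800) shl 10) + (trail - $DC00));
    end;
  end;
  // BMP character (or an unpaired surrogate, returned unchanged).
  Result := lead;
end;

var
  s: UnicodeString;
  i: SizeInt;
begin
  // U+2318, then U+1F600 as the surrogate pair D83D DE00, then '^'.
  SetLength(s, 4);
  s[1] := WideChar($2318);
  s[2] := WideChar($D83D);
  s[3] := WideChar($DE00);
  s[4] := WideChar($005E);

  i := 1;
  while i <= Length(s) do
    WriteLn(NextCodePoint(s, i));  // prints 8984, 128512, 94
end.
```

For characters inside the BMP the helper returns the same value as ord(glyph), so this only changes the result for surrogate pairs.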

Regards,
Ryan Joseph



Re: [fpc-pascal] Unicode chars losing information

2021-03-07 Thread Nikolay Nikolov via fpc-pascal


On 3/7/21 7:21 PM, Ryan Joseph via fpc-pascal wrote:

> > On Mar 7, 2021, at 10:11 AM, Marco van de Voort via fpc-pascal wrote:
> >
> > Yes it is. And there are about 1114000 unicode codepoints, or about 17 times
> > what fits in a 2-byte wide char.
> >
> > https://en.wikipedia.org/wiki/Code_point
> >
> > https://en.wikipedia.org/wiki/UTF-16
>
> I thought unicode strings "just worked" but maybe that's UTF-8 and the
> character I want is maybe UTF-16. What are you supposed to do then?
> UnicodeString knows how to print the full string so all the data is there but
> I can't index to get characters unless I know their size.


It depends on what you mean by "just working". UnicodeString is a
UTF-16 encoded string and a WideChar is just a UTF-16 code unit. Both
UTF-8 and UTF-16 are variable-length encodings. UTF-16 is just
simpler to decode. Note also that, even though a single Unicode code point
might need two UTF-16 code units (i.e. WideChars), that is still not
enough to represent what users perceive as a character. There are also
plenty of Unicode combining characters. What most users perceive as a
character is actually called an Extended Grapheme Cluster and is
actually a sequence of Unicode code points. There's an algorithm (an
enumerator) that splits a string into grapheme clusters, and that's
implemented in FPC trunk in the GraphemeBreakProperty unit. It
implements this algorithm:


http://www.unicode.org/reports/tr29/

This was done by me for the Unicode Free Vision port in the unicodekvm
SVN branch, but it has already been committed to trunk (the rest of the
Unicode Free Vision still hasn't), because it's a new unit that is
relatively self-contained and provides new functionality (so it won't
break existing code) that wasn't provided by the RTL before.


Note that normally, most programs wouldn't actually need to split a 
string into grapheme clusters, unless they implement something like a UI 
toolkit or a text editor or something of that sort. That's why it was 
needed for the Unicode Free Vision.
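
To make the difference between code units, code points and perceived characters concrete, here is a small sketch. It deliberately does not use the GraphemeBreakProperty unit, since its exact interface is not shown in this message; it only demonstrates that one perceived character can occupy more than one WideChar even when every code point is in the BMP.

```pascal
program PerceivedChar;
{$mode objfpc}{$H+}

var
  s: UnicodeString;
  i: Integer;
begin
  // 'e' (U+0065) followed by U+0301 COMBINING ACUTE ACCENT. Most renderers
  // draw this sequence as one perceived character, an e with an acute accent.
  SetLength(s, 2);
  s[1] := WideChar($0065);
  s[2] := WideChar($0301);

  WriteLn('Code units: ', Length(s));  // 2
  for i := 1 to Length(s) do
    WriteLn('  ', Ord(s[i]));          // 101, then 769
end.
```

Indexing with s[1] here yields a plain 'e' without its accent, which is the kind of split a grapheme cluster enumerator is meant to avoid.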


Nikolay



Re: [fpc-pascal] Unicode chars losing information

2021-03-07 Thread Bart via fpc-pascal
On Sun, Mar 7, 2021 at 5:31 PM Marco van de Voort via fpc-pascal wrote:

> Probably it is not in the BMP and thus needs more position than one.

Length(chars) is 5 according to fpc, and I see 5 "graphemes", which suggests
that all of them fit into 1 WideChar?

-- 
Bart


Re: [fpc-pascal] Unicode chars losing information

2021-03-07 Thread Marco van de Voort via fpc-pascal


On 2021-03-07 at 22:26, Bart via fpc-pascal wrote:

> On Sun, Mar 7, 2021 at 5:31 PM Marco van de Voort via fpc-pascal wrote:
>
> > Probably it is not in the BMP and thus needs more position than one.
>
> Length(chars) is 5 according to fpc, I see 5 "graphemes"


Indeed:

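.Ld1$strlab:
    .short    1200,2    # code page 1200 (UTF-16), element size 2 bytes
    .long    -1,5       # reference count -1 (constant), length 5
.Ld1:
    .short    8984,8997,9003,8679,94,0    # U+2318 U+2325 U+232B U+21E7 U+005E, terminator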

On win32 a quick test is difficult, since displaying Unicode in the terminal
is hard.



But a write for "widechar" is called:

    movl    U_$P$PROGRAM_$$_CHARS,%eax
    movw    (%eax),%cx    # load chars[1], a single 16-bit code unit
    movl    %ebx,%edx
    movl    $0,%eax
    call    fpc_write_text_widechar

so it should be ok then.
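
The same check can be made at run time without reading the assembler output. The sketch below rebuilds the string from the five code unit values in the listing (so it does not depend on the source file encoding) and reports whether any of them is a surrogate; the program and constant names are made up for the illustration.

```pascal
program BmpCheck;
{$mode objfpc}{$H+}
uses
  SysUtils;

const
  // Code unit values copied from the .short line of the listing.
  CodeUnits: array[1..5] of Word = (8984, 8997, 9003, 8679, 94);

var
  s: UnicodeString;
  i: Integer;
  c: LongInt;
begin
  SetLength(s, Length(CodeUnits));
  for i := 1 to Length(CodeUnits) do
    s[i] := WideChar(CodeUnits[i]);

  WriteLn('Length = ', Length(s));  // 5
  for i := 1 to Length(s) do
  begin
    c := Ord(s[i]);
    Write('U+', IntToHex(c, 4));
    // Values in D800..DFFF are halves of a surrogate pair, not characters.
    if (c >= $D800) and (c <= $DFFF) then
      WriteLn('  surrogate (half of a pair)')
    else
      WriteLn('  BMP code point, one WideChar');
  end;
end.
```

None of the five values falls in the surrogate range, which matches the observation that Length is 5 for 5 visible symbols.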



[fpc-pascal] Cannot write datetime field on sqlite3 database on ARM

2021-03-07 Thread Toru Takubo via fpc-pascal

Hi,

I am developing my app on Windows and building it for other
platforms with a cross compiler. Now I have a problem that occurs
only on Linux ARM.

The problem is that my app cannot write a datetime field to an sqlite3
database. It can read and write other fields like int, varchar
or blob, but it always writes zero to a datetime (maybe float as well)
field.

Does anyone have an idea about this issue? I am not sure whether it is
an FPC issue; would it be better to report a bug?

My observations are as follows:

1. I work with the release versions of Lazarus 2.0.12 and FPC 3.2.0.
2. The target machine runs Raspberry Pi OS on a Raspberry Pi 3 Model B V1.2.
3. My app uses the sqlite3conn and sqldb units.
4. The problem occurs on Linux ARM. It does NOT occur on Windows i386/x86_64,
 Linux i386/x86_64 or Linux AArch64.
5. I installed "DB Browser for SQLite" on the Raspberry Pi as a reference. It
 can write the datetime field normally, and my app can read it.

Toru



Re: [fpc-pascal] Cannot write datetime field on sqlite3 database on ARM

2021-03-07 Thread Michael Van Canneyt via fpc-pascal



On Mon, 8 Mar 2021, Toru Takubo via fpc-pascal wrote:

> Hi,
>
> I am developing my app on Windows and building it for other
> platforms with a cross compiler. Now I have a problem that occurs
> only on Linux ARM.
>
> The problem is that my app cannot write a datetime field to an sqlite3
> database. It can read and write other fields like int, varchar
> or blob, but it always writes zero to a datetime (maybe float as well)
> field.
>
> Does anyone have an idea about this issue? I am not sure whether it is
> an FPC issue; would it be better to report a bug?


It sounds like a floating point problem. As you probably know, a TDateTime
type is actually a double type. Did you try with a float value?

The DB explorer tools probably just use strings to read/write from the
database, so they will not be bothered by such things, but FPC stores dataset
values in 'native' formats in memory.

I don't know what to advise to further investigate the issue. One thing to
try would be to test whether normal float arithmetic or date arithmetic works.
If not, then the compiler people will need to give more advice.
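
A test along those lines could look like the sketch below. It is only an illustration: the values are arbitrary, and it would need to be cross-compiled with the same compiler and options as the real app and run on the ARM board, then compared with the output on a platform where things work.

```pascal
program FloatCheck;
{$mode objfpc}{$H+}
uses
  SysUtils;

var
  d: Double;
  t: TDateTime;
begin
  // Plain double arithmetic: 1.5 * 2 + 0.25 should print 3.2500.
  d := 1.5;
  d := d * 2 + 0.25;
  WriteLn('double:    ', d:0:4);

  // TDateTime is a Double: whole days in the integer part,
  // time of day in the fraction.
  t := EncodeDate(2021, 3, 7) + EncodeTime(12, 0, 0, 0);
  WriteLn('raw value: ', t:0:4);
  WriteLn('formatted: ', FormatDateTime('yyyy-mm-dd hh:nn:ss', t));
  WriteLn('next day:  ', FormatDateTime('yyyy-mm-dd', t + 1));
end.
```

If any of these lines prints zero or garbage on the ARM target only, the problem is in floating point handling rather than in sqldb or SQLite.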

Michael.