RE: Unicode Variation Selector and Combining character

2022-06-01 Thread 荒井元成
Best regards,
-----Original Message-----
From: Daniel Verite
Sent: Wednesday, June 1, 2022 6:46 PM
To: Thomas Munro
Cc: 荒井元成; Peter Eisentraut; PostgreSQL Hackers
Subject: Re: Unicode Variation Selector and Combining character
Thomas Munro wrote:
> Looking aroun

Re: Unicode Variation Selector and Combining character

2022-06-01 Thread Peter Eisentraut
On 01.06.22 08:15, 荒井元成 wrote:
> D209007=# select char_length(U&'\+0066FE' || U&'\+0E0103') ;
>  char_length
> -------------
>            2
> (1 row)
>
> I expect length 1.
The char_length function is defined to return the length in characters, so 2 is the correct answer. What you appear to be looking f
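To make the count in the reply above concrete, here is a minimal Python sketch (not PostgreSQL code). It assumes that counting code points corresponds to what char_length reports for this string in a UTF-8 database, which agrees with the result of 2 quoted in the thread. The variable names are illustrative only.

```python
# The sequence from the report: U+66FE (the base ideograph) followed by
# U+E0103 (a variation selector from the supplementary range).
ivs = "\u66FE\U000E0103"

# Python's len() on a str counts code points, not displayed glyphs.
code_points = len(ivs)

# For comparison: the same string takes 3 + 4 bytes when encoded as UTF-8.
utf8_bytes = len(ivs.encode("utf-8"))

print(code_points, utf8_bytes)
```

The string renders as a single glyph, but it is stored as two code points, which is why a character count and a displayed-glyph count can differ.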

Re: Unicode Variation Selector and Combining character

2022-06-01 Thread Daniel Verite
Thomas Munro wrote:
> Looking around a bit, it might be interesting to check if the
> icu_character_boundaries() function in Daniel Vérité's icu_ext treats
> IVSs as single grapheme clusters.
It does.
with strings(s) as (
 values (U&'\+0066FE' || U&'\+0E0103'),
        (U&'\+00304B' || U
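As a rough illustration of what grouping by grapheme cluster means, here is a simplified Python sketch. It is not icu_ext or ICU code, and it does not implement the full Unicode text segmentation rules (UAX #29): it only attaches variation selectors and combining marks to the preceding code point. The function names are hypothetical, and the second sample string (U+304B followed by U+3099) is my own choice of a combining sequence, not necessarily the one used in the message above.

```python
import unicodedata


def is_extender(ch: str) -> bool:
    """Return True if ch should attach to the preceding code point.

    Simplification: covers only variation selectors and characters whose
    general category is a mark (Mn, Mc, Me).
    """
    cp = ord(ch)
    if 0xFE00 <= cp <= 0xFE0F or 0xE0100 <= cp <= 0xE01EF:
        return True  # variation selector blocks
    return unicodedata.category(ch).startswith("M")


def approx_cluster_count(s: str) -> int:
    """Count code points that start a new (approximate) cluster."""
    count = 0
    for i, ch in enumerate(s):
        # The first code point always starts a cluster, even if it is a mark.
        if i == 0 or not is_extender(ch):
            count += 1
    return count


samples = ["\u66FE\U000E0103", "\u304B\u3099", "abc"]
for s in samples:
    print(len(s), approx_cluster_count(s))
```

For real data a proper implementation such as ICU's break iterator is the safer choice, since this sketch ignores emoji sequences, Hangul jamo, and the other cases the full rules cover.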

Re: Unicode Variation Selector and Combining character

2022-06-01 Thread Thomas Munro
On Wed, Jun 1, 2022 at 7:09 PM Thomas Munro wrote:
> On Wed, Jun 1, 2022 at 6:15 PM 荒井元成 wrote:
> > D209007=# select char_length(U&'\+0066FE' || U&'\+0E0103') ;
> >  char_length
> > -------------
> >            2
> > (1 row)
> >
> > I expect length 1.
>
> No opinion here, but I did happen to see Nor

Re: Unicode Variation Selector and Combining character

2022-06-01 Thread Thomas Munro
On Wed, Jun 1, 2022 at 6:15 PM 荒井元成 wrote:
> D209007=# select char_length(U&'\+0066FE' || U&'\+0E0103') ;
>  char_length
> -------------
>            2
> (1 row)
>
> I expect length 1.
No opinion here, but I did happen to see Noriyoshi Shinoda's slides about this topic a little while ago, comparing

Re: Unicode Variation Selector and Combining character

2022-05-31 Thread Peter Eisentraut
On 30.05.22 02:27, 荒井元成 wrote:
> I tried it on PostgreSQL 13. If you use the Unicode Variation Selector and Combining Character, the base character and the Variation selector will be 2 in length. Since it will be one character on the display, we expect it to be one in length. Please provide a

Unicode Variation Selector and Combining character

2022-05-29 Thread 荒井元成
Hi,
I tried it on PostgreSQL 13. If you use the Unicode Variation Selector and Combining Character, the base character and the Variation selector will be 2 in length. Since it will be one character on the display, we expect it to be one in length. Please provide a function corresponding