At Thu, 1 Sep 2022 15:00:38 +0700, John Naylor <john.nay...@enterprisedb.com> wrote in
> On Thu, Sep 1, 2022 at 2:13 PM Pavel Stehule <pavel.steh...@gmail.com> wrote:
> > problem is in bad width of invisible char 200E
>
> I removed this comment in bab982161e since it didn't match the code.
> I'd be interested to see what happened after v12.
>
> - * - Other format characters (general category code Cf in the Unicode
> - *   database) and ZERO WIDTH SPACE (U+200B) have a column width of 0.
>
> UnicodeData.txt has this:
>
> 200B;ZERO WIDTH SPACE;Cf;0;BN;;;;;N;;;;;
> 200C;ZERO WIDTH NON-JOINER;Cf;0;BN;;;;;N;;;;;
> 200D;ZERO WIDTH JOINER;Cf;0;BN;;;;;N;;;;;
> 200E;LEFT-TO-RIGHT MARK;Cf;0;L;;;;;N;;;;;
> 200F;RIGHT-TO-LEFT MARK;Cf;0;R;;;;;N;;;;;
>
> So maybe we need to take Cf characters in this file into account, in
> addition to Me and Mn (combining characters).
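
To make the effect of the category filter concrete, here is a minimal
Python sketch. It is not the Perl generator itself; the function name
zero_width_ranges and the SAMPLE constant are made up for illustration,
and it only merges strictly consecutive code points. It runs over the
five UnicodeData.txt lines quoted above, first with Me/Mn only and then
with Cf added.

```python
# Hypothetical illustration only; this is NOT the PostgreSQL generator script.
# SAMPLE holds just the five UnicodeData.txt lines quoted in this thread.
SAMPLE = """\
200B;ZERO WIDTH SPACE;Cf;0;BN;;;;;N;;;;;
200C;ZERO WIDTH NON-JOINER;Cf;0;BN;;;;;N;;;;;
200D;ZERO WIDTH JOINER;Cf;0;BN;;;;;N;;;;;
200E;LEFT-TO-RIGHT MARK;Cf;0;L;;;;;N;;;;;
200F;RIGHT-TO-LEFT MARK;Cf;0;R;;;;;N;;;;;
"""


def zero_width_ranges(lines, categories):
    """Return inclusive (first, last) ranges of code points whose
    general category (third field) is in `categories`."""
    points = []
    for line in lines:
        fields = line.strip().split(";")
        if len(fields) >= 3 and fields[2] in categories:
            points.append(int(fields[0], 16))

    ranges = []
    for cp in sorted(points):
        if ranges and cp == ranges[-1][1] + 1:
            # Extend the current range by one code point.
            ranges[-1] = (ranges[-1][0], cp)
        else:
            # Start a new range.
            ranges.append((cp, cp))
    return ranges


before = zero_width_ranges(SAMPLE.splitlines(), {"Me", "Mn"})
after = zero_width_ranges(SAMPLE.splitlines(), {"Me", "Mn", "Cf"})

print(before)
print([(hex(first), hex(last)) for first, last in after])
```

With Me/Mn only, none of the five lines match; once Cf is accepted, they
collapse into the single range U+200B..U+200F.
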

Including the Cf characters in unicode_combining_table.h actually
worked, but I'm not sure it is valid to include Cf's among Mn/Me's.

> diff --git a/src/common/unicode/generate-unicode_combining_table.pl
> b/src/common/unicode/generate-unicode_combining_table.pl
> index 8177c20260..7030bc637b 100644
> --- a/src/common/unicode/generate-unicode_combining_table.pl
> +++ b/src/common/unicode/generate-unicode_combining_table.pl
> @@ -25,7 +25,7 @@ foreach my $line (<ARGV>)
>      my @fields = split ';', $line;
>      $codepoint = hex $fields[0];
>
> -    if ($fields[2] eq 'Me' || $fields[2] eq 'Mn')
> +    if ($fields[2] eq 'Me' || $fields[2] eq 'Mn' || $fields[2] eq 'Cf')
>      {
>          # combining character, save for start of range
>          if (!defined($range_start))

By the way, I was quite annoyed at how hard it was to get changes under
src/common reflected in the final binary. There are two hops of missing
dependencies, and finally ccache stood in my way. I see that Andres
once meant to try that using --dependency-files, but I hope we can make
that reflection automatic even if we end up defining the dependencies
manually.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center