On Thu, Jul 22, 2021 at 0:12, Jacob Champion <pchamp...@vmware.com>
wrote:

> On Wed, 2021-07-21 at 00:08 +0000, Jacob Champion wrote:
> > I note that the doc comment for ucs_wcwidth()...
> >
> > >  *    - Spacing characters in the East Asian Wide (W) or East Asian
> > >  *          FullWidth (F) category as defined in Unicode Technical
> > >  *          Report #11 have a column width of 2.
> >
> > ...doesn't match reality anymore. The East Asian width handling was
> > last updated in 2006, it looks like? So I wonder whether fixing the
> > code to match the comment would not only fix the emoji problem but also
> > a bunch of other non-emoji characters.
>
> Attached is my attempt at that. This adds a second interval table,
> handling not only the emoji range in the original patch but also
> correcting several non-emoji character ranges which are included in the
> 13.0 East Asian Wide/Fullwidth sets. Try for example
>
> - U+2329 LEFT-POINTING ANGLE BRACKET
> - U+16FE0 TANGUT ITERATION MARK
> - U+1B000 KATAKANA LETTER ARCHAIC E
>
> This should work reasonably well for terminals that depend on modern
> versions of Unicode's EastAsianWidth.txt to figure out character width.
> I don't know how it behaves on BSD libc or Windows.
>
> The new binary search isn't free, but my naive attempt at measuring the
> performance hit made it look worse than it actually is. Since the
> measurement function was previously returning an incorrect (too short)
> width, we used to get a free performance boost by not printing the
> correct number of alignment/border characters. I'm still trying to
> figure out how best to isolate the performance changes due to this
> patch.
>
> Pavel, I'd be interested to see what your benchmarks find with this
> code. Does this fix the original issue for you?
>

This patch fixes the badly formatted tables with emoji.

I reviewed this patch. It is correct and a step forward, because it
sets the intervals of double-width characters dynamically, and the code is
more readable.
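
For readers following along, here is a minimal sketch of the interval-table
idea, not the code from the patch. The names (struct interval,
wide_excerpt, in_interval_table, sketch_width) are my own placeholders, and
the table is a small hand-picked excerpt of wide ranges rather than the full
Unicode 13.0 set. It also ignores combining and control characters, which
the real function has to handle.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Closed range [first, last] of code points. */
struct interval
{
	uint32_t	first;
	uint32_t	last;
};

/*
 * Hypothetical excerpt of East Asian Wide/Fullwidth ranges. It must stay
 * sorted and non-overlapping for the binary search below to work. A real
 * table would be generated from EastAsianWidth.txt.
 */
static const struct interval wide_excerpt[] = {
	{0x1100, 0x115F},			/* Hangul Jamo initial consonants */
	{0x2329, 0x232A},			/* angle brackets */
	{0x3041, 0x3096},			/* Hiragana */
	{0x4E00, 0x9FFF},			/* CJK Unified Ideographs */
	{0xFF01, 0xFF60},			/* fullwidth forms */
	{0x16FE0, 0x16FE1},			/* Tangut and Nushu iteration marks */
	{0x1B000, 0x1B0FF},			/* Kana Supplement */
	{0x1F600, 0x1F64F},			/* emoticons */
};

/* Return 1 if ucs falls inside any interval of the sorted table, else 0. */
int
in_interval_table(uint32_t ucs, const struct interval *table, size_t len)
{
	size_t		lo = 0;
	size_t		hi = len;		/* search window is [lo, hi) */

	assert(table != NULL);

	while (lo < hi)
	{
		size_t		mid = lo + (hi - lo) / 2;

		if (ucs > table[mid].last)
			lo = mid + 1;
		else if (ucs < table[mid].first)
			hi = mid;
		else
			return 1;
	}
	return 0;
}

/* Simplified width: 2 for code points in the excerpt, 1 for anything else. */
int
sketch_width(uint32_t ucs)
{
	size_t		n = sizeof(wide_excerpt) / sizeof(wide_excerpt[0]);

	return in_interval_table(ucs, wide_excerpt, n) ? 2 : 1;
}
```

With this excerpt, sketch_width(0x1F600) gives 2 and sketch_width(0x41)
gives 1. Each lookup costs one extra binary search, logarithmic in the
number of intervals.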

I also checked performance. Although there is a measurable slowdown, it is
negligible in absolute terms. The previous code was slightly faster because
it checked fewer ranges, but it was not fully correct or up to date.

The patch applied without problems.
There are no regression tests, but I am not sure that they are necessary
in this case.
make check-world passed without problems.

I'll mark this patch as Ready for Committer.

Regards

Pavel





>
> --Jacob
>
