Alexander V. Lukyanov wrote:
> > (Giving the BULLET a width of 2 is a bit strange, but not really wrong.)
>
> Well, it does not seem to match current xterm behavior, and thus leads to
> strange visual results. I don't know, maybe it is an xterm problem, but the
> easiest way was to substitute wcwidth.

Probably the Solaris wcwidth is made to match some Japanese terminal
emulators, rather than xterm? In such terminal emulators, many characters
that have width 1 in xterm are represented with width 2. U+2022 (BULLET) is
designated as "ambiguous width" in Unicode 5.0.0
(ftp.unicode.org ArchiveVersions/5.0.0/ucd/extracted/DerivedEastAsianWidth.txt)
therefore I don't want to consider Solaris wrong here. You have to understand
that wcwidth is only an approximation because different terminal emulators
behave differently.

> > > BTW, why not use this one: http://www.cl.cam.ac.uk/~mgk25/ucs/wcwidth.c ?
> > > It's public domain.
> >
> > It has also its bugs [1]. Additionally, it's slower because it uses binary
> > search rather than immediate table accesses.
>
> Let's measure it.
>
> $ time ./wcwidth-solaris
> wcwidth(0x2022)=2
>
> real 0m2.205s
> user 0m2.200s
> sys 0m0.000s
>
> $ time ./wcwidth-rpl
> wcwidth(0x2022)=1
>
> real 0m55.477s
> user 0m55.350s
> sys 0m0.000s
>
> $ time ./wcwidth-mk
> wcwidth(0x2022)=1
>
> real 0m1.944s
> user 0m1.940s
> sys 0m0.010s

This is not a fair comparison: wcwidth-mk works only in UTF-8 locales,
whereas wcwidth() from the system and from gnulib return the right result
in all locales. The test whether the locale encoding is UTF-8 is precisely
what takes up most time in the gnulib replacement.

Bruno
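
For illustration only, the difference between binary search and immediate
table access mentioned above can be sketched roughly as follows. This is a toy
under stated assumptions, not the code of wcwidth.c, gnulib, or Solaris: the
names width_bsearch, width_direct, build_width_table and TABLE_SIZE are
hypothetical, the interval list is a tiny hand-picked sample, and only widths
0 and 1 are distinguished (no control characters, no wide or ambiguous-width
handling). A production table would presumably be compressed rather than
stored as one flat array.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily abbreviated sample of zero-width ranges, sorted and
   non-overlapping. A real table has many more entries. */
struct interval { unsigned int first, last; };

static const struct interval zero_width[] = {
    { 0x0300, 0x036F },  /* Combining Diacritical Marks block */
    { 0x200B, 0x200F },  /* ZERO WIDTH SPACE .. RIGHT-TO-LEFT MARK */
    { 0x20D0, 0x20EF },  /* part of Combining Diacritical Marks for Symbols */
};
#define N_INTERVALS (sizeof zero_width / sizeof zero_width[0])

/* Variant 1: binary search over the interval table, O(log n) per call. */
int width_bsearch(unsigned int ucs)
{
    size_t lo = 0, hi = N_INTERVALS;    /* candidates are in [lo, hi) */
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (ucs > zero_width[mid].last)
            lo = mid + 1;
        else if (ucs < zero_width[mid].first)
            hi = mid;
        else
            return 0;                   /* inside a zero-width range */
    }
    return 1;
}

/* Variant 2: one array access per call. The flat array covers only code
   points below TABLE_SIZE; everything above defaults to width 1. */
#define TABLE_SIZE 0x2100u
static unsigned char width_table[TABLE_SIZE];

/* Must be called once before width_direct(). */
void build_width_table(void)
{
    for (unsigned int c = 0; c < TABLE_SIZE; c++)
        width_table[c] = 1;
    for (size_t i = 0; i < N_INTERVALS; i++) {
        /* The sample ranges must fit inside the flat array. */
        assert(zero_width[i].last < TABLE_SIZE);
        for (unsigned int c = zero_width[i].first; c <= zero_width[i].last; c++)
            width_table[c] = 0;
    }
}

int width_direct(unsigned int ucs)
{
    return ucs < TABLE_SIZE ? width_table[ucs] : 1;
}
```

After one call to build_width_table(), both variants agree on every code
point. Both return 1 for U+2022 here, but only because the sample has no
notion of ambiguous width. Neither variant performs any locale check, so
timing them says nothing about the cost of a wcwidth() that has to work in
all locales.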