(CCing Daiki Ueno and the groff list, since I have some comments on the patch and this is a convenient hook for them.)

On Fri, Nov 26, 2010 at 10:51:25PM +0900, Kenshi Muto wrote:
> You may already red at groff list and perhaps it is bit late for
> Squeeze, but Daiki Ueno created a patch for handling CJK wide
> characters nicely based on your Charclass.
> http://lists.gnu.org/archive/html/groff/2010-11/msg00018.html
>
> I received a patch from him and tested. It worked well, particularly
> for Japanese and Korean. As far as my tests, it won't break any other
> languages (tested with C, fr_FR.UTF-8, nl_NL.UTF-8, and ru_RU.UTF-8.)
>
> Here is a debdiff against 1.20.1-10.
>
> It is very helpful for CJK users that this patch is merged to
> Squeeze groff.

Thanks for letting me know! This is definitely good to see.

http://lists.debian.org/debian-devel-announce/2010/11/msg00006.html
makes it painfully clear that it's too late for squeeze, but I don't
mind looking at this for unstable (and somebody can always build
backport packages or whatever, rather than trying to rush it in now and
then realising that it's broken in some way).

One thing I think is wrong about this patch is:

  - nroff: supply "-mja" to groff if running under Japanese locales.

We should be trying to reduce the cases where Japanese is handled
uniquely, particularly in code rather than in configuration such as
macro files, and this change introduces one. Furthermore, relying on
the locale is a problematic approach we should be trying to escape, as
it makes it harder to test Japanese pages in English locales or
vice-versa.

The agreed approach for this kind of thing can be found in this thread
(the initial patch wasn't applied, but look further down the thread for
the conclusion):

  http://lists.gnu.org/archive/html/groff/2009-02/msg00044.html

Now that Fedora has at last switched to man-db (cause for celebration
in these quarters!), it can take advantage of this, and there should be
no need for this change to nroff; man is the only case where I think
it's particularly onerous to have to manually supply options to nroff.
However, because man has to have a version test to avoid spurious
warning messages, this will only work once groff 1.20.2 or newer is
released, which has taken much longer than I expected when I made that
change to man-db 2.5.4 to take advantage of the new 'file' warning
category.

Werner, is there any chance that you might be able to release 1.20.2 in
the near future? It's been nearly two years since 1.20.1. Are there
any specific blockers (perhaps I could help), or is it just a lack of
time? I certainly understand the latter, but would really like to be
able to take advantage of the change above ...

Relying on wcwidth is a problem for the same reasons, I think. It
means that you can only use devutf8 correctly in a UTF-8 locale. That
has not historically been a requirement (and indeed IMO it was one of
the major problems with the old multibyte patch), and I think we should
try to avoid it becoming one because there are some practical problems
with this.

I have to say I agree with Werner in
http://lists.gnu.org/archive/html/groff/2010-08/msg00000.html
when he suggests that this would be better done some other way. Fonts
aren't ideal, though, because then we'd have to have separate font
files for Japanese. Perhaps you could add a new charinfo flag and set
this in ja.tmac using a character class? I don't know if that design
is perfect either, but my feeling is that this kind of problem is why
we came up with the idea of character classes in the first place.

Thanks,

--
Colin Watson [cjwat...@debian.org]