Colin Watson <[EMAIL PROTECTED]> writes:

> Right. Here's an update; I think I've captured most of the discussion in
> the thread so far. The following patch could in principle be applied
> now, given seconds. Wordsmithing welcome, as I'm aware that this is a
> rather dense recommendation; I'm also looking for seconds for this
> proposal.

This proposal and patch look good to me, although I'd prefer to see a few
more seconds before I queue it up for applying to Policy 3.7.4.

> I'm still open to whether new-world-order pages should go in
> /usr/share/man/LL.UTF-8 or just /usr/share/man/LL. Pros for LL.UTF-8:
>
> * Non-compliant implementations (I'm guessing xman, yelp, etc.) will
>   display English manual pages rather than misencoded garbage. This
>   might not be such a big deal for European languages, but for e.g.
>   Japanese I suspect most people would prefer English to the spew you
>   get by trying to interpret UTF-8 as EUC-JP.

I'd rather fix the other implementations, frankly. All of Debian is
moving towards UTF-8, as is all of the rest of the Linux world, and I'd
rather not leave transitional measures around forever.

> * Determining progress towards universal UTF-8 encoding can trivially
>   be done by scanning Contents files rather than having to unpack the
>   archive and run iconv over everything.

Yeah, but we already have an unpacked version of the archive available in
the lintian lab, so doing this isn't too bad.

> * In the event that we later want to migrate to yet another
>   "universal" encoding that can't be automatically distinguished from
>   UTF-8, we already have the encoding name right there and migration
>   will be straightforward. (I think this is an unlikely scenario.)

Yes, this seems extremely unlikely to me. UTF-8 isn't perfect, but it
seems to have reached the "good enough" level that people will work
around its flaws rather than replace it with something else.

> I think I am increasingly leaning towards just using /usr/share/man/LL,
> seeing as man has to try decoding pages there as UTF-8 first anyway, but
> please comment if you care.

I agree with this position.

> Unfortunately 2.5.0 wasn't quite enough.
> Aside from a couple of stupid
> bugs (mostly fixed now), it turns out that we need an extra feature to
> allow debhelper to produce UTF-8 versions of manual pages without
> needing the source encoding to be explicitly specified, by guessing the
> encoding in the same way that man does:
>
>   http://lists.debian.org/debian-i18n/2007/10/msg00063.html
>
> I committed this feature to my development trunk earlier today, and will
> be working on a 2.5.1 release over the next couple of weeks. After that
> I'll send Joey a patch for debhelper.

It sounds like the same feature could be used by other man implementations
that currently can't deal with UTF-8.

The transition plan looks good to me.

-- 
Russ Allbery ([EMAIL PROTECTED])             <http://www.eyrie.org/~eagle/>
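
For anyone who wants to see what "try decoding as UTF-8 first, then guess" might look like in practice, here is a minimal sketch in Python. It is not man-db's or debhelper's actual code. The function names and the language-to-encoding table are hypothetical and exist only for illustration; the only idea taken from the thread is that a page under /usr/share/man/LL is tried as UTF-8 first and, if that fails, decoded with a legacy encoding guessed from the language directory.

```python
from typing import Tuple

# Hypothetical fallback table (language code -> legacy encoding).
# These entries are illustrative assumptions, not man-db's real table.
LEGACY_ENCODINGS = {
    "ja": "euc_jp",
    "ru": "koi8_r",
    "pl": "iso8859_2",
}
DEFAULT_LEGACY = "latin_1"  # assumed default for languages not listed


def guess_and_decode(raw: bytes, lang: str) -> Tuple[str, str]:
    """Return (text, encoding_used) for a manual page's raw bytes.

    UTF-8 is tried first; only if the bytes are not valid UTF-8 do we
    fall back to the legacy encoding guessed from the language code.
    """
    try:
        return raw.decode("utf-8"), "utf-8"
    except UnicodeDecodeError:
        encoding = LEGACY_ENCODINGS.get(lang, DEFAULT_LEGACY)
        # errors="replace" keeps the sketch from raising on stray bytes.
        return raw.decode(encoding, errors="replace"), encoding


def to_utf8(raw: bytes, lang: str) -> bytes:
    """Recode a page to UTF-8 without being told the source encoding."""
    text, _ = guess_and_decode(raw, lang)
    return text.encode("utf-8")
```

Trying UTF-8 first works because text in the common legacy 8-bit and EUC encodings is rarely valid UTF-8 by accident, so a failed UTF-8 decode is a reasonable signal to fall back. Pure ASCII pages decode the same either way.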