Hello Alexander,

I will consider your request; however, do not expect it to be in the first UTF-8 release. First we get the basics right, then we start adding features.
Thank you for pointing out the use case, though. I had a related thing in mind: an RU system whose users choose different codepages in the same locale, which I think I have a strategy to handle. Essentially, before going your way, I want to improve the ENV parsing that man does today; it needs it. Once that is done, I will consider what other issues are left on the table.

-f

> -----Original Message-----
> From: Alexander E. Patrakov [mailto:[EMAIL PROTECTED]
> Sent: Saturday, January 21, 2006 06:22 AM
> To: 'Federico Lucifredi'
> Cc: 'Jim Gifford', 'LFS Developers Mailinglist'
> Subject: Re: Man 1.6 + UTF-8
>
> Federico Lucifredi wrote:
> (please CC: at least lfs-dev)
>
> >Hello Jim, Alex,
> >  Yes, I have plans for (1) and (2), so there should be no particular
> >problem getting that fixed. I have not thought extensively of the
> >interaction problems with groff, tho, so that is next on the list. For
> >(3), I would like to know exactly what behavior you would like to see --
> >install time switch, commandline at invocation time, or what else.
> >
> >
> >>3) Feature request: it would be very nice if LFS obtains a way to tell
> >>Man to ignore /usr/share/man/ja/* even in Japanese locales, because the
> >>system's Groff-1.19.{2,3cvs} can't format those manuals. This also
> >>applies to other languages, and maybe it is better to implement as a
> >>whitelist, not blacklist. This whitelist should be different for
> >>printing and display purposes.
> >>
> >>
> >>
> I want a new option in man.conf. Basically, it's a way for root of a box
> (which, e.g., has many Russian users but allows one German person to ssh
> in and override $LANG), to say: "I have created a man setup that works
> only for Russian, don't attempt to use it in other situations". The idea
> is to never misformat a manual page unless the "yes --help" output would
> also be misformatted. Fallback to English is much better than misformatting.
>
> This approach is different from the one of Man-DB where it knows good
> defaults for many locales (but still misformats Chinese manuals in
> zh_CN.UTF-8).
>
> Let's name the option "TRANSLATIONS" for now. Its value should be a
> colon-separated list of the following allowable items:
>
> 1) language names, such as de, fr,..., that contain only letters.
> 2) the special tokens "8bit" and "utf8"
> 3) exact locale names, such as ja_JP. They can be distinguished from (1)
> by the presence of non-letters.
>
> Similarly, HARDCOPY_TRANSLATIONS can list the same items.
>
> The distributor or whoever else creates man.conf should set these
> options to the list of languages/locales where manual pages are known to
> be formatted properly by *roff and other programs mentioned in man.conf.
> See below for the exact meaning.
>
> Currently, Man looks at locale environment variables and the LANGUAGE
> variable to determine the list of search paths for translated manual
> pages, and picks up the first manual page that exists in the search
> path. Proposed change: ignore bad manual pages, according to the
> following rules:
>
> 1) if MB_CUR_MAX==1 in the current locale, and the special "8bit" token
> is present in TRANSLATIONS, any manual page is good. The use case is the
> current -Tlatin1 setup which formats all manuals properly in 8-bit
> locales if they are stored in 8-bit language-specific encodings.
> 2) if the current locale is a UTF-8 based locale, and the "utf8" token
> is present in TRANSLATIONS, any manual page is good. The use case is
> RedHat's groff.
> 3) if the exact current locale is listed in TRANSLATIONS, and the manual
> page doesn't come from the LANGUAGE environment variable, it is good.
> The use case is: format Japanese manuals with -Tnippon on Debian-patched
> groff.
> 4) if the manual page's language is listed as a language (not locale!)
> in TRANSLATIONS, it is good.
> Use case 1: no -Txxxx switch, rely upon the
> nroff script itself to figure out the correct device for
> ISO-8859-1-encoded manuals. Use case 2: the -K switch for the new groff
> (because the argument correct for Russian manual pages is wrong for
> German ones).
> 5) English manual pages are always good.
>
> BTW, I would like to avoid (4.2) and the whole need for the
> administrator to hardcode the right -K ... argument, because this can be
> deduced from the manual page location. The old idea about
> /usr/share/man/$lang/.charset file may help, maybe it would be a good
> idea to introduce the "%C" token that would expand to that, and "%K"
> that would expand to "-K %C" if "%C" expands to non-empty string. The
> default value for "%C" for manual pages specified with the full path is
> a tough question, I will think more about it. Maybe: empty string.
>
> Any other manual page is bad and should be treated as a non-existing file.
>
> The good default would be (assuming the Debian-specific manual page
> encoding, -Tlatin1 default nroff argument, and the "less -isR" pager
> that is able to convert from ISO-8859-1 to UTF-8 on the fly):
>
> TRANSLATIONS 8bit:da:de:en:es:fi:fr:ga:gl:id:is:it:nb:nl:nn:no:pt:sv
> HARDCOPY_TRANSLATIONS da:de:en:es:fi:fr:ga:gl:id:is:it:nb:nl:nn:no:pt:sv
>
> --
> Alexander E. Patrakov
>
--
http://linuxfromscratch.org/mailman/listinfo/lfs-dev
FAQ: http://www.linuxfromscratch.org/faq/
Unsubscribe: See the above information page
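
For reference, the five "good page" rules from the quoted TRANSLATIONS proposal could be sketched roughly as below. This is only an illustration in Python, not code from Man: the names LocaleInfo, parse_list and page_is_good are hypothetical, the locale facts (MB_CUR_MAX, whether the locale is UTF-8 based) are passed in rather than queried, "exact locale" is assumed to mean a plain string comparison, and a page with no language directory is assumed to be English.

```python
from dataclasses import dataclass
from typing import List, Optional

# The two special tokens named in the proposal.
SPECIAL_TOKENS = {"8bit", "utf8"}


@dataclass
class LocaleInfo:
    """Hypothetical container for facts about the current locale."""
    name: str         # e.g. "ru_RU.KOI8-R"
    mb_cur_max: int   # value of MB_CUR_MAX in that locale
    is_utf8: bool     # True if the locale is UTF-8 based


def parse_list(value: str) -> List[str]:
    """Split a colon-separated TRANSLATIONS value, dropping empty items."""
    return [item for item in value.split(":") if item]


def page_is_good(page_lang: Optional[str], from_language_var: bool,
                 locale: LocaleInfo, translations: str) -> bool:
    """Return True if a manual page may be used under the proposed rules.

    page_lang is the language directory of the page ("de", "ja", ...);
    None is assumed here to mean an untranslated (English) page.
    from_language_var says whether the page was found via LANGUAGE.
    """
    items = parse_list(translations)
    # Items made only of letters are language names; anything else that is
    # not a special token is treated as an exact locale name.
    languages = [i for i in items if i.isalpha()]
    exact_locales = [i for i in items
                     if i not in SPECIAL_TOKENS and not i.isalpha()]

    if page_lang is None or page_lang == "en":            # rule 5
        return True
    if locale.mb_cur_max == 1 and "8bit" in items:        # rule 1
        return True
    if locale.is_utf8 and "utf8" in items:                # rule 2
        return True
    if locale.name in exact_locales and not from_language_var:  # rule 3
        return True
    if page_lang in languages:                            # rule 4
        return True
    return False  # any other page is treated as non-existing
```

With the default TRANSLATIONS value from the quoted message, a Russian page in an 8-bit Russian locale passes via rule 1, while a Japanese page in a UTF-8 Japanese locale is rejected, so man would fall back to the English page.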
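
The "%C" / "%K" idea from the quoted message could look something like the following. Again a hedged sketch in Python only: read_charset and expand_tokens are hypothetical names, the .charset file is assumed to hold a single encoding name, and the empty-string default for pages given by full path is the "maybe" from the proposal, not a settled decision. The command line in the usage note is made up for illustration.

```python
import os
from typing import Optional


def read_charset(lang_dir: Optional[str]) -> str:
    """Return the encoding named in <lang_dir>/.charset, or "" if unknown.

    lang_dir is a directory such as /usr/share/man/ru; None stands for a
    page specified by full path, for which the empty string is assumed.
    """
    if lang_dir is None:
        return ""
    try:
        with open(os.path.join(lang_dir, ".charset"), encoding="ascii") as f:
            return f.read().strip()
    except OSError:
        # A missing or unreadable file means no charset information.
        return ""


def expand_tokens(command: str, charset: str) -> str:
    """Expand %K and %C in a man.conf-style command string.

    %C becomes the charset; %K becomes "-K <charset>" only when the
    charset is non-empty, and nothing otherwise.
    """
    k_value = "-K " + charset if charset else ""
    # Replace %K first so the "-K ..." text is built from the charset
    # directly rather than from a second pass over %C.
    return command.replace("%K", k_value).replace("%C", charset)
```

For example, expand_tokens("nroff %K -mandoc", read_charset("/usr/share/man/ru")) would add a -K argument only if that directory carries a .charset file, so the administrator would not need to hardcode it.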