Sato-san,

Sorry for the top post, but your message would make an excellent intro to
i18n in one of our developer guides.

Warner

On Mon, Jul 2, 2018, 11:13 AM Hiroki Sato <h...@freebsd.org> wrote:

> 後藤大地 <daichig...@icloud.com> wrote
>   in <459bd898-8072-426e-a968-96c1382ac...@icloud.com>:
>
> da> > On 2018/07/02 15:55, Hiroki Sato <h...@freebsd.org> wrote:
> da> >
> da> > Eitan Adler <li...@eitanadler.com> wrote
> da> >   in <CAF6rxg=Zjkf6EbSgt1fBQBUDHGKWwLf=n9zjwejh+di800k...@mail.gmail.com>:
> da> >
> da> > li> On 1 July 2018 at 10:08, Conrad Meyer <c...@freebsd.org> wrote:
> da> > li> > Hi Daichi,
> da> > li> >
> da> > li> > I don't think code to decode UTF-8 belongs in top(1).  I don't know
> da> > li> > what the goal of this routine is, but I doubt this is the right way to
> da> > li> > accomplish it.
> da> > li>
> da> > li> For the record, I agree.  This is why I didn't click "accept" on the
> da> > li> revision.  I don't fully oppose leaving it in top(1) for now as we work
> da> > li> out the API, but long term it's the wrong place.
> da> > li>
> da> > li> https://reviews.freebsd.org/D16058 is the review.
> da> >
> da> >  I strongly object to this kind of encoding-specific routine.  Please
> da> >  back it out.  The problem is that top(1) does not support multibyte
> da> >  encodings in its printing functions, and using the C99 wide/multibyte
> da> >  character manipulation API such as iswprint(3) is the way to solve
> da> >  it.  Doing getenv("LANG") and assuming an encoding based on it is a
> da> >  very bad practice for internationalizing software.
> da> >
> da> > -- Hiroki
> da>
> da> I respect what you mean.
> da>
> da> Once I back it out, I will begin implementing it in a different way.
> da> Please advise which functions should be used for the implementation
> da> (iswprint(3), and what other functions?)
>
>  Roughly speaking, the POSIX/XPG/C99 I18N model requires the following
>  steps:
>
>  1. Call setlocale(LC_ALL, "") first.
>
>  2. Use the mbs<->wcs and/or mb<->wc conversion functions in C95/C99 to
>     manipulate characters and strings, depending on what you want to
>     do.  The printable() function should use mbtowc(3) and
>     iswprint(3), for example.  And wcslen(3) should be used instead of
>     strlen() to determine the length, in characters, of the string to
>     be printed.
>
>     Note that if an mbs->wcs or mb->wc conversion fails with EILSEQ at
>     some point, some of the characters are invalid for printing.
>     This can happen because command-line parameters shown by top(1) are
>     not always encoded in the encoding specified by LC_CTYPE or LANG.
>     Such characters should also be handled as non-printable.  However,
>     to make matters worse, processes do not always use the same locale
>     as top(1).  A process invoked with LANG=ja_JP.eucJP may have EUC-JP
>     characters in its ARGV array even if top(1) is run by another user
>     whose LANG is en_US.UTF-8.  You have to determine which locale
>     should be used before doing the mb->wc conversion.  It is not so
>     simple.
>
>  3. Print the multibyte characters by using the strvisx(3) family, which
>     supports multibyte characters, or the swprintf(3) family if you want
>     to format wide characters directly.  Note that the buffer length for
>     strvisx(3) must be calculated by using MB_LEN_MAX.
>
>  I recommend learning about I18N by reading the following documents,
>  since this involves an I18N programming model, not just a matter of
>  which function should be used.  While they are quite old and contain
>  system-specific topics, they are still useful for understanding the
>  general overview of how XPG4 and the relevant C95/C99 APIs work:
>
>  [1] Developer's Guide to Internationalization (801-6660)
>      https://docs.oracle.com/cd/E19457-01/801-6660/801-6660.pdf
>
>  [2] Software Internationalization Guide (526225-002)
>      https://support.hpe.com/hpsc/doc/public/display?docId=emr_na-c02131936
>
>  [3] ISO/IEC 9899:TC2 draft (p.204, Sec. 7.11 Localization)
>      http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1124.pdf
>
>  [4] Internationalization Guide, Version 2
>      ISBN: 978-0133535419
>
> -- Hiroki

_______________________________________________
svn-src-head@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/svn-src-head
To unsubscribe, send any mail to "svn-src-head-unsubscr...@freebsd.org"
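
For anyone following along, step 2 of Hiroki's outline might look roughly like
the sketch below. It is only an illustration, not code from top(1) or from
D16058: sanitize_mbs() is a hypothetical name, and it uses mbrtowc(3), the
restartable variant of the mbtowc(3) mentioned above, so that the conversion
state can be reset after an error. It assumes the program has already called
setlocale(LC_ALL, "") once at startup, and it does not attempt to solve the
harder problem of a process whose arguments are in a different encoding than
top(1)'s own locale; such bytes simply come out as '?'.

```c
#include <assert.h>
#include <string.h>
#include <wchar.h>
#include <wctype.h>

/*
 * Hypothetical helper: copy src to dst, replacing every non-printable
 * character and every byte that is not valid in the current LC_CTYPE
 * encoding with '?'.  Returns the number of characters (not bytes)
 * written to dst, which is always NUL-terminated when dstsize > 0.
 * The output is never longer than the input, so strlen(src) + 1 bytes
 * are always enough for dst.
 */
size_t
sanitize_mbs(const char *src, char *dst, size_t dstsize)
{
	mbstate_t st;
	wchar_t wc;
	size_t left, out = 0, nchars = 0;

	assert(src != NULL && dst != NULL);
	if (dstsize == 0)
		return (0);
	left = strlen(src);
	memset(&st, 0, sizeof(st));

	while (left > 0) {
		size_t n = mbrtowc(&wc, src, left, &st);
		int printable = 1;

		if (n == (size_t)-1 || n == (size_t)-2) {
			/* EILSEQ or truncated sequence: skip one byte. */
			memset(&st, 0, sizeof(st));
			n = 1;
			printable = 0;
		} else if (n == 0) {
			break;		/* embedded NUL; cannot happen here */
		} else if (!iswprint((wint_t)wc)) {
			printable = 0;
		}

		size_t need = printable ? n : 1;
		if (out + need >= dstsize)
			break;		/* keep room for the terminating NUL */
		if (printable)
			memcpy(dst + out, src, n);
		else
			dst[out] = '?';
		out += need;
		src += n;
		left -= n;
		nchars++;
	}
	dst[out] = '\0';
	return (nchars);
}
```

A caller would pass each command-line string through sanitize_mbs() before
printing and use the returned character count, rather than strlen(), when it
needs the length in characters. For step 3, the strvisx(3) family remains the
better fit on FreeBSD when escaped output is wanted instead of '?'; it is left
out here only to keep the sketch to standard C.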