Hi,

At Fri, 23 Aug 2002 13:50:19 +0200,
Gerfried Fuchs wrote:

> Sorry for my late response, but I'm just questioning what the advantage
> for the users might be? Those reading the english pages usually don't
> know what to do with the names in kanji, those who know kanji usually
> don't read the english pages anyway and/or should be happy with the
> "translated" names nevertheless.

There are several reasons.

A. Cultural Aspects.

1. There seem to be a certain number of people who enjoy knowing how a
person's name is written in its original, correct form, even though
they cannot read it. There are, of course, a small number of people
who can read it.

2. People who cannot read Kanji (Cyrillic, Greek, Thai, Hangul, etc.)
can simply ignore them, because the Latin transcription is also
written. They are harmless for such readers. (Latin transcriptions are
sometimes more important than the original form, because they are what
is usually used on English mailing lists, as I am doing now.)

3. Japanese (and I imagine Chinese) people tend to want to know
Japanese and Chinese names in Kanji, even though most Japanese people
cannot read Chinese and vice versa. Thus, Japanese people want to read
Chinese names in Kanji on English web pages, and Chinese people want to
read Japanese names in Kanji. I don't know how Korean people feel.
(Korean people have names in Kanji, but they often write their names
in Hangul.)

4. As Osamu pointed out, it is impossible to convert the Latin
transcription of a Japanese name back into the original Kanji
algorithmically. (One exception: for ISHIKAWA Mutsumi, I used the
Hiragana form, because he always uses Hiragana in the Japanese Linux
communities.)

B. Technical Aspects.

5. So far, in an age when Unicode is becoming popular but is not yet
widespread, ASCII characters are the only characters that are truly
portable throughout the world. In other words, any non-ASCII character
has some chance of not being displayed. However, accented letters,
which are not ASCII characters, are sometimes used in English pages.
Apparently, such characters cannot be displayed by the text browsers of
people whose languages use non-Latin scripts, such as Asians and
Russians. However, we don't complain about that. From the viewpoint of
equality, if ISO-8859-1 local characters are permitted, Kanji should
also be permitted. Of course, internationalized software can display
both accented letters and Kanji.

6. About the &#xxxxx; notation. It is painful to prepare such
expressions. However, they are written in ASCII characters and are
harmless for translators (just copy and paste them). FYI, I used the
following Perl script to prepare the &#xxxxx; form of Japanese names.
For Russian and Chinese names, I needed more tricks.

-----
#!/usr/bin/perl
use strict;
use warnings;
use Text::Iconv;

# Ask for little-endian UCS-2 explicitly, so that the byte order of the
# converted string does not depend on the machine.
my $converter = Text::Iconv->new("EUC-JP", "UCS-2LE");

while (my $line = <>) {
    my $string = $converter->convert($line);
    next unless defined $string;   # skip lines iconv could not convert
    # Each UCS-2 character is one 16-bit little-endian unit.
    for my $c (unpack("v*", $string)) {
        if ($c > 126) { print '&#', $c, ';'; }   # numeric reference
        else          { print chr($c); }         # plain ASCII as-is
    }
}
-----

Please change "EUC-JP" to your favorite encoding. Note that the script
asks iconv for little-endian output ("UCS-2LE") and decodes it as
such, so that it does not depend on the byte order of the machine; a
plain "UCS-2" might not work well on big-endian systems. This script
Depends: on the libtext-iconv-perl package.

7. I hope more people will become aware of the fact that Asian (and
other non-Latin-alphabet-language-speaking) people exist and that they
also use Debian. I hope people (especially developers) will feel
something about the problems we have. European-language-speaking
developers sometimes tend to develop software that cannot handle
Kanji.

* "Kanji", "Hanzi", and "Hanja" are the Japanese, Chinese, and Korean
words for the Chinese-originated system of ideograms (CJK Han
Ideogram). They are sometimes called "Chinese characters".

---
Tomohiro KUBOTA <[EMAIL PROTECTED]>
http://www.debian.or.jp/~kubota/
"Introduction to I18N"
http://www.debian.org/doc/manuals/intro-i18n/