Hi,

Some additions.

From: Tomohiro KUBOTA <[EMAIL PROTECTED]>
Subject: Re: search.debian.org is online
Date: Mon, 30 Dec 2002 19:53:31 +0900 (JST)

> > > Note that, if this problem is fixed, Korean people will benefit very
> > > much even if the word-separation problem is not fixed.
> > I don't understand. Are you saying that Korean uses two-byte characters
> > but doesn't have spaces in words and should be ok now?
>
> The current version of Debian search site has two problems for east Asian
> languages:

I meant that Korean uses two-byte characters but DOES have spaces
between words and should be ok now. (Chinese and Japanese use two-byte
characters and DON'T have spaces between words.)

> However, two-byte search doesn't always fail. For example, I reported
> in http://lists.debian.org/debian-www/2002/debian-www-200212/msg00256.html
> that I can search my name. I guess the condition when a search succeeds
> or fails depends on whether the Japanese word is written in normal EUC-JP
> encoding or in HTML "&#xxxx;" expression where xxxx is UTF-8 codepoint.
> When the word is written in "&#xxxx;" expression, the search succeeds
> while the word is written in normal EUC-JP encoding, the search fails.

s/EUC-JP/ISO-2022-JP/

Note that the Japanese WML sources are written in either EUC-JP or
ISO-2022-JP. However, the Japanese HTML pages on the Debian web site are
all written in ISO-2022-JP.

---
Tomohiro KUBOTA <[EMAIL PROTECTED]>
http://www.debian.or.jp/~kubota/
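
P.S. To make the two points above concrete, here is a minimal Python
sketch. It only illustrates how the same Japanese word looks in
ISO-2022-JP, in EUC-JP, and as HTML "&#xxxx;" references (where xxxx is
the decimal Unicode code point), and how splitting on spaces behaves for
Korean versus Japanese text. The sample strings and the helper
split_on_spaces are my own assumptions for illustration; this says
nothing about how the search engine behind search.debian.org actually
works.

```python
import html

# Sample strings chosen for illustration; they do not come from the thread.
japanese_word = "日本語"
korean_phrase = "데비안 검색 사이트"       # words separated by spaces
japanese_phrase = "デビアンの検索サイト"    # no spaces between words

# The same word in the three forms discussed in this message.
as_iso2022_jp = japanese_word.encode("iso2022_jp")  # 7-bit bytes inside escape sequences
as_euc_jp = japanese_word.encode("euc_jp")          # two bytes per character, high bit set
as_char_refs = "".join(f"&#{ord(ch)};" for ch in japanese_word)  # plain ASCII text


def split_on_spaces(text: str) -> list[str]:
    """Hypothetical naive tokenizer: split on whitespace only."""
    return text.split()


if __name__ == "__main__":
    print(as_iso2022_jp)
    print(as_euc_jp)
    print(as_char_refs)
    # The character references decode back to the original word.
    print(html.unescape(as_char_refs) == japanese_word)
    # Number of tokens a space-based splitter finds in each phrase.
    print(len(split_on_spaces(korean_phrase)))
    print(len(split_on_spaces(japanese_phrase)))
```

The three forms share no common byte sequence, so an indexer that
handles one of them correctly will not necessarily handle the others.
The space-based splitter finds separate words in the Korean phrase but
returns the whole Japanese phrase as a single token.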