Hi,

At Thu, 5 Jul 2001 17:36:39 +0100,
David Starner <[EMAIL PROTECTED]> wrote:

> Doesn't ISO-2022-JP have a form that invokes JIS X 0208 into the upper half?
> Could SJIS be used instead?

No.

Some additional explanation of the real state of Japanese encodings:

There are three popular encodings for Japanese web pages: ISO-2022-JP,
Shift_JIS, and EUC-JP. ISO-2022-JP is a 7bit stateful encoding (i.e., it
has a state which is changed by escape sequences), while Shift_JIS and
EUC-JP are 8bit stateless encodings.

Web browsers sometimes have to guess the encoding of the web pages they
display. Since Shift_JIS and EUC-JP share many codepoints, browsers are
sometimes confused. ISO-2022-JP, on the other hand, is self-identifying
(its escape sequences announce the character set in use), so browsers
cannot be confused by it.

Note that newer web browsers which understand
<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=foobar">
will NOT be confused by any encoding. Thus, migrating to EUC-JP may be a
solution. Shift_JIS could also work. (That it lacks JIS X 0212 is only a
minor problem. However, 0x40-0x7e can appear as the second byte of a
doublebyte character. 0x22 (double quote) is therefore safe, but the
range includes 0x5c, i.e., backslash.) UTF-8 is not popular yet and some
browsers may fail to display it, though I think the situation will
change in five or ten years.

On the other hand, better wml handling may be another (and better)
solution. Though I don't know the wml parser well, I think it is possible,
because the title of each page listed in the sitemap has no problem.
(I.e., the "ports/" item in the sitemap page is broken while the title of
the "ports/" page itself is good.) Thus, I expect that someone familiar
with wml programming can find this solution.

---
Tomohiro KUBOTA <[EMAIL PROTECTED]>
http://www.debian.or.jp/~kubota/
"Introduction to I18N"
http://www.debian.org/doc/manuals/intro-i18n/
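
Added illustration: the byte-level points in the message can be checked
with a short Python sketch. It is not part of the original message. It
assumes Python 3's built-in shift_jis, euc_jp, and iso2022_jp codecs; the
sample character (the kanji U+8868) and all variable names are chosen
purely for illustration.

```python
# Illustrative sketch only: the sample character and the variable names
# are assumptions, not taken from the original message.
sample = "\u8868"  # the kanji "hyou" (table/surface), U+8868

sjis = sample.encode("shift_jis")
eucjp = sample.encode("euc_jp")
jis = sample.encode("iso2022_jp")

# Shift_JIS: the second byte lies in 0x40-0x7e and here is 0x5c (backslash).
sjis_has_backslash = sjis[1] == 0x5C

# EUC-JP: both bytes of a JIS X 0208 character have the high bit set.
eucjp_is_8bit = all(b >= 0x80 for b in eucjp)

# ISO-2022-JP: 7bit only, opened by an escape sequence selecting JIS X 0208.
jis_is_7bit = all(b < 0x80 for b in jis)
jis_starts_with_escape = jis.startswith(b"\x1b$B")

# The same EUC-JP bytes are also valid Shift_JIS (two halfwidth katakana),
# which is how a browser that guesses the encoding can be misled.
misread = eucjp.decode("shift_jis")

print(sjis.hex(), eucjp.hex(), jis)
print(repr(misread))
```

The last step is the interesting one: no decoding error is raised, so a
guesser that only checks byte-sequence validity cannot tell the two 8bit
encodings apart for input like this.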