On Mon, 19 Jul 2004, Henry Nelson wrote:

> On Sat, Jul 17, 2004 at 01:28:33AM +0900, [EMAIL PROTECTED] wrote:
> > On Sun, 11 Jul 2004, Henry Nelson wrote:
> > > > http://www.feyrer.de/JP/ -> * [4]English <-> Japanese Dictionary...
> > >
> > > If you're a friend of Hubert's, ask him to remove the extra charset meta
> > > at the top of his page:
> > >
> > > <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=iso-8859-1">
> > > <html>
> > > <head>
> > > <meta http-equiv="Content-Type" content="text/html; charset=euc-jp">
> >
> > More precisely, ask him to remove the charset in HTTP header.
> > First META line is in the HTTP header.
> >
> > Henry, please add this line to your lynx.cfg, then you should never
> > see the extra charset meta in the downloaded file.
> >
> > PREPEND_[CHARSET]_TO_SOURCE:FALSE
>
> So it is Lynx that is prefixing the "extra" charset=iso-8859-1 META at
> the top of the page. Thanks for correcting me on that point. Also,
> apologies to Thorsten for my having added to the confusion.
>
> BUT, now I'm more curious than ever. Am I right to continue to assume
> this is a case of misconfiguration of the server?

Yes, I think so. The charset in the HTTP header has top priority in
determining the document's charset. The page is written in euc-jp, but
the HTTP header indicates that the charset is iso-8859-1.

ref: http://www.w3.org/TR/html401/charset.html#h-5.2.2
| To sum up, conforming user agents must observe the following
| priorities when determining a document's character encoding (from
| highest priority to lowest):
|  1. An HTTP "charset" parameter in a "Content-Type" field.
|  2. A META declaration with "http-equiv" set to "Content-Type" and a
|     value set for "charset".
|  3. The charset attribute set on an element that designates an
|     external resource.

> To have Lynx render
> the page "http://www.feyrer.de/JP/" correctly (at least on my system)
> the charset meta must be the one in the header, "charset=euc-jp", not
> the one Lynx prefixes, "charset=iso-8859-1". After downloading the page
> with Lynx, either deleting the META that Lynx prefixes, or editing it to
> "euc-jp", fixes the rendering of the Japanese.
>
> Is there a bug in Lynx?

No, I don't believe so.

> Specifically, what should "Assumed document
> character set" in the "Display and Character Set" section of the O)ptions
> Menu do?

Nothing in this case.

> If I change it from "iso-8859-1" to "euc-jp" there is no change
> in the rendering of the page; it is still garbled. Shouldn't that be a
> manual override that would allow Lynx to render the page correctly?

No, not in this case. ASSUME_CHARSET has an effect only when no charset
is specified explicitly.
# Please refer to the ASSUME_CHARSET section in lynx.cfg.

> I ask because (at least my Japanized edition of) MSIE has a way to
> correct the display by manually chosing "Japanese (EUC)" under
> "Encoding(D)" in the "Display(V)" pull-down. It would be nice if Lynx
> could do that, too.

If you build Lynx with KANJI_CODE_OVERRIDE defined in userdefs.h, you
may be able to choose the document charset from AUTO/SJIS/EUC, with
higher priority than the charset in the HTTP header. I have not tested
this for some years, though.
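
To illustrate the priority order quoted above from the HTML 4.01
specification, here is a minimal Python sketch. It is not Lynx's actual
code: the function names (charset_from_content_type, charset_from_meta,
resolve_charset) are hypothetical, the META detection is a simplified
regular expression rather than a real HTML parser, and the third
priority (a charset attribute on a linking element) is left out. The
"assumed" argument only plays the role that ASSUME_CHARSET plays when
nothing explicit is given.

```python
import re

# Simplified pattern for <meta http-equiv="Content-Type" ... charset=...>.
# Assumption: http-equiv appears before content, as in the page discussed.
META_CHARSET_RE = re.compile(
    rb"""<meta[^>]+http-equiv\s*=\s*["']?content-type["']?[^>]*charset\s*=\s*["']?([\w.:-]+)""",
    re.IGNORECASE,
)


def charset_from_content_type(value):
    """Return the charset parameter of a Content-Type value, or None."""
    if not value:
        return None
    for param in value.split(";")[1:]:
        name, _, val = param.partition("=")
        if name.strip().lower() == "charset" and val.strip():
            return val.strip().strip("\"'").lower()
    return None


def charset_from_meta(html_bytes):
    """Return the charset declared in a Content-Type META, or None."""
    match = META_CHARSET_RE.search(html_bytes)
    return match.group(1).decode("ascii").lower() if match else None


def resolve_charset(http_content_type, html_bytes, assumed="iso-8859-1"):
    """Pick a charset: HTTP header first, then META, then the assumed default."""
    return (
        charset_from_content_type(http_content_type)
        or charset_from_meta(html_bytes)
        or assumed
    )


page = b'<html><head><meta http-equiv="Content-Type" content="text/html; charset=euc-jp">'

# Header and META disagree, as on the page in this thread: the header wins.
print(resolve_charset("text/html; charset=iso-8859-1", page))
# No charset in the header: the META declaration is used.
print(resolve_charset("text/html", page))
# Nothing explicit anywhere: only now does the assumed charset matter.
print(resolve_charset(None, b"<html></html>", assumed="euc-jp"))
```

Under these assumptions, changing the assumed charset makes no
difference in the first call, which matches what Henry saw when
changing "Assumed document character set" in the Options Menu.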

-- 
Takeshi Hataguchi
E-mail: [EMAIL PROTECTED]
