Christopher Schultz wrote:

Bernd,

On 6/7/2011 2:23 PM, Lentes, Bernd wrote:
Christopher Schultz wrote:
How did you do it? If you use <META HTTP-EQUIV="Content-Type"
CONTENT="text/html" />, it should override any Content-Type
sent in the HTTP response headers
Yes, we used this. But
http://de.selfhtml.org/html/kopfdaten/meta.htm#zeichenkodierung (unfortunately
only in German) says
"Im Konfliktfall, also wenn der Webserver im HTTP-Header eine hiervon abweichende Angabe
sendet, wird üblicherweise die Angabe des HTTP-Headers verwendet.", which means: in case of
conflict, i.e. if you have the META in the HTML file and the web server sends a different
Content-Type in the HTTP header, the HTTP header usually "wins".

You're right. I had it wrong: the HTTP header overrides the content of
the document.

Well, it /should/, according to the HTTP RFC.
However, many IE versions (and IE is unfortunately still the most widely used browser in corporate environments) don't give a damn about the Content-Type sent by the server if it conflicts with their own sniffing of the content.
http://lmgtfy.com?q=ie+and+content-type


Our developers are now trying to use the
response.setContentType("text/html"); method to configure the
Content-Type in the HTTP header.
This is the proper way to do things. Using <META> does not hurt.

.. at least if both are consistent.

Meanwhile, while your developers are fixing this, you may want to suggest that they also add a character set indication to the Content-Type, like:

Content-type: text/html;charset=UTF-8

using, for example: response.setContentType("text/html; charset=UTF-8");

(if, of course, the pages you send back are actually encoded using that character set/encoding).
And also add it to your <meta> tag, like so:
<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=UTF-8" />
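
One way to keep the header and the <meta> tag from drifting apart is to derive both strings from a single charset constant. The following is only a minimal sketch: the class and method names (ContentTypeSketch, contentTypeValue, metaTag) are made up for illustration and are not part of Tomcat or the servlet API. In a servlet, the header value would be what you pass to response.setContentType(...).

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; not part of Tomcat or the servlet API.
final class ContentTypeSketch {

    // Single source of truth for the page encoding (assumed to be UTF-8 here).
    static final Charset PAGE_CHARSET = StandardCharsets.UTF_8;

    // Value for the HTTP Content-Type header,
    // i.e. the string you would hand to response.setContentType(...).
    static String contentTypeValue(Charset charset) {
        return "text/html; charset=" + charset.name();
    }

    // Matching <META> tag, built from the same value so the two cannot disagree.
    static String metaTag(Charset charset) {
        return "<META HTTP-EQUIV=\"Content-Type\" CONTENT=\""
                + contentTypeValue(charset) + "\" />";
    }

    public static void main(String[] args) {
        System.out.println("Content-Type: " + contentTypeValue(PAGE_CHARSET));
        System.out.println(metaTag(PAGE_CHARSET));
    }
}
```

The point of the design is simply that the charset is written down once; whether that is practical depends on how your pages are generated.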

That will save you other problems down the line if any of these pages can also submit data back to the server, as in
<input name="STADT" value="München">

So to maximise your chances of everything working correctly in a country where not everyone speaks only English, the following elements should agree:
- the type (text/html) and charset indicated by the server in the Content-Type header
- the type and charset indicated in the <meta> tag in the page itself
- the way the page itself was created on the server (with a UTF-8-aware editor, and saved as UTF-8 without a BOM)
In addition, if the page contains a <form> tag, make sure it has the following attribute:
<form .... accept-charset="UTF-8">
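
The checklist above could be turned into a quick sanity check, for instance in a test. This is a rough sketch under assumptions: the class CharsetAgreementSketch and its methods are invented for illustration, and the regular expression only handles the simple "charset=..." form shown in this thread, not every legal Content-Type syntax.

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical checker for illustration; not an existing Tomcat facility.
final class CharsetAgreementSketch {

    // Matches "charset=UTF-8" (optionally quoted) inside a Content-Type value.
    private static final Pattern CHARSET =
            Pattern.compile("charset\\s*=\\s*\"?([^\\s;\"]+)", Pattern.CASE_INSENSITIVE);

    // Returns the upper-cased charset name, or null if none is declared.
    static String charsetOf(String contentType) {
        Matcher m = CHARSET.matcher(contentType);
        return m.find() ? m.group(1).toUpperCase(Locale.ROOT) : null;
    }

    // True if the file content starts with the UTF-8 byte order mark (EF BB BF).
    static boolean hasUtf8Bom(byte[] bytes) {
        return bytes.length >= 3
                && (bytes[0] & 0xFF) == 0xEF
                && (bytes[1] & 0xFF) == 0xBB
                && (bytes[2] & 0xFF) == 0xBF;
    }

    // True only if header, <meta> content and accept-charset name the same charset.
    static boolean allAgree(String headerValue, String metaContent, String acceptCharset) {
        String header = charsetOf(headerValue);
        String meta = charsetOf(metaContent);
        return header != null
                && header.equals(meta)
                && header.equalsIgnoreCase(acceptCharset);
    }

    public static void main(String[] args) {
        System.out.println(allAgree("text/html;charset=UTF-8",
                "text/html; charset=UTF-8", "UTF-8"));
        System.out.println(allAgree("text/html",
                "text/html; charset=UTF-8", "UTF-8"));
    }
}
```

Note that this only compares the declarations with each other; whether the bytes of the page really are UTF-8 still has to be ensured by the editor or build process.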

The reason for all the above is that HTTP and HTML, for historical reasons, tend to default to ISO-8859-1 as a character set, while everything to do with Java (like Tomcat) works with Unicode strings internally. If you are not precise and consistent about which encoding is used at each boundary, you always run the risk of mixing them up, which for languages like German leads to data corruption problems that are very difficult to debug, the least of which is lozenges with "?" in them in your pages instead of umlauts.
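
To make the mix-up concrete, here is a small sketch of what happens to "München" when one side writes bytes in one encoding and the other side reads them in the other. The class and method names are invented for illustration; only standard JDK calls (String.getBytes and the String(byte[], Charset) constructor) are used. The umlaut is written as a \u escape so the example does not depend on the encoding of the source file itself.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class for illustration only.
final class MojibakeSketch {

    // Sender encodes as UTF-8, receiver wrongly decodes as ISO-8859-1.
    static String utf8ReadAsLatin1(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    // Sender encodes as ISO-8859-1, receiver wrongly decodes as UTF-8.
    static String latin1ReadAsUtf8(String text) {
        byte[] latin1 = text.getBytes(StandardCharsets.ISO_8859_1);
        return new String(latin1, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String city = "M\u00FCnchen"; // "München"

        // The two UTF-8 bytes of the umlaut (C3 BC) become two separate characters.
        System.out.println(utf8ReadAsLatin1(city)); // MÃ¼nchen

        // The single byte FC is not valid UTF-8, so the decoder substitutes
        // U+FFFD, the replacement character browsers draw as a "?" in a lozenge.
        System.out.println(latin1ReadAsUtf8(city).contains("\uFFFD")); // true
    }
}
```

What the console actually shows for the first line depends on the terminal's own encoding, which is yet another place where the same kind of disagreement can occur.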


