I have a question about Tomcat's implementation of the Java Servlet Specification (2.3) with regard to request character encoding. Section 4.9 of the spec reads as follows:

SRV.4.9 Request data encoding
Currently, many browsers do not send a char encoding qualifier with the Content-
Type header, leaving open the determination of the character encoding for reading
HTTP requests. The default encoding of a request the container uses to create the
request reader and parse POST data must be "ISO-8859-1", if none has been
specified by the client request. However, in order to indicate to the developer in this
case the failure of the client to send a character encoding, the container returns null
from the getCharacterEncoding method.


My interpretation of this is that, if the Content-Type header of the request carries a "charset" parameter, then javax.servlet.ServletRequest.getCharacterEncoding() should return that encoding. If the header carries no charset, the method should return null and the container should fall back to "ISO-8859-1" when reading the request.
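
To make that reading concrete, here is a minimal sketch in plain Java of the behavior I would expect. It is not Tomcat's code and does not use the servlet API; the class name RequestEncodingSketch and the methods declaredEncoding and effectiveCharset are hypothetical names I made up for illustration.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Locale;

// Hypothetical illustration of my reading of SRV.4.9; not container code.
final class RequestEncodingSketch {

    // What I expect getCharacterEncoding() to report: the charset parameter
    // of the Content-Type header, or null if the client sent none.
    public static String declaredEncoding(String contentType) {
        if (contentType == null) {
            return null;
        }
        String[] parts = contentType.split(";");
        // parts[0] is the media type; any parameters follow it.
        for (int i = 1; i < parts.length; i++) {
            String param = parts[i].trim();
            if (param.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                String value = param.substring("charset=".length()).trim();
                // Strip optional surrounding double quotes.
                if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\"")) {
                    value = value.substring(1, value.length() - 1);
                }
                return value.isEmpty() ? null : value;
            }
        }
        return null;
    }

    // What I expect the container to use when it creates the request reader
    // or parses POST data: the declared charset, else ISO-8859-1.
    public static Charset effectiveCharset(String contentType) {
        String declared = declaredEncoding(contentType);
        return declared == null ? StandardCharsets.ISO_8859_1 : Charset.forName(declared);
    }

    public static void main(String[] args) {
        System.out.println(declaredEncoding("text/html;charset=Big5"));
        System.out.println(declaredEncoding("application/x-www-form-urlencoded"));
        System.out.println(effectiveCharset("application/x-www-form-urlencoded"));
    }
}
```

The point of keeping the two methods separate is that the spec seems to distinguish what the container reports (null when nothing was declared) from what it actually uses to decode (ISO-8859-1 in that case).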

However, my tests with Tomcat 4.1.18 show otherwise. For example, I can set the charset to "Big5" (and I have verified that the Content-Type header is "text/html;charset=Big5"), but a call to getCharacterEncoding() still returns null.

Am I misinterpreting the spec or is this a bug in Tomcat's implementation?

~Tom
