I have a question about the Tomcat implementation of the Java Servlet
Spec (2.3) with regard to request character encoding. Section 4.9 of
the spec reads as follows:
SRV.4.9 Request data encoding
Currently, many browsers do not send a char encoding qualifier with the
Content-Type header, leaving open the determination of the character
encoding for reading HTTP requests. The default encoding of a request
the container uses to create the request reader and parse POST data
must be "ISO-8859-1", if none has been specified by the client request.
However, in order to indicate to the developer in this case the failure
of the client to send a character encoding, the container returns null
from the getCharacterEncoding method.
My interpretation of this is that, if the Content-Type header carries a
"charset" parameter, then
javax.servlet.ServletRequest.getCharacterEncoding() should return that
encoding. If no charset is given, the method should return null and the
container should fall back to "ISO-8859-1" when reading the request.
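To make that reading concrete, here is a rough sketch in plain Java of
the behavior I would expect. This is not Tomcat's code, and it does not
use the servlet API. The class name CharsetSketch and the two helper
methods are made up for illustration, and the header parsing is
deliberately simplistic.

```java
import java.util.Locale;

// Hypothetical illustration only; none of these names exist in Tomcat
// or in the servlet API.
class CharsetSketch {
    // Default the spec says the container must use when the client
    // sends no charset.
    static final String DEFAULT_ENCODING = "ISO-8859-1";

    // What I would expect getCharacterEncoding() to return: the charset
    // declared in the Content-Type header, or null if there is none.
    static String declaredEncoding(String contentType) {
        if (contentType == null) {
            return null;
        }
        String[] parts = contentType.split(";");
        // parts[0] is the media type; the rest are parameters.
        for (int i = 1; i < parts.length; i++) {
            String param = parts[i].trim();
            if (param.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                String value = param.substring("charset=".length()).trim();
                // Drop optional surrounding double quotes.
                if (value.length() >= 2 && value.startsWith("\"")
                        && value.endsWith("\"")) {
                    value = value.substring(1, value.length() - 1);
                }
                return value.isEmpty() ? null : value;
            }
        }
        return null;
    }

    // What I would expect the container to use internally when it
    // creates the request reader and parses POST data.
    static String effectiveEncoding(String contentType) {
        String declared = declaredEncoding(contentType);
        return declared != null ? declared : DEFAULT_ENCODING;
    }

    public static void main(String[] args) {
        System.out.println(declaredEncoding("text/html;charset=Big5")); // Big5
        System.out.println(declaredEncoding("text/html"));              // null
        System.out.println(effectiveEncoding("text/html"));             // ISO-8859-1
    }
}
```

In other words, the null return and the ISO-8859-1 default are two
separate things: one is what the developer is told, the other is what
the container does.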
However, my tests of Tomcat 4.1.18 show that, for example, I can set
charset to "Big5" (and I have verified that the Content-Type header is
"text/html;charset=Big5"), but a call to getCharacterEncoding() still
returns null.
Am I misinterpreting the spec or is this a bug in Tomcat's
implementation?
~Tom