Character encoding

Tom Anderson Wed, 26 Feb 2003 11:07:59 -0800

I have a question about Tomcat's implementation of the Java Servlet Spec (2.3) with regard to request character encoding. Section 4.9 of the spec reads as follows:

SRV.4.9 Request data encoding

Currently, many browsers do not send a char encoding qualifier with the Content-Type header, leaving open the determination of the character encoding for reading HTTP requests. The default encoding of a request the container uses to create the request reader and parse POST data must be "ISO-8859-1", if none has been specified by the client request. However, in order to indicate to the developer in this case the failure of the client to send a character encoding, the container returns null from the getCharacterEncoding method.

My interpretation of this is that, if the "charset" parameter of the Content-Type header is set, then javax.servlet.ServletRequest.getCharacterEncoding() should return that encoding. If the header carries no charset, then getCharacterEncoding() should return null and the container should decode the request as "ISO-8859-1".
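To make that reading concrete, here is a minimal sketch in plain Java of the behavior I would expect. It is only an illustration of my interpretation, not Tomcat's code: the class and method names (RequestCharsetSketch, declaredCharset, effectiveEncoding) are hypothetical, and the header parsing is deliberately simplified.

```java
// Hypothetical illustration only; not taken from Tomcat or the servlet API.
final class RequestCharsetSketch {

    // Default the spec requires when the client names no encoding.
    static final String DEFAULT_ENCODING = "ISO-8859-1";

    // What I would expect getCharacterEncoding() to return:
    // the charset parameter of Content-Type, or null if there is none.
    static String declaredCharset(String contentType) {
        if (contentType == null) {
            return null;
        }
        String[] parts = contentType.split(";");
        // parts[0] is the media type itself; parameters start at index 1.
        for (int i = 1; i < parts.length; i++) {
            String param = parts[i].trim();
            int eq = param.indexOf('=');
            if (eq < 0) {
                continue;
            }
            String name = param.substring(0, eq).trim();
            if (!name.equalsIgnoreCase("charset")) {
                continue;
            }
            String value = param.substring(eq + 1).trim();
            // Strip optional surrounding double quotes.
            if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\"")) {
                value = value.substring(1, value.length() - 1);
            }
            return value.isEmpty() ? null : value;
        }
        return null;
    }

    // Encoding the container would actually use to read the request.
    static String effectiveEncoding(String contentType) {
        String declared = declaredCharset(contentType);
        return declared != null ? declared : DEFAULT_ENCODING;
    }

    public static void main(String[] args) {
        String withCharset = "text/html;charset=Big5";
        String withoutCharset = "text/html";
        System.out.println(declaredCharset(withCharset));
        System.out.println(declaredCharset(withoutCharset));
        System.out.println(effectiveEncoding(withoutCharset));
    }
}
```

Under this reading, a request with "text/html;charset=Big5" reports Big5, while a request with no charset reports null and is still decoded as ISO-8859-1.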

However, my tests of Tomcat 4.1.18 show that, for example, I can set charset to "Big5" (and I have verified that the Content-Type header is "text/html;charset=Big5"), but a call to getCharacterEncoding() still returns null.

Am I misinterpreting the spec or is this a bug in Tomcat's implementation?

~Tom
