-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Александър,
Александър Шопов wrote:
> My problem is that I am trying to POST non ASCII data to tomcat, but it
> gets recoded in ISO8859-1 interpretation of UTF-8 byte sequence.
[snip]
> in server.xml I have put:
> URIEncoding="UTF-8" in the conf/server.xml files.
Note that this only affects the character encoding used to interpret the
URL, including GET query parameters. It does not apply to the POST body.
> 30089 pts/3 Sl 0:03 /opt/jdk1.5.0_13/bin/java
> -Dfile.encoding=UTF-8
Good to know, but might not be enough. The browser can still send the
wrong character encoding.
> <%@ page language="java" contentType="text/html;charset=UTF-8"%>
Always good to set the charset.
> However - when I change the method to GET or simply do a GET to the
> resource with parameters - everything works fine - Cyrillic gets decoded
> just fine.
This is probably because the request's (body) encoding is either wrong
or unset:
> Content-Type: application/x-www-form-urlencoded
Note that no charset is specified here.
The code for Tomcat's HTTP connector (in 5.5.23, which is the source I
have in front of me) delegates the detection of the request's character
encoding to the ContentType.getCharsetFromContentType method, which has
this comment:
// Basically return everything after ";charset="
// If no charset specified, use the HTTP default (ASCII) character set.
In practice, the code returns null when there is no character set
(more precisely, when there is no ';' in the content type).
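
To illustrate the behavior described above, here is a minimal sketch of
pulling a charset out of a Content-Type value. It is not Tomcat's actual
code: the class name CharsetSketch and the method charsetOf are
hypothetical, and the regular expression is just one simple way to do it.
The point is only that a header without a ";charset=..." parameter
yields null.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, NOT Tomcat's implementation: returns the charset
// parameter of a Content-Type value, or null when none is present.
final class CharsetSketch {
    // Matches ";charset=value" (optionally quoted), ignoring case.
    private static final Pattern CHARSET = Pattern.compile(
            ";\\s*charset\\s*=\\s*\"?([^\";\\s]+)\"?",
            Pattern.CASE_INSENSITIVE);

    static String charsetOf(String contentType) {
        if (contentType == null) {
            return null;
        }
        Matcher m = CHARSET.matcher(contentType);
        // No ';' or no charset parameter means nothing to report.
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // The header from the original request carries no charset.
        System.out.println(charsetOf("application/x-www-form-urlencoded"));
        // A client that does declare its encoding.
        System.out.println(
                charsetOf("application/x-www-form-urlencoded; charset=UTF-8"));
    }
}
```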
So, it's time to turn to the servlet spec. Section 4.9 of the 2.4
specification states:
"
Currently, many browsers do not send a char encoding qualifier with the
Content-Type header, leaving open the determination of the character
encoding for reading HTTP requests. The default encoding of a request
the container uses to create the request reader and parse POST data must
be “ISO-8859-1” if none has been specified by the client request.
However, in order to indicate to the developer in this case the failure
of the client to send a character encoding, the container returns null
from the getCharacterEncoding method.
"
So, there's your ISO-8859-1 default.
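
The effect of that default can be reproduced outside the container. The
following is a small sketch using only the JDK's URLEncoder and
URLDecoder, not Tomcat's parameter parsing. The class name
PostDecodingDemo and its helper methods are made up for illustration. It
encodes a Cyrillic value the way a UTF-8 form submission would, then
decodes it once as ISO-8859-1 and once as UTF-8.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

// Hypothetical demo class: shows what happens when UTF-8 form data is
// decoded with the ISO-8859-1 default.
final class PostDecodingDemo {
    // Percent-encodes a value as a browser submitting a UTF-8 form would.
    static String encodeUtf8(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException(e);
        }
    }

    // Percent-decodes a form value, interpreting the bytes in 'charset'.
    static String decode(String formValue, String charset) {
        try {
            return URLDecoder.decode(formValue, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException(e);
        }
    }

    // Undoes the wrong decoding: recover the raw bytes, reread as UTF-8.
    static String repair(String garbled) {
        try {
            return new String(garbled.getBytes("ISO-8859-1"), "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        // Five Cyrillic letters, written as escapes so the source file's
        // own encoding does not matter.
        String original = "\u0428\u043e\u043f\u043e\u0432";
        String onTheWire = encodeUtf8(original);

        String wrong = decode(onTheWire, "ISO-8859-1");
        String right = decode(onTheWire, "UTF-8");

        System.out.println("wire:   " + onTheWire);
        System.out.println("wrong length: " + wrong.length());
        System.out.println("right equals original: " + right.equals(original));
        System.out.println("repair works: " + repair(wrong).equals(original));
    }
}
```

Each Cyrillic letter takes two bytes in UTF-8, so the ISO-8859-1 reading
turns five letters into ten unrelated characters. Re-reading those bytes
as UTF-8 recovers the text, but it is cleaner to tell the container the
right encoding before it parses anything.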
Most people get around this by using the CharacterEncodingFilter that is
often discussed on this list. I believe that Spring includes an
implementation, but it's pretty easy to write yourself, too: simply
check the character encoding of the request and, if it's null (or
blank), call request.setCharacterEncoding and set it to whatever makes
sense (usually utf-8).
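
A minimal sketch of such a filter, written against the Servlet 2.4 API,
could look like the following. This is not the Spring class or any
version posted to this list: the class name DefaultEncodingFilter and
the "encoding" init parameter are made up for the example, and it needs
the servlet API on the classpath to compile.

```java
import java.io.IOException;
import javax.servlet.Filter;
import javax.servlet.FilterChain;
import javax.servlet.FilterConfig;
import javax.servlet.ServletException;
import javax.servlet.ServletRequest;
import javax.servlet.ServletResponse;

// Hypothetical filter: applies a fallback request encoding when the
// client did not declare one.
public class DefaultEncodingFilter implements Filter {
    // Fallback used when the request carries no charset.
    private String encoding = "UTF-8";

    public void init(FilterConfig config) throws ServletException {
        // "encoding" is an assumed init-param name, optional in web.xml.
        String configured = config.getInitParameter("encoding");
        if (configured != null && configured.trim().length() > 0) {
            encoding = configured.trim();
        }
    }

    public void doFilter(ServletRequest request, ServletResponse response,
                         FilterChain chain)
            throws IOException, ServletException {
        // The container returns null when the client sent no charset.
        String current = request.getCharacterEncoding();
        if (current == null || current.trim().length() == 0) {
            request.setCharacterEncoding(encoding);
        }
        chain.doFilter(request, response);
    }

    public void destroy() {
        // Nothing to release.
    }
}
```

The filter has to be declared and mapped in web.xml, and it must run
before anything reads a parameter or opens the request reader, because
setCharacterEncoding has no effect after that point.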
Hope that helps,
- -chris
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.7 (MingW32)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org
iD8DBQFHVCcK9CaO5/Lv0PARAqflAJ0UFVTneKOmAZrCvI+yn04Cig5wmwCgmh8e
9Z5NZEerqj+UZSlrZp8xMFA=
=VpBq
-----END PGP SIGNATURE-----