Mark Thomas wrote:

I tend to use the following as a starting point to check my config is OK. It is also useful to compare headers etc for your application against the headers from this simple test case.

http://wiki.apache.org/tomcat/FAQ/CharacterEncoding#Q4

This is a bit outside the scope of this thread, but as someone who is confronted with this kind of character-set issue on the web all the time, I feel I have to say that the comment at the beginning of that example can be misleading and, in my view, should be taken out.

It tends to lead people into doing things they should not, and which will always come back to bite them in the end. (For the same reason, I believe that all the methods and parameters dealing with "URI encoding" should be banned.)

I could make a long case, but the summary is: don't use GET with forms if you want to have any luck with applications that may have to handle input characters other than US-ASCII (as all web applications will have to, sooner or later; think of smileys). The situation is already confusing enough with POSTed forms, without adding extra sources of problems.

The HTML 4.01 spec (and, I suspect, the XHTML one as well) puts it as follows, in the same specification, same section:

Note. The "get" method restricts form data set values to ASCII characters. Only the "post" method (with enctype="multipart/form-data") is specified to cover the entire [ISO10646] character set.

(http://www.w3.org/TR/html401/interact/forms.html#submit-format
17.13.4 Form content types)
Also see RFC3986.
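
To illustrate the kind of ambiguity I mean, here is a small sketch of my own (it is not taken from the FAQ or from the spec). It assumes a form value "café" sent in a GET query string, and uses Python's standard urllib.parse only to show the bytes involved. Nothing in the URL itself says which charset was used for the percent-encoding, so a receiver that guesses differently from the sender gets garbage:

```python
from urllib.parse import quote, unquote

# Hypothetical form value containing one non-ASCII character.
text = "café"

# The same value, percent-encoded as a browser might do it
# under two different page charsets.
as_utf8 = quote(text, encoding="utf-8")         # é becomes two bytes
as_latin1 = quote(text, encoding="iso-8859-1")  # é becomes one byte

# A receiver that assumes the other charset misreads the value.
misread = unquote(as_utf8, encoding="iso-8859-1")
# unquote() replaces undecodable bytes with U+FFFD by default.
lossy = unquote(as_latin1, encoding="utf-8")

print(as_utf8, as_latin1, misread, lossy)
```

Both encoded forms are perfectly legal URLs, which is exactly the problem: the server has to be told, or has to guess, which one it received.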

André
