It's not a bug, it's a feature ;).  Seriously, if you open a bug report for 
this, it will be closed quickly as either INVALID or as a DUPLICATE of a bug 
that was closed as INVALID.

The HTTP spec specifies that header information is encoded in ISO-8859-1 
(iso-latin-1), so that is what Tomcat uses by default when decoding the 
query string.  By default, request.setCharacterEncoding() only applies to 
the request body.  If you want the body encoding applied to the query 
string as well, set useBodyEncodingForURI="true" on the <Connector ... /> 
element in server.xml.
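
To see why the parameter comes out mangled, here is a minimal sketch of the 
charset mismatch itself.  It does not use any Tomcat code; the class and 
method names are made up for illustration.  The byte array is what 
"b%C3%A9la" becomes after percent-decoding.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of Tomcat.
public class CharsetMismatchDemo {

    // Interpret the bytes the way the default (ISO-8859-1) decoding does:
    // every byte becomes exactly one character.
    static String asLatin1(byte[] bytes) {
        return new String(bytes, StandardCharsets.ISO_8859_1);
    }

    // Interpret the same bytes as UTF-8, which is what the browser sent.
    static String asUtf8(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // "b%C3%A9la" after percent-decoding: 0xC3 0xA9 is UTF-8 for U+00E9.
        byte[] wire = {'b', (byte) 0xC3, (byte) 0xA9, 'l', 'a'};

        // Lengths are printed instead of the strings so the result does not
        // depend on the console encoding.
        System.out.println(asLatin1(wire).length()); // 5: b, U+00C3, U+00A9, l, a
        System.out.println(asUtf8(wire).length());   // 4: b, U+00E9, l, a
    }
}
```

The two extra-looking characters in the ISO-8859-1 result are the two bytes 
of the single UTF-8 encoded character, each read as a character of its own.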

"Lajos Papp" <[EMAIL PROTECTED]> wrote in message 
news:[EMAIL PROTECTED]
> hi,
>
> i think there is a bug in the handling of utf-8 encoded request parameters 
> sent by an html form with the "get" method.
> i created a simple jsp page:
> === encTest.jsp ===
> <%@ page contentType="text/html" pageEncoding="UTF-8"%>
>
> <%
> String query = request.getQueryString();
> String queryDecoded = "-";
> if (query != null) {
>     queryDecoded = java.net.URLDecoder.decode(query,"utf-8");
> }
>
> request.setCharacterEncoding("UTF-8");
> String reqParam = request.getParameter("param");
> %>
>
> <br> query = <%= query %>
> <br> queryDecoded = <%= queryDecoded %>
> <br> reqParam = <%= reqParam %>
>
>
> <form action="encTest.jsp" method="get">
>     <input name="param" />
>     <input type="submit" value="send" />
> </form>
> === end of jsp ===
>
> When i fill out the form with some non-US characters (in this case with a
> Hungarian name), the browser URL-encodes it correctly, which i can see
> from the url:
> http://localhost:8080/struts/encTest.jsp?param=b%C3%A9la
>
> when i decode the query string by hand:
>   queryDecoded = java.net.URLDecoder.decode(query,"utf-8");
> i get the correct string, but when i call the getParameter() method on
> the request:
>   request.setCharacterEncoding("UTF-8");
>   String reqParam = request.getParameter("param");
> i get a miscoded string, as if the request.setCharacterEncoding("UTF-8")
> call weren't there.
>
> i checked the source code of tomcat 6.0.16 and found that
> Parameters.handleQueryParameters() does the real work; it is called by
> Request.parseParameters().
> the request has the correct encoding (utf-8), but the parameters object has
> 2 different properties that store encoding information: encoding and
> queryStringEncoding. in case of a "GET", useBodyEncodingForURI is
> false, and therefore only parameters.setEncoding("utf-8") is called
> but parameters.setQueryStringEncoding("utf-8") isn't.
> so when request.parseParameters() calls parameters.handleQueryParameters(),
> queryStringEncoding is still null, and of course a miscoded parameter
> is returned.
>
> Do you agree that it's a bug, or am i missing something?
> cheers,
> lajos
>
> === org.apache.catalina.connector.Request ===
>
>  protected void parseParameters() {
>
>         ...
>         String enc = getCharacterEncoding();
>
>         boolean useBodyEncodingForURI = connector.getUseBodyEncodingForURI();
>         if (enc != null) {
>             parameters.setEncoding(enc);
>             if (useBodyEncodingForURI) {
>                 parameters.setQueryStringEncoding(enc);
>             }
>         }
>         ...
>         parameters.handleQueryParameters();
>
>         ...
>         if (!getMethod().equalsIgnoreCase("POST"))
>             return;
>
>
> === org.apache.tomcat.util.http.Parameters ===
> public void handleQueryParameters() {
>    ...
>    handleQueryParameters(decodedQuery, queryStringEncoding);
> }
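
The excerpt above boils down to the following stand-alone sketch.  This is 
not Tomcat's code: the class and method names are hypothetical, 
java.net.URLDecoder stands in for Tomcat's own decoder, and the fallback to 
ISO-8859-1 when no query string encoding is set is an assumption based on 
the default described at the top of this mail.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

// Hypothetical model of the encoding selection, not the real Tomcat classes.
public class QueryEncodingSketch {

    // Assumed fallback when no query string encoding has been set.
    static final String DEFAULT_ENCODING = "ISO-8859-1";

    // Mirrors the if-block in parseParameters(): the request encoding is only
    // handed to the query string when useBodyEncodingForURI is true.
    static String queryStringEncoding(String requestEncoding,
                                      boolean useBodyEncodingForURI) {
        if (requestEncoding != null && useBodyEncodingForURI) {
            return requestEncoding;
        }
        return null; // stays unset, as in the GET case described above
    }

    static String decodeParam(String rawValue, String requestEncoding,
                              boolean useBodyEncodingForURI) {
        String enc = queryStringEncoding(requestEncoding, useBodyEncodingForURI);
        try {
            return URLDecoder.decode(rawValue, enc != null ? enc : DEFAULT_ENCODING);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        // Flag off: setCharacterEncoding("UTF-8") has no effect on the query.
        System.out.println(decodeParam("b%C3%A9la", "UTF-8", false).length()); // 5
        // Flag on: the request encoding is used for the query string too.
        System.out.println(decodeParam("b%C3%A9la", "UTF-8", true).length());  // 4
    }
}
```

With the flag off the value decodes to five characters (the miscoded form); 
with it on, to the expected four.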
>
>
>
> ---------------------------------------------------------------------
> To start a new topic, e-mail: users@tomcat.apache.org
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]