David,

On 4/18/14, 2:57 PM, David Wall wrote:
> 
> On 4/17/2014 7:50 AM, Christopher Schultz wrote:
>> I'll take a look at the code to see if maybe we can conditionally
>> log something somewhere when we get a 400 error. You can probably
>> get information about it by enabling DEBUG logging on the
>> component that throws the 400 error, but you'll likely get a huge
>> amount of output in that log file, which you obviously don't
>> want.
> 
> Our first remedial action was to add URIEncoding="UTF-8" to our
> HTTP and HTTPS connectors defined in server.xml, and to add a
> character encoding filter to our webapp's web.xml:
> 
> <filter>
>   <filter-name>SetCharacterEncodingFilter</filter-name>
>   <filter-class>org.apache.catalina.filters.SetCharacterEncodingFilter</filter-class>
>   <init-param>
>     <param-name>encoding</param-name>
>     <param-value>UTF-8</param-value>
>   </init-param>
> </filter>
>
> <filter-mapping>
>   <filter-name>SetCharacterEncodingFilter</filter-name>
>   <url-pattern>*</url-pattern>
> </filter-mapping>
> 
> We've not seen any issues since.  Could this have played a role
> in resolving this, or is it just coincidence for now?

I've never seen a request generate a 400 error due to a badly-encoded
body... you just end up with broken characters. So the
SetCharacterEncodingFilter probably didn't change anything.

But changing to UTF-8 URI encoding could have an effect: if your pages
are declared to be in UTF-8, then more and more clients these days are
automatically using UTF-8 to encode characters before putting them
into the URL. If you were expecting ISO-8859-1 (the default) and the
client was sending UTF-8, you could easily have problems like this.
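
To make the mismatch concrete, here is a rough sketch (not Tomcat's
actual decoding code). It uses java.net.URLEncoder/URLDecoder as a
stand-in for what the client and the connector do; the class and method
names are made up for illustration.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

// Hypothetical demo class: shows what happens when a client percent-encodes
// with UTF-8 but the server decodes with a different charset.
final class UriEncodingMismatch {

    // Percent-encode the way a UTF-8-aware client would.
    static String encodeAsUtf8Client(String text) {
        try {
            return URLEncoder.encode(text, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    // Decode the percent-encoded text using whatever charset the server assumes.
    static String decodeAs(String encoded, String charsetName) {
        try {
            return URLDecoder.decode(encoded, charsetName);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charsetName, e);
        }
    }

    public static void main(String[] args) {
        String original = "caf\u00e9";                  // "cafe" with an acute accent on the e
        String onTheWire = encodeAsUtf8Client(original);

        System.out.println(onTheWire);                  // caf%C3%A9
        // Server and client agree on UTF-8: the value survives.
        System.out.println(decodeAs(onTheWire, "UTF-8").equals(original));      // true
        // Server assumes ISO-8859-1: each UTF-8 byte becomes its own character.
        System.out.println(decodeAs(onTheWire, "ISO-8859-1").equals(original)); // false
    }
}
```

With ISO-8859-1 the two bytes C3 A9 come back as two separate
characters instead of one, so whatever the application does with that
parameter next is working with the wrong string.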

I'll repeat what I said earlier: having non-US-ASCII stuff appear in
your URL is still pretty risky these days. Try to find places where
you do that and switch to HTTP POST if you can.

> Do you know if using the above setup we can remove our own "JSP
> page bean" init code shown below that sets the character encoding
> like we have now?  Seems like our code below was trying to
> accomplish what I believe the SetCharacterEncodingFilter is now
> doing for every request (not just those that reach our JSPs).
> 
> try {
>     if ( request.getCharacterEncoding() == null )
>         request.setCharacterEncoding(CHARSET_UTF_8);
>     response.setCharacterEncoding(CHARSET_UTF_8);
> } catch( UnsupportedEncodingException e ) {
>     app.warning("PageBean.init() - Failed to set request/response to UTF-8 encoding.");
> }

Yes. If you use the SetCharacterEncodingFilter (which I recommend you
do), then you will no longer need this hack in your JSPs.

> Our JSPs already specify charset=UTF-8 in the content type and/or
> HTML meta tag for Content-Type.

Right: that will set the /response/ encoding. I'm thinking that your
clients were probably using UTF-8-encoded URLs and that was tripping
you up. URIEncoding="UTF-8" probably did the trick.

> We're keeping our fingers crossed!

I think you'll be okay from here on out. Let us know.

>>> It could also be possible that a browser is incorrectly
>>> formatting something. Do you make extensive use of
>>> cookies? Do you ever store anything in a cookie name or value
>>> that isn't in US-ASCII? If so, you might have some edge cases
>>> where the overwhelming majority of your users are fine but some
>>> folks with Greek names or whatever step over into non-US-ASCII
>>> and hit some edge cases with either the browser or Tomcat
>>> itself.
> 
> Aside from the session cookie for JSESSIONID done by Tomcat, we 
> generally only have one cookie that is optional for those who want
> to save their login email address.  For these, we are using 
> java.net.URLDecoder.decode/java.net.URLEncoder.encode of the email 
> address string specifying the UTF-8 charset param to the
> encoder/decoder.

Perfect. You could also base64 encode it if you wanted to. Cookie
values can get ugly, but since you are using URLEncoder (and
specifying an encoding!), everything that actually goes into the
cookie value is US-ASCII so you should be all set.
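
For illustration, a minimal sketch of both options. The class and
method names are hypothetical, and the Base64 variant assumes Java 8 or
later (java.util.Base64). I picked the URL-safe alphabet without
padding so that '+', '/' and '=' never end up in the cookie value;
that is my choice here, not something your setup requires.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper: two ways to turn an arbitrary string (such as an
// email address) into a US-ASCII-only cookie value and back again.
final class CookieValueCodec {

    // Percent-encode with an explicit charset, as described above.
    static String urlEncode(String raw) {
        try {
            return URLEncoder.encode(raw, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    static String urlDecode(String cookieValue) {
        try {
            return URLDecoder.decode(cookieValue, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    // Alternative: Base64 over the UTF-8 bytes (URL-safe alphabet, no padding).
    static String base64Encode(String raw) {
        return Base64.getUrlEncoder().withoutPadding()
                .encodeToString(raw.getBytes(StandardCharsets.UTF_8));
    }

    static String base64Decode(String cookieValue) {
        return new String(Base64.getUrlDecoder().decode(cookieValue), StandardCharsets.UTF_8);
    }

    // True when every character is in the 7-bit US-ASCII range.
    static boolean isUsAscii(String s) {
        for (int i = 0; i < s.length(); i++) {
            if (s.charAt(i) > 0x7F) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(urlEncode("user@example.com")); // user%40example.com

        // A Greek local part, written with escapes to keep this source ASCII-only.
        String greek = "\u03b3\u03b9\u03ac\u03bd\u03bd\u03b7\u03c2@example.com";
        System.out.println(isUsAscii(urlEncode(greek)));              // true
        System.out.println(isUsAscii(base64Encode(greek)));           // true
        System.out.println(urlDecode(urlEncode(greek)).equals(greek)); // true
    }
}
```

Either way, only US-ASCII characters reach the Cookie header, and the
original address comes back intact as long as you decode with the same
charset you encoded with.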

> Does that sound like it could be an issue for these edge cases you
> are talking about since my impression is that such encoding would
> ensure US-ASCII was the cookie value?

Nope, you are doing the right thing for your cookies.

-chris
