On 5 Jan 05, at 13:30, J. Patterson Waltz III wrote:

in article [EMAIL PROTECTED], Josh Cronemeyer at [EMAIL PROTECTED] wrote
on 4/01/05 18:02:


J. Patterson Waltz III wrote:

Thanks, Guillaume,

I had actually seen the references to the Filter solution in the comments of
Struts bug 16191 in Bugzilla:
http://issues.apache.org/bugzilla/show_bug.cgi?id=16191


I will try that out and see if it improves my results.

I remain perplexed at what changes between versions 1.1 and 1.2.4 of Struts
caused it to become susceptible to this problem. Any ideas on that?


Patterson

P.S. - I know how to view the headers of replies sent from the server to the
browser, but am not sure how to get at those sent from the browser to the
server, to make sure that they are indeed UTF-8. Any suggestions?


in article [EMAIL PROTECTED], Guillaume Cottenceau at [EMAIL PROTECTED]
wrote on 4/01/05 16:40:


Most probably the browser is sending data in UTF-8 but doesn't say so with
charset= in the Content-Type header. If you're confident enough that the
browsers will send UTF-8 (which should be the case if the pages are encoded
in UTF-8 and you use accept-charset in the forms), you can use a filter which
forces the HTTP request to be read as UTF-8 on input (for example
filters/SetCharacterEncodingFilter, which is bundled with Tomcat)[1].
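
The decision such a filter makes can be sketched in plain Java. This is only
an illustration, not the filter bundled with Tomcat: the class and method
names (RequestCharset, declaredCharset, effectiveCharset) are hypothetical,
and a real servlet filter would instead check
request.getCharacterEncoding() and call request.setCharacterEncoding("UTF-8")
before any parameter is read.

```java
import java.util.Locale;

// Hypothetical helper: mimics the "use the declared charset, else force a
// fallback" decision on a raw Content-Type header value.
public class RequestCharset {

    /** Returns the charset named in a Content-Type value, or null if none is declared. */
    static String declaredCharset(String contentType) {
        if (contentType == null) {
            return null;
        }
        String[] parts = contentType.split(";");
        // parts[0] is the media type; the parameters follow it.
        for (int i = 1; i < parts.length; i++) {
            String param = parts[i].trim();
            if (param.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                String value = param.substring("charset=".length()).trim();
                // Strip optional surrounding double quotes.
                if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\"")) {
                    value = value.substring(1, value.length() - 1);
                }
                return value.isEmpty() ? null : value;
            }
        }
        return null;
    }

    /** Uses the declared charset when present, otherwise the supplied fallback. */
    static String effectiveCharset(String contentType, String fallback) {
        String declared = declaredCharset(contentType);
        return declared != null ? declared : fallback;
    }

    public static void main(String[] args) {
        // Typical browser form post: no charset parameter, so the fallback applies.
        System.out.println(effectiveCharset("application/x-www-form-urlencoded", "UTF-8"));
        // An explicit charset is left alone.
        System.out.println(effectiveCharset(
                "application/x-www-form-urlencoded; charset=ISO-8859-1", "UTF-8"));
    }
}
```

The fallback only applies when the browser declares nothing, which is the
situation described above.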




One way would be to set up a proxy that your browser uses to connect to the
web. I used to use WebScarab: http://www.owasp.org/software/webscarab.html


-josh

I still haven't figured out the solution to my problem, but I have figured
out one *cause* of it.


After using WebScarab at Josh's suggestion to eavesdrop on the
conversations between my browser and app server, I've figured out what has
changed between Struts 1.1 and 1.2.4:


Struts 1.1: in spite of the <%@ page pageEncoding="UTF-8"
contentType="text/html;charset=UTF-8" language="java" %> directive, Struts
was in fact sending back pages encoded in ISO-8859-1, and the resulting form
submissions were URL-encoded (and decoded) in the same encoding.


Struts 1.2.4: the directive is now being followed, and pages are sent out in
UTF-8 encoding. However, for some reason, the form data is not being decoded
as UTF-8, but still as ISO-8859-1.
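
The mismatch described for 1.2.4 can be reproduced outside Struts with the
standard java.net classes. This is a minimal sketch, assuming the browser
URL-encodes the form value as UTF-8 and the server decodes it as ISO-8859-1;
the class and method names are made up for the illustration.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

// Hypothetical demo class: URL-encodes a value in one charset and decodes it in another.
public class FormDecodingMismatch {

    /** Encodes value as a browser would, then decodes it as the server would. */
    static String roundTrip(String value, String encodeAs, String decodeAs) {
        try {
            String onTheWire = URLEncoder.encode(value, encodeAs);
            return URLDecoder.decode(onTheWire, decodeAs);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset", e);
        }
    }

    public static void main(String[] args) {
        String original = "caf\u00e9"; // "café", written with an escape to avoid source-encoding issues

        // Same charset on both sides: the value survives.
        System.out.println(original.equals(roundTrip(original, "UTF-8", "UTF-8")));

        // UTF-8 on the wire, ISO-8859-1 on the server: each byte of the
        // two-byte UTF-8 sequence for U+00E9 becomes its own character.
        String garbled = roundTrip(original, "UTF-8", "ISO-8859-1");
        System.out.println(garbled.length()); // 5 characters instead of 4
    }
}
```

Here the single accented character arrives as two characters (U+00C3 followed
by U+00A9), which is the kind of corruption that appears when only one side
of the exchange has switched to UTF-8.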


This suggests to me that if I were to remove the contentType directive in
Struts 1.2.4, it would fall back to the default ISO-8859-1 encoding, and all
would work as before: except that my web application would then be
constrained to using only characters representable in that encoding.



More results of my testing: I was indeed able to restore the proper encoding of characters submitted to my web application by removing both the @ page contentType directive *and* the corresponding <controller contentType="text/html; charset=UTF-8" /> element from the struts-config.xml file. However, as predicted, this only enabled proper encoding of characters within the ISO-8859-1 range: characters from double-byte languages such as Japanese were munged into a series of question marks.
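
The question marks are what Java produces when a string is converted to an encoding that cannot represent its characters. The sketch below shows that effect with the standard String API only. It is not taken from the application in question, and the class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: pushes a string through ISO-8859-1 and back.
public class Latin1Limit {

    /** Encodes s as ISO-8859-1 bytes and decodes those bytes again. */
    static String throughLatin1(String s) {
        // getBytes(Charset) substitutes the charset's replacement byte,
        // '?' for ISO-8859-1, for every character it cannot map.
        byte[] bytes = s.getBytes(StandardCharsets.ISO_8859_1);
        return new String(bytes, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // "café" fits in ISO-8859-1 and survives unchanged.
        System.out.println("caf\u00e9".equals(throughLatin1("caf\u00e9"))); // true

        // Three Japanese characters (U+65E5 U+672C U+8A9E) do not fit.
        System.out.println(throughLatin1("\u65e5\u672c\u8a9e")); // ???
    }
}
```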


I also determined that adding neither acceptCharset="UTF-8" nor enctype="application/x-www-form-urlencoded;charset=UTF-8" to the <html:form> tags in my JSPs seemed to make any difference to the encoding problems. The resulting Content-Type header sent by the browser never includes the character set information. This is not particularly surprising, as the HTML 4.0 spec (see http://www.w3.org/TR/REC-html40/interact/forms.html#submit-format) specifies:

"The "get" method restricts form data set values to ASCII characters. Only the "post" method (with enctype="multipart/form-data") is specified to cover the entire [ISO10646] character set."

Adding enctype="multipart/form-data;charset=UTF-8" worked even less well, however, as Struts did not appear able to interpret form data submitted in this format: it displayed validation errors saying that required form fields were missing (although they had been submitted with complete information and were visible in the request sent by the browser).


Now, I guess I'll just have to try using the character encoding filter Guillaume recommended.

