Here's how I think you should handle this.
// call before getting anything from the request
request.setCharacterEncoding("UTF-8");
Then all parameter strings will be decoded as UTF-8, which has the same effect as the work-around below.
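To show what the choice of request encoding changes, here is a small standalone sketch. It does not use the servlet API at all: the class FormDecodingDemo and its decode helper are names I made up for illustration, and it only imitates the decoding step with java.net.URLDecoder, on the assumption that the browser submitted é from a UTF-8 page as %C3%A9.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

// Hypothetical demo class, not part of Tomcat or the servlet API.
class FormDecodingDemo {

    // Decodes one URL-encoded form value using the named charset.
    static String decode(String rawValue, String charsetName) {
        try {
            return URLDecoder.decode(rawValue, charsetName);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charsetName, e);
        }
    }

    public static void main(String[] args) {
        // U+00E9 as submitted by a form on a UTF-8 page: two percent-encoded bytes.
        String raw = "%C3%A9";

        // Decoded with the default: two characters, U+00C3 followed by U+00A9.
        System.out.println(decode(raw, "ISO-8859-1"));

        // Decoded as UTF-8: the single character U+00E9.
        System.out.println(decode(raw, "UTF-8"));
    }
}
```

The same bytes come out as two characters or one depending only on the charset name, which is why the setCharacterEncoding call has to happen before the first parameter is read.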
FYI, from the Servlet 2.3 specification:
SRV.4.9 Request data encoding
Currently, many browsers do not send a char encoding qualifier with the Content-
Type header, leaving open the determination of the character encoding for reading
HTTP requests. The default encoding of a request the container uses to create the
request reader and parse POST data must be "ISO-8859-1", if none has been
specified by the client request. However, in order to indicate to the developer in this
case the failure of the client to send a character encoding, the container returns null
from the getCharacterEncoding method.
If the client hasn't set character encoding and the request data is encoded with
a different encoding than the default as described above, breakage can occur. To
remedy this situation, a new method setCharacterEncoding(String enc) has
been added to the ServletRequest interface. Developers can override the
character encoding supplied by the container by calling this method. It must be
called prior to parsing any post data or reading any input from the request. Calling
this method once data has been read will not affect the encoding.
What I don't understand is why Tomcat isn't detecting and using the Content-Type header when it IS specified. I have done tests to show that Content-Type is being set but I still am forced to do the request.setCharacterEncoding() to get things out correctly.
~Tom
On Tuesday, March 4, 2003, at 03:03 AM, Oxley, David wrote:
Cheers,
Worked a treat. Should this code be done automatically by Tomcat?
Dave.
-----Original Message-----
From: Kent Degrano [mailto:[EMAIL PROTECTED]
Sent: 04 March 2003 09:59
To: Tomcat Developers List
Subject: RE: UTF-8 characters
try this code, Dave.
String str = new String(request.getParameter("key").getBytes("ISO-8859-1"), "UTF-8");
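For anyone wondering why that one-liner works, here is a minimal standalone sketch of the same round trip. RecodeDemo and latin1ToUtf8 are hypothetical names used only for illustration, and the sketch assumes the container really did decode the UTF-8 bytes as ISO-8859-1. Because ISO-8859-1 maps every byte value to exactly one character, re-encoding the string gives back the original bytes, which can then be decoded properly.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of Tomcat or the servlet API.
class RecodeDemo {

    // Undoes a wrong ISO-8859-1 decode: recover the original bytes,
    // then decode those bytes as UTF-8.
    static String latin1ToUtf8(String misdecoded) {
        byte[] originalBytes = misdecoded.getBytes(StandardCharsets.ISO_8859_1);
        return new String(originalBytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // What getParameter returns when the UTF-8 bytes C3 A9 were read
        // as ISO-8859-1: the two characters U+00C3 and U+00A9.
        String misdecoded = "\u00c3\u00a9";

        String repaired = latin1ToUtf8(misdecoded);
        System.out.println(repaired.length());              // 1
        System.out.println(repaired.equals("\u00e9"));      // true
    }
}
```

Note that this should only be applied to strings that were actually mis-decoded. If the request encoding has already been set to UTF-8, running the value through the conversion again would damage any non-ASCII characters.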
-----Original Message-----
From: Oxley, David [mailto:[EMAIL PROTECTED]
Sent: Tuesday, March 04, 2003 5:30 PM
To: '[EMAIL PROTECTED]'
Subject: UTF-8 characters
When a response to the browser is in UTF-8 encoding, form responses from said
page are giving UTF-8 characters when req.getParameter is called, i.e.
instead of getting é, I get the UTF-8 encoding Ã© returned to our servlet.
Surely the getParameter method should return é as it does with other
encodings. Is this a bug?
Dave.
______________________________________________________________________ __
This e-mail has been scanned for all viruses by Star Internet. The
service is powered by MessageLabs. For more information on a proactive
anti-virus service working around the clock, around the globe, visit:
http://www.star.net.uk
______________________________________________________________________ __
--- Incoming mail is certified Virus Free. Checked by AVG anti-virus system (http://www.grisoft.com). Version: 6.0.456 / Virus Database: 256 - Release Date: 2/18/2003