Re: [OT] Basic int/char conversion question

André Warnier Thu, 01 Jan 2009 16:44:52 -0800

To Konstantin and all the others who have responded,

Many thanks for all the tips, especially since this was quite a bit off-topic. I need some time to digest the tips though, and choose the best way according to the code that was dumped in my lap.

I must say that I find it a bit curious that Java does not have an easy out-of-the-box method to convert a byte to a char, with a character set specifier. Something like

char mychar = toChar(int,charset) (or int.toChar(charset))
Oh well, maybe Java 7..
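
For what it is worth, the kind of helper wished for above can be approximated with the standard String constructor that takes a Charset. This is only a minimal sketch: the class name ByteToChar and the method toChar are hypothetical names chosen for illustration, and it assumes a single-byte charset such as iso-8859-1 or iso-8859-2.

```java
import java.nio.charset.Charset;

class ByteToChar {
    // Hypothetical helper: decode one byte value (0-255) with a named
    // single-byte charset and return the resulting Unicode char.
    static char toChar(int b, String charsetName) {
        byte[] one = { (byte) b };
        return new String(one, Charset.forName(charsetName)).charAt(0);
    }

    public static void main(String[] args) {
        // The same byte value gives different chars depending on the charset.
        System.out.println((int) toChar(0xA1, "ISO-8859-1")); // prints 161 (U+00A1)
        System.out.println((int) toChar(0xA1, "ISO-8859-2")); // prints 260 (U+0104)
    }
}
```

Allocating a String per byte is wasteful, so for anything in a hot loop a 256-entry lookup table built once per charset would probably be the better design.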

To Konstantin in particular :

I know that I don't lose information by converting iso-8859-2 (thinking it is iso-8859-1) to Unicode one way, then re-converting this Unicode to iso-8859-2 (re-using the iso-8859-1 filter). I will get the same bytes in the end. The problem is that this is a servlet writing the result to the response object. And if I tell it to use iso-8859-1 for the response, it automatically also sets the response Content-Type to iso-8859-1.

Which in this case is wrong, because the browser then gets confused.

And as I have found out, it is quite hard to change this Content-Type header after the fact. Even a servlet filter won't do it, because by that time the response is committed. Even the front-end Apache can't do it, because it won't let you change the Content-Type header.
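
To make the lossless round trip mentioned above concrete: decoding bytes as iso-8859-1 and re-encoding them with the same charset returns the original bytes, whatever they were meant to be. The sketch below assumes nothing beyond the standard library; the class name Latin1RoundTrip is made up for the example.

```java
import java.nio.charset.Charset;
import java.util.Arrays;

class Latin1RoundTrip {
    private static final Charset LATIN1 = Charset.forName("ISO-8859-1");

    // Decode the bytes as iso-8859-1, then encode the String back with
    // the same charset. Every byte value 0-255 maps to the char with the
    // same numeric value, so nothing is lost.
    static byte[] roundTrip(byte[] original) {
        String asLatin1 = new String(original, LATIN1);
        return asLatin1.getBytes(LATIN1);
    }

    public static void main(String[] args) {
        // Bytes that are really iso-8859-2 text, treated as iso-8859-1.
        byte[] latin2Bytes = { (byte) 0xA1, (byte) 0xE8, 0x41 };
        System.out.println(Arrays.equals(latin2Bytes, roundTrip(latin2Bytes))); // prints true
    }
}
```

The bytes survive, but the intermediate String holds the wrong characters, which is why the charset announced in the Content-Type ends up wrong as well.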


So my problem is in reverse :

The servlet must set the response output encoding to iso-8859-2, in order to produce the correct Content-Type for the browser. To produce correct iso-8859-2 from the internal Unicode string, this Unicode string must have the proper Unicode chars corresponding to the iso-8859-2 characters I want to output. But the servlet reads those bytes as ints, and does a bunch of internal tests and manipulations on them, without taking into account that they could be anything other than iso-8859-1.

For the same reason, I cannot just replace the InputStream by something that would translate these bytes on-the-fly to Unicode chars, because for high iso-8859-2 bytes, it would generate internal codes that no longer fall into values 0-255, and that may create a problem somewhere deep in code I haven't yet looked at.
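
A small sketch of that concern, using a standard InputStreamReader as the on-the-fly translator (the class and method names are hypothetical): the same input byte comes back as a value above 255 once it is decoded as iso-8859-2.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;

class OnTheFlyDecode {
    // Read the first decoded char from the data, as code that reads
    // "ints" from a translating stream would see it.
    static int firstChar(byte[] data, String charsetName) {
        try (Reader r = new InputStreamReader(new ByteArrayInputStream(data), charsetName)) {
            return r.read();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] data = { (byte) 0xA1 };
        System.out.println(firstChar(data, "ISO-8859-1")); // prints 161, still within 0-255
        System.out.println(firstChar(data, "ISO-8859-2")); // prints 260, outside 0-255
    }
}
```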

I think I have to go back to examine that code, and see how often this StringBuffer is being used/manipulated. If not too often, I might replace it by a byte buffer, and do the conversion all at once each time it is being written out.
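
The byte-buffer idea might look roughly like the sketch below. It is an assumption about how the existing code could be reshaped, not a description of it: ByteBufferThenDecode, append and toText are invented names, and the real servlet may need more operations than these.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.Charset;

class ByteBufferThenDecode {
    // Raw bytes are kept as-is, so all internal tests still see 0-255.
    private final ByteArrayOutputStream buffer = new ByteArrayOutputStream();

    // Append one value read from the input; only the low 8 bits are stored.
    void append(int b) {
        buffer.write(b);
    }

    // Convert everything collected so far in one step, at output time.
    String toText(String charsetName) {
        return new String(buffer.toByteArray(), Charset.forName(charsetName));
    }

    public static void main(String[] args) {
        ByteBufferThenDecode buf = new ByteBufferThenDecode();
        buf.append(0xA1);
        buf.append(0x41);
        String text = buf.toText("ISO-8859-2");
        System.out.println((int) text.charAt(0)); // prints 260 (U+0104)
    }
}
```

The resulting String holds proper Unicode chars, so writing it through a response whose character encoding is set to iso-8859-2 should produce the original bytes together with a matching Content-Type.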


