Vincent, Forrest,
Thanks for the patch & review.
Could you summarize and/or expand a bit :-) ?
Also, has anyone played with the various browsers? Is any browser
sending the charset encoding? In what format?
I know that some browsers encode the URL with the same charset that
is used in the page, while some use UTF (there was discussion about
that somewhere).
Is it true that browsers that use UTF (like IE on NT?) also send
the body as UTF? Do they set the Charset-Encoding header?
I would really appreciate some info (I don't use Windows, and I heard
there are differences between IE/Win9x and IE/NT).
Costin
On Sat, 19 May 2001, Vincent Schonau wrote:
> On Fri, May 18, 2001 at 12:40:04PM -0700, Forrest R. Girouard wrote:
> >
> > It is my understanding that '8859_1' is an alias for a Java encoding
> > which maps to the 'ISO-8859-1' character set. The Java encoding and
> > the character set name are not always the same.
> >
> > Furthermore, while it's not readily apparent, using 'ISO8859_1' for
> > the Java encoding is far preferable to using '8859_1' (or anything
> > else) under Java 2.
> >
> > Look at the private getBTCConverter() method in the String.java source
> > and note the use of the following:
> >
> > !encoding.equals(btc.getCharacterEncoding())
> >
> > The ByteToCharConverter instance for ISO-8859-1 always returns 'ISO8859_1'
> > from the getCharacterEncoding() method, and this means that, while other
> > names may work, the ThreadLocal caching will be subverted. Since the
> > ByteToCharConverter.getConverter() method involves synchronization, it
> > is not a good thing to subvert the ThreadLocal cache.
>
> Thanks for pointing this out. AFAICS, the use of 'iso-8859-1' instead of
> '8859_1' (my patch) does not make this situation any better or worse in the
> tomcat code. <g>
>
> The tomcat 3.x code doesn't look like it takes this into account at all. I
> wonder if looking up the Java Encoding name associated with the encoding
> name supplied by user-agents etc. is an optimisation worth making. I'll look
> into that.
>
>
>
> Vince.
>
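
For reference, the lookup Vince mentions (mapping whatever encoding name a
user-agent supplies to a single canonical name before using it) could be
sketched roughly as below. This is only an illustration under assumptions:
it uses java.nio.charset.Charset, which requires Java 1.4 or later, and the
class name EncodingNames, the method canonicalName() and the fallback
parameter are hypothetical, not anything in the tomcat code. Note also that
the canonical name it returns ('ISO-8859-1') is the java.nio one, not the
'ISO8859_1' name that ByteToCharConverter reports, so it shows the
normalisation idea rather than a fix for the ThreadLocal cache issue.

```java
import java.nio.charset.Charset;
import java.nio.charset.IllegalCharsetNameException;
import java.nio.charset.UnsupportedCharsetException;

// Hypothetical helper: normalises a client-supplied encoding name.
final class EncodingNames {
    private EncodingNames() {}

    /**
     * Returns the canonical java.nio name for the supplied encoding name
     * or alias, or the fallback if the name is null, malformed, or not
     * supported by this JVM.
     */
    static String canonicalName(String supplied, String fallback) {
        if (supplied == null) {
            return fallback;
        }
        try {
            // forName() accepts aliases; name() is the canonical name.
            return Charset.forName(supplied.trim()).name();
        } catch (IllegalCharsetNameException | UnsupportedCharsetException e) {
            return fallback;
        }
    }

    public static void main(String[] args) {
        String[] names = {"8859_1", "ISO8859_1", "iso-8859-1"};
        for (String n : names) {
            System.out.println(n + " -> " + canonicalName(n, "ISO-8859-1"));
        }
    }
}
```

All three names in main() are aliases of the same charset, so they should
resolve to one canonical name; doing the lookup once per request and passing
the result on would keep the rest of the code on a single spelling.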