Pierre Delisle wrote:
>
> Trying to close a few Jasper bugs before the holiday break.
> I'd appreciate at least another pair of eyes to review what I believe
> should be done on that one... -- Pierre
>
> -----
> Bug #55
>
> -----
> Synopsis:
> Default for included files is 8859_1, with no option to set otherwise.
>
> Report Description:
> The default for reading an included file is ISO_8859_1. We can,
> of course, set pageContent to read UTF-8 (which is what we need it
> to be to support international code). Unfortunately, when there are
> two or more levels of encoding (or the pageContent type isn't set),
> the encoding that the JspReader gets is set to a hard-coded
> "ISO_8859_1", and cannot be set to anything else
> via the runtime system properties. In
> org.apache.jasper.compiler.JspReader (JspReader.java, line 158),
> encoding ALWAYS defaults to 8859_1, and file.encoding, when set
> in the System properties, is ignored. An easy fix is to set
> encoding to:
>
> encoding = System.getProperty("file.encoding", "8859_1");
>
> As things stand, the result is typically that the file will flake out
> and convert all of the non-UTF-8 characters to US-ASCII, @%, etc.
> -----
>
> I'm not sure I fully understand what's described there,
> so here is what I believe should be done.
>
> The "encoding" for a JSP file is currently handled as follows:
>
> 1. In Compiler.java, we create a JspReader for the top-level
> ("including") jsp file using the 8859_1 encoding.
>
> 2. Using that JspReader, we check if there is a page directive
> with 'contentType' specified. If there is, then
> a new JspReader for the page is created with the encoding set to the
> "charset" specified in the contentType value of the page
> directive; otherwise we stick with the default 8859_1 encoding.
>
> 3. When a page is included, JspReader.pushFile() is called,
> and the encoding passed as argument appears to always
> be null. (No encoding attribute can be specified in
> the "include" directive, so reading 'encoding' off of the
> attributes appears to be a bug in JspParseEventListener.)
> Because it is null, it always defaults to 8859_1.
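
For reference, the charset lookup in step 2 amounts to something like
the sketch below. This is only an illustration, not Jasper's actual
code: the class and method names are hypothetical, and I'm assuming the
charset is the usual "charset=..." parameter of the contentType value.

```java
import java.util.Locale;

// Hypothetical helper for illustration; not part of Jasper.
final class PageEncoding {
    // Default named in this thread ("8859_1" is the JDK's historical
    // name for ISO-8859-1).
    static final String DEFAULT_ENCODING = "8859_1";

    // Returns the charset parameter of a contentType value such as
    // "text/html; charset=UTF-8", or the default if none is given.
    static String resolveEncoding(String contentType) {
        if (contentType == null) {
            return DEFAULT_ENCODING;
        }
        for (String param : contentType.split(";")) {
            String p = param.trim();
            if (p.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                String charset = p.substring("charset=".length()).trim();
                // Drop optional surrounding double quotes.
                if (charset.length() >= 2
                        && charset.startsWith("\"")
                        && charset.endsWith("\"")) {
                    charset = charset.substring(1, charset.length() - 1);
                }
                if (!charset.isEmpty()) {
                    return charset;
                }
            }
        }
        return DEFAULT_ENCODING;
    }
}
```

Note that the fallback is a fixed constant and never consults
"file.encoding".
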
>
> If I understand the intent of the bug report correctly, we'd need
> the following modifications:
>
> - In step 2, if contentType is not specified in the "including" page,
> set the encoding to be:
>
> encoding = System.getProperty("file.encoding", "8859_1");
>
> This means that the default encoding of all JSP files at a site could
> be defined globally using system property "file.encoding".
> I don't think this is spec-compliant, and would be reluctant
> to make that change.
I agree that using "file.encoding" as the ultimate default is not
spec-compliant. I suggest you stick to the current behavior, with
"8859_1" if contentType doesn't specify a charset.
> - In step 3, use the encoding of the "including" page.
Sounds right to me.
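
Roughly what I have in mind, as a minimal sketch only. The class below
is a hypothetical stand-in that just tracks encodings; it is not the
real JspReader, and the pushFile() signature is an assumption. When the
encoding argument is null, the included file takes the encoding of the
file that includes it, and "8859_1" is used only when there is no
including file.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical stand-in for the reader's file stack; not Jasper code.
final class IncludeEncodingStack {
    private static final String DEFAULT_ENCODING = "8859_1";

    // Encodings of the files currently being read, innermost on top.
    private final Deque<String> encodings = new ArrayDeque<>();

    // Called when a file is opened. A null encoding means "inherit
    // from the including file", or use the default at the top level.
    String pushFile(String file, String encoding) {
        String effective = encoding;
        if (effective == null) {
            effective = encodings.isEmpty()
                    ? DEFAULT_ENCODING
                    : encodings.peek();
        }
        encodings.push(effective);
        return effective;
    }

    // Called when the end of an included file is reached.
    void popFile() {
        encodings.pop();
    }
}
```

With this, a top-level page read as UTF-8 passes UTF-8 down through
any number of nested includes.
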
> This would fix what I believe is a bug in the current implementation.
>
> Comments?
What about the javac encoding? I believe it's currently hardcoded
as "UTF8" (in Compiler at least). I'm not sure what it should be
in case different included pages specify different charsets ...
Hans
--
Hans Bergsten [EMAIL PROTECTED]
Gefion Software http://www.gefionsoftware.com
Author of JavaServer Pages (O'Reilly), http://TheJSPBook.com