Pierre Delisle wrote:
> 
> Trying to close a few Jasper bugs before the holiday break.
> I'd appreciate at least another pair of eyes to review what I believe
> should be done on that one...  -- Pierre
> 
> -----
> Bug #55
> 
>    -----
>    Synopsis:
>      Default for included files is 8859_1, with no option to set otherwise.
> 
>    Report Description:
>      The default for reading an included file is ISO_8859_1. We can,
>      of course, set pageContent to read UTF-8 (which is what we need it
>      to be to support international code). Unfortunately, when there
>      are two or more levels of encoding (or the pageContent type isn't
>      set), the encoding that the JspReader gets is set to a hard-coded
>      "ISO_8859_1", and can't be set to anything else via the runtime
>      system properties. In org.apache.jasper.compiler.JspReader
>      (JspReader.java, line 158), encoding ALWAYS defaults to 8859_1,
>      and file.encoding is ignored even when set from the System
>      properties. This is an easy fix, to set encoding to:
>      encoding = System.getProperty("file.encoding","8859_1");
>      The result, typically, is that the file will flake out and
>      convert all of the non-UTF-8 characters to US-ASCII, @%, etc.
>      -----
> 
> I'm not sure I fully understand what's described there,
> so here is what I believe should be done.
> 
> The "encoding" for a JSP file is currently handled as follows:
> 
> 1. In Compiler.java, we create a JspReader for the top-level
>    ("including") jsp file using the 8859_1 encoding.
> 
> 2. Using that JspReader, we check if there is a page directive
>    with 'contentType' specified. If there is, then
>    a new JspReader for the page is created with the encoding set to the
>    "charset" specified in the contentType value of the page
>    directive; otherwise we stick with the default 8859_1 encoding.
> 
> 3. When a page is included, JspReader.pushFile() is called,
>    and the encoding passed as argument appears to always
>    be null (since no encoding attribute can be specified in
>    the "include" directive, reading 'encoding' off of the
>    attributes appears to be a bug in JspParseEventListener).
>    Because it is null, it always defaults to 8859_1.
> 
> If I understand the intent of the bug report correctly, we'd need the
> following modifications:
> 
> - In step 2, if contentType is not specified in the "including" page,
>   set the encoding to be:
> 
>      encoding = System.getProperty("file.encoding", "8859_1");
> 
>   This means that the default encoding of all JSP files at a site could
>   be defined globally using system property "file.encoding".
>   I don't think this is spec-compliant, and would be reluctant
>   to make that change.

I agree that using "file.encoding" as the ultimate default is not
spec-compliant. I suggest you stick to the current behavior, falling
back to "8859_1" when contentType doesn't specify a charset.
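
To make the fallback rule concrete, here is a minimal sketch of how the
charset could be pulled out of a contentType value, with "8859_1" as the
default. This is not the actual Jasper code; the class name
ContentTypeEncoding and the method resolveEncoding are hypothetical
names I made up for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Jasper: resolves the page encoding
// from the contentType value of a page directive.
public class ContentTypeEncoding {
    // Default used when contentType is absent or carries no charset.
    static final String DEFAULT_ENCODING = "8859_1";

    // Matches charset=UTF-8 as well as charset="UTF-8", case-insensitively.
    private static final Pattern CHARSET =
        Pattern.compile("charset\\s*=\\s*\"?([^\";\\s]+)", Pattern.CASE_INSENSITIVE);

    static String resolveEncoding(String contentType) {
        if (contentType == null) {
            return DEFAULT_ENCODING;
        }
        Matcher m = CHARSET.matcher(contentType);
        // Deliberately no lookup of the "file.encoding" system property here.
        return m.find() ? m.group(1) : DEFAULT_ENCODING;
    }

    public static void main(String[] args) {
        System.out.println(resolveEncoding("text/html; charset=UTF-8"));
        System.out.println(resolveEncoding("text/html"));
    }
}
```

The point of the sketch is only that the default stays fixed and is
independent of the platform the container happens to run on.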

> - In step 3, use the encoding of the "including" page.

Sounds right to me.
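
Roughly what I have in mind, as a sketch only: when pushFile() gets a
null encoding, it takes the encoding of the file currently on top of the
include stack instead of going straight to "8859_1". The class below is
a simplified stand-in, not the real JspReader; IncludeEncodingStack and
popFile are hypothetical names, and the real pushFile() signature may
differ.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical model of the include stack; the real JspReader does much more.
public class IncludeEncodingStack {
    // Used only for a top-level file pushed with no encoding.
    static final String DEFAULT_ENCODING = "8859_1";

    // Encodings of the files currently being read, innermost on top.
    private final Deque<String> encodings = new ArrayDeque<>();

    // Returns the encoding that would be used to read the pushed file.
    String pushFile(String file, String encoding) {
        String effective = encoding;
        if (effective == null) {
            // Inherit from the "including" page if there is one.
            effective = encodings.isEmpty() ? DEFAULT_ENCODING : encodings.peek();
        }
        encodings.push(effective);
        return effective;
    }

    // Called when the reader reaches the end of an included file.
    void popFile() {
        encodings.pop();
    }

    public static void main(String[] args) {
        IncludeEncodingStack reader = new IncludeEncodingStack();
        System.out.println(reader.pushFile("main.jsp", "UTF-8"));
        System.out.println(reader.pushFile("header.jsp", null));
    }
}
```

With this, nested includes inherit transitively, since each level
records its own effective encoding on the stack.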

>   This would fix what I believe is a bug in the current implementation.
> 
> Comments?

What about the javac encoding? I believe it's currently hardcoded
as "UTF8" (in Compiler at least). I'm not sure what it should be
in case different included pages specify different charsets ...

Hans
-- 
Hans Bergsten           [EMAIL PROTECTED]
Gefion Software         http://www.gefionsoftware.com
Author of JavaServer Pages (O'Reilly), http://TheJSPBook.com
