From below, I believe that either step two (allowing an input of system
properties) or step three (fixing the include bug) would solve the problem.
I am not sure why the spec doesn't allow system properties to be added to
the tomcat startup scripts (i.e., -Dfile.encoding="UTF-8" , etc.), but I
trust you.
Thanks and have some Happy Holidays,
Nathan
-----Original Message-----
From: Pierre Delisle
To: [EMAIL PROTECTED]; [EMAIL PROTECTED]
Sent: 12/21/00 3:42 PM
Subject: Bug #55: Default for included files is 8859_1, with no option to
set otherwise
Trying to close a few Jasper bugs before the holiday break.
I'd appreciate at least another pair of eyes to review what I believe
should be done on that one... -- Pierre
-----
Bug #55
-----
Synopsis:
Default for included files is 8859_1, with no option to set
otherwise.
Report Description:
The default for reading an included file is ISO_8859_1. We can,
of course, set pageContent to read UTF-8 (which is what we need it
to be to support international code). Unfortunately, when there are
two or more levels of encoding (or the pageContent type isn't set),
the encoding of the JspReader gets set to a hard-coded "ISO_8859_1",
and it can't be set to anything else via the runtime system
properties. In org.apache.jasper.compiler.JspReader (JspReader.java,
line 158), encoding ALWAYS defaults to 8859_1, and file.encoding is
ignored even when it is set in the System properties. This is an
easy fix: set encoding to:
encoding = System.getProperty("file.encoding", "8859_1");
As things stand, the result, typically, is that the file will flake
out and convert all of the non-UTF-8 characters to US-ASCII, @%, etc.
-----
I'm not sure I fully understand what's described there,
so here is what I believe should be done.
The "encoding" for a JSP file is currently handled as follows:
1. In Compiler.java, we create a JspReader for the top-level
("including") jsp file using the 8859_1 encoding.
2. Using that JspReader, we check if there is a page directive
with 'contentType' specified. If there is, then
a new JspReader for the page is created with the encoding set to the
"charset" specified in the contentType value of the page
directive; otherwise we stick with the default 8859_1 encoding.
3. When a page is included, JspReader.pushFile() is called,
and the encoding passed as argument appears to always
be null (since no encoding attribute can be specified in
the "include" directive, reading 'encoding' off of the
attributes appears to be a bug in JspParseEventListener).
Because it is null, it always defaults to 8859_1.
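To make the three steps above concrete, here is a rough sketch of the
decision logic as I described it. This is NOT the Jasper source: the
class EncodingSketch and its method names are made up for illustration,
and the charset parsing is deliberately simplistic.

```java
import java.util.Locale;

// Hypothetical sketch of the current behavior; not actual Jasper code.
final class EncodingSketch {
    static final String DEFAULT_ENCODING = "8859_1";

    // Pull the charset parameter out of a contentType value such as
    // "text/html; charset=UTF-8". Returns null when none is present.
    static String charsetOf(String contentType) {
        if (contentType == null) {
            return null;
        }
        for (String part : contentType.split(";")) {
            String p = part.trim();
            if (p.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                String value = p.substring("charset=".length()).trim();
                // Strip optional surrounding double quotes.
                if (value.length() >= 2
                        && value.startsWith("\"") && value.endsWith("\"")) {
                    value = value.substring(1, value.length() - 1);
                }
                return value.isEmpty() ? null : value;
            }
        }
        return null;
    }

    // Steps 1 and 2: the top-level page uses the charset from its
    // contentType, or the hard-coded default when there is none.
    static String pageEncoding(String contentType) {
        String charset = charsetOf(contentType);
        return charset != null ? charset : DEFAULT_ENCODING;
    }

    // Step 3: the encoding argument for an included file is null in
    // practice, so the included file always ends up with the default.
    static String includedEncoding(String encodingArg) {
        return encodingArg != null ? encodingArg : DEFAULT_ENCODING;
    }

    public static void main(String[] args) {
        System.out.println(pageEncoding("text/html; charset=UTF-8"));
        System.out.println(pageEncoding("text/html"));
        System.out.println(includedEncoding(null));
    }
}
```

Note that in this sketch the included file gets 8859_1 even when the
including page declared UTF-8, which is the symptom in the report.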
If I understand the intent of the bug report correctly, we'd need the
following modifications:
- In step 2, if contentType is not specified in the "including" page,
set the encoding to be:
encoding = System.getProperty("file.encoding", "8859_1");
This means that the default encoding of all JSP files at a site could
be defined globally using system property "file.encoding".
I don't think this is spec-compliant, and would be reluctant
to make that change.
- In step 3, use the encoding of the "including" page.
This would fix what I believe is a bug in the current implementation.
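For discussion, the two modifications could look roughly like the
sketch below. Again, this is not Jasper code: the class
ProposedEncodingSketch and both method names are hypothetical, and the
only real API used is System.getProperty(String, String).

```java
// Hypothetical sketch of the two proposed changes; not actual Jasper code.
final class ProposedEncodingSketch {
    static final String DEFAULT_ENCODING = "8859_1";

    // Step 2 change: when the including page has no charset in its
    // contentType, fall back on the file.encoding system property
    // (and on 8859_1 only if that property is not set).
    static String topLevelEncoding(String pageCharset) {
        if (pageCharset != null) {
            return pageCharset;
        }
        return System.getProperty("file.encoding", DEFAULT_ENCODING);
    }

    // Step 3 change: when no encoding is passed for an included file,
    // inherit the encoding of the including page instead of defaulting.
    static String includedEncoding(String encodingArg,
                                   String includingEncoding) {
        if (encodingArg != null) {
            return encodingArg;
        }
        return includingEncoding != null
                ? includingEncoding
                : DEFAULT_ENCODING;
    }

    public static void main(String[] args) {
        String top = topLevelEncoding("UTF-8");
        System.out.println("including page: " + top);
        System.out.println("included file:  " + includedEncoding(null, top));
    }
}
```

The second method is independent of the first, so the step 3 fix can go
in without touching the default used for the including page.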
Comments?
-- Pierre