Hans Bergsten wrote:

Larry Isaacs wrote:

Hans,

The behavior change is unrelated to the use of getParameter()
to search for "jsp_precompile".  Both Tomcat 3.x and Tomcat 4.x
were bitten by this long ago, and Craig's fix was applied to both.
In Tomcat 4's case, it was prior to the 4.0 release.


Okay, I'm sure you're right that there may be more to it than
avoiding the getParameter() call in Jasper, but based on what
I've read, it seems to be part of the problem at least.

Assuming I have a good grip on the issue, I think it relates
to using UTF-8 to decode the path portion of the URL which
gets used to determine context, servlet mapping, etc.  Then
allowing setCharacterEncoding() to change the character encoding
for the query portion of the same URL.  The Servlet 2.3 and 2.4
specs both say setCharacterEncoding() applies to the request body
but don't mention it applying to the query portion of the URL.
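
To make the mismatch concrete, here is a minimal sketch of how the same percent-encoded query value reads under two charsets. It is not Tomcat's or Jasper's code; the class name, the helper, and the sample value "caf%C3%A9" are hypothetical, and it only uses the standard java.net.URLDecoder.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

// Hypothetical illustration only; not taken from Tomcat.
public class QueryDecodingSketch {

    // Decodes one percent-encoded query value using the named charset.
    static String decode(String raw, String charset) throws UnsupportedEncodingException {
        return URLDecoder.decode(raw, charset);
    }

    public static void main(String[] args) throws UnsupportedEncodingException {
        // Assumed sample: "caf" followed by U+00E9, sent as its UTF-8 bytes C3 A9.
        String raw = "caf%C3%A9";

        // Decoded as UTF-8, the two bytes become the single character U+00E9.
        String asUtf8 = decode(raw, "UTF-8");

        // Decoded as ISO-8859-1, each byte becomes its own character (U+00C3, U+00A9).
        String asLatin1 = decode(raw, "ISO-8859-1");

        System.out.println(asUtf8.length());   // 4 characters
        System.out.println(asLatin1.length()); // 5 characters
    }
}
```

Whichever charset the container picks for the query string, a client that encoded with the other one will see its parameters garbled in this way.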


Right, but since the servlet spec doesn't say anything about encoding
for the query portion, I think we have some room for a sensible
interpretation.

My concern is that with the new decoding behavior, apps that used to
work fine suddenly don't, and the reason seems to be that browsers
in fact ignore the RFC2718 recommendation that TC now enforces. I'm
all for compliance with all related specs, but in this case it's just
a recommendation and following it seems to do more harm than good.

I agree it's not as clean as you may want, but are there any real
problems with decoding the path portion using one charset and the
query string with another (i.e., the one from getCharacterEncoding()),
the way it used to be done?

I see you are a member of the expert group for the servlet spec. Did you raise those points during the review period? If not, then IMO you have nothing to complain about, especially since Tomcat implements a far simpler and more reasonable behavior for URL string handling.


The specification should have specified something along the lines of:
- The URL must be %xx encoded
- This decodes to bytes representing UTF-8 characters
There's an IETF standard that, I think, states this in black and white. It is being ignored. Maybe this wouldn't be the case if very popular tech, such as servlets & JSPs, started mandating it? This is simply a chicken-and-egg issue.
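
The two-step rule above can be sketched as follows. This is only an illustration under stated assumptions, not what any spec or container mandates: the class and method names are hypothetical, raw non-ASCII characters are rejected because the rule says the URL must be %xx encoded, and malformed UTF-8 is left to Java's default replacement behavior.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch of "%xx decodes to bytes, bytes are UTF-8".
public class Utf8PercentDecoder {

    // Returns the value of an ASCII hex digit, or -1 if c is not one.
    static int hex(char c) {
        if (c >= '0' && c <= '9') return c - '0';
        if (c >= 'a' && c <= 'f') return c - 'a' + 10;
        if (c >= 'A' && c <= 'F') return c - 'A' + 10;
        return -1;
    }

    static String decode(String s) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        int i = 0;
        while (i < s.length()) {
            char c = s.charAt(i);
            if (c == '%') {
                // Step 1: every %xx escape yields exactly one byte.
                if (i + 2 >= s.length()) {
                    throw new IllegalArgumentException("truncated escape at index " + i);
                }
                int hi = hex(s.charAt(i + 1));
                int lo = hex(s.charAt(i + 2));
                if (hi < 0 || lo < 0) {
                    throw new IllegalArgumentException("bad escape at index " + i);
                }
                bytes.write((hi << 4) | lo);
                i += 3;
            } else if (c > 0x7F) {
                // Assumption: anything outside ASCII must have been escaped.
                throw new IllegalArgumentException("unescaped non-ASCII at index " + i);
            } else {
                bytes.write(c);
                i++;
            }
        }
        // Step 2: the byte sequence is interpreted as UTF-8.
        // Malformed sequences are replaced with U+FFFD by this constructor.
        return new String(bytes.toByteArray(), StandardCharsets.UTF_8);
    }
}
```

For example, decode("/caf%C3%A9") yields "/caf" followed by U+00E9, with no dependence on the request's character encoding.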


i18n issues with HTTP and servlets have been known about for years, but unfortunately they still haven't been addressed properly.
Same with the request dispatcher + wrapping issues that I pointed out months ago (and which, of course, were silently ignored).


To balance this a little, among the other big issues, I have to give credit for solving the welcome files in a satisfactory way, as well as filters with RDs. Filters now make the proprietary APIs provided by the container irrelevant for most tasks.

Rémy


