Mark Thomas wrote:

This is obviously a bigger mess than I first thought. As I see it, the following options exist for resolving bug 22666.

1. WONTFIX - On the basis that there is too much uncertainty to do anything sensible and that any changes made might break interoperability as per Remy's point 3 below.

2. FIX - Patch the parameter class (as per Remy's point 2 below) on the grounds that the JSP spec states "The World Wide Web Consortium (http://www.w3.org/) is a definitive source of HTTP related information affecting this specification and its implementations." and the w3c view (http://www.w3.org/International/O-URL-code.html) is that URI encoding should always be based on UTF-8. However, this is still likely to break things (back to Remy's point 3).

3. FIX - Add a configuration option that enables w3c compliant URI decoding and patch the parameter and any other relevant classes to support this option. I am not 100% sure where the best place to do this would be. I am leaning towards adding it to the context as an optional parameter with a default state of disabled.
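
To illustrate why the choice between options 2 and 3 matters, here is a minimal sketch showing the same percent-encoded bytes decoding to different strings depending on the charset assumed. It uses only the standard java.net.URLDecoder; the class name QueryCharsetDemo and the sample value are made up for illustration and are not part of Tomcat.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

public class QueryCharsetDemo {
    // Decodes a percent-encoded value using the named charset.
    // The checked exception is wrapped so callers need no try/catch.
    static String decode(String encoded, String charsetName) {
        try {
            return URLDecoder.decode(encoded, charsetName);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charsetName, e);
        }
    }

    public static void main(String[] args) {
        // "cafe" with an e-acute, where the client sent the accent as the UTF-8 bytes C3 A9.
        String encoded = "caf%C3%A9";
        // Four characters when the bytes are read as UTF-8.
        System.out.println(decode(encoded, "UTF-8").length());
        // Five characters when each byte is read as its own ISO-8859-1 character.
        System.out.println(decode(encoded, "ISO-8859-1").length());
    }
}
```

A server that guesses the wrong charset does not fail, it silently returns different text, which is why a fixed default risks breaking existing deployments.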

There are several bugs in Bugzilla that look as if they are along similar lines, and on that basis my own view is that option 3 is the way to go. Before I start coding, I would be grateful for some feedback/guidance on my planned approach.

I'll vote almost 2 ;-) No client I know of consistently uses UTF-8 to encode the URL; however, I'm not sure clients use the encoding of the entity body to encode the URL either.
Proper character decoding of the decoded URL (meaning %xx-decoded here) is already done (see CoyoteAdapter.convertURI), and there's a connector.getURIEncoding() available to indicate which encoding is to be used for the URL. Note: the default is US-ASCII (because anything else doesn't work), but you can be compliant with the W3C and use UTF-8 :)
For more flexibility, we can use a new connector field for that (let's call it connector.getQueryStringEncoding()), or use connector.getURIEncoding(). This would be passed to the Parameters class and used exclusively for query string decoding (the POSTed stuff won't use it, obviously). I want (I have to insist ;-) ) the default to be US-ASCII (so the feature will work in the real world), with a quick and dirty B2C (byte-to-char) conversion in that particular case (like CoyoteAdapter.convertURI).
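
The two-step decoding described above could be sketched roughly as follows. This is not Tomcat's code: the class QueryStringDecoder and its methods are hypothetical, and the "quick and dirty" path is assumed to be a plain byte-to-char widening, which for bytes above 0x7F gives the same result as ISO-8859-1.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.Charset;

public class QueryStringDecoder {
    // Step 1: %xx-decode into raw bytes. '+' becomes a space, as in form-encoded
    // query strings. Assumes the input itself contains only ASCII characters.
    static byte[] percentDecode(String s) {
        ByteArrayOutputStream out = new ByteArrayOutputStream(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '%') {
                if (i + 2 >= s.length()) {
                    throw new IllegalArgumentException("Truncated escape at index " + i);
                }
                int hi = Character.digit(s.charAt(i + 1), 16);
                int lo = Character.digit(s.charAt(i + 2), 16);
                if (hi < 0 || lo < 0) {
                    throw new IllegalArgumentException("Bad escape at index " + i);
                }
                out.write((hi << 4) | lo);
                i += 2; // skip the two hex digits just consumed
            } else if (c == '+') {
                out.write(' ');
            } else {
                out.write(c);
            }
        }
        return out.toByteArray();
    }

    // Step 2: byte-to-char conversion. With no encoding configured (or US-ASCII),
    // take the assumed fast path: widen each byte to a char without a decoder.
    static String bytesToChars(byte[] bytes, String encoding) {
        if (encoding == null || encoding.equalsIgnoreCase("US-ASCII")) {
            char[] chars = new char[bytes.length];
            for (int i = 0; i < bytes.length; i++) {
                chars[i] = (char) (bytes[i] & 0xff);
            }
            return new String(chars);
        }
        return new String(bytes, Charset.forName(encoding));
    }

    // Convenience wrapper; 'encoding' stands in for whatever the connector reports.
    static String decode(String query, String encoding) {
        return bytesToChars(percentDecode(query), encoding);
    }
}
```

Calling decode(query, "UTF-8") would correspond to the W3C-compliant setting, while passing null or "US-ASCII" exercises the default fast path; POST bodies would continue to use the request's own character encoding and never go through this path.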


Overall, this looks the most reasonable and flexible.

Note: If you want to code it, you'd better do it really fast ;-)

Remy


