Hi Remy,

Okay, I re-reviewed the original thread for bug 22666.  To close out
this thread, I'll assume the following from RFC2718 is our
justification for the new behavior:

      Unless there is some compelling reason for a
      particular scheme to do otherwise, translating character
      sequences into UTF-8 (RFC 2279) [3] and then subsequently
      using the %HH encoding for unsafe octets is recommended.

Tomcat will default to US-ASCII instead of UTF-8 so it won't break
too many existing webapps.  If there are other parts to this story,
I would be interested in learning of them.
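
To make the practical effect concrete, here is a minimal Java sketch of
what the RFC2718 recommendation amounts to, and what happens when the
decoder assumes a different charset than the encoder used.  This is
only an illustration built on java.net.URLEncoder/URLDecoder, not
Tomcat's actual decoding code.  The class and method names and the
sample string are made up for the example, and ISO-8859-1 is simply
picked as an example of "some other charset".

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

// Hypothetical illustration class; not part of Tomcat.
class UriEncodingSketch {

    // Characters -> octets in the given charset -> %HH escapes.
    static String encode(String text, String charset) throws UnsupportedEncodingException {
        return URLEncoder.encode(text, charset);
    }

    // %HH escapes -> octets -> characters, interpreting octets in the given charset.
    static String decode(String escaped, String charset) throws UnsupportedEncodingException {
        return URLDecoder.decode(escaped, charset);
    }

    public static void main(String[] args) throws UnsupportedEncodingException {
        String original = "caf\u00e9"; // sample value containing one non-ASCII character

        // The same string produces different escapes depending on the charset.
        String utf8Escaped = encode(original, "UTF-8");        // caf%C3%A9
        String latin1Escaped = encode(original, "ISO-8859-1"); // caf%E9
        System.out.println(utf8Escaped);
        System.out.println(latin1Escaped);

        // Decoding with the charset the sender used round-trips cleanly.
        System.out.println(decode(utf8Escaped, "UTF-8").equals(original));

        // Decoding with a different charset silently yields the wrong characters:
        // the two UTF-8 octets C3 A9 become two separate ISO-8859-1 characters.
        System.out.println(decode(utf8Escaped, "ISO-8859-1").equals(original));
    }
}
```

The point of the sketch is that the escaped form carries no record of
which charset produced it, so the receiving side has to be told, and
one setting per connector cannot suit two webapps that encode
differently.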

I'm still concerned that this makes Tomcat less useful by creating
deployment problems for webapps that aren't technically broken.
However, these issues were covered in the prior e-mail thread
(http://www.mail-archive.com/[EMAIL PROTECTED]/msg46479.html),
so I'll drop the issue.  Thanks.

Cheers,
Larry


> -----Original Message-----
> From: Remy Maucherat [mailto:[EMAIL PROTECTED] 
> Sent: Friday, November 21, 2003 9:02 AM
> To: Tomcat Developers List
> Subject: Re: Justification for URIEncoding addition?
> 
> 
> Larry Isaacs wrote:
> > Okay, I'm trying to write an e-mail to those here at SAS Institute
> > to explain the behavioral change in Tomcat 5.0.14 and Tomcat 4.1.29
> > with respect to the URIEncoding issue, and how it impacts our
> > webapps.  At the moment, I'm still struggling a bit on the
> > justification for this change.
> > 
> > True, the servlet spec only says that setCharacterEncoding() affects
> > the request body.  However, it seems to me to leave the issue of
> > character encoding with respect to query parameters unaddressed.
> > After my scan of RFCs (primarily 2396, 1808, and 1738), what can and
> > can't be used for character encoding underneath URL encoding in the
> > URI seems unaddressed as well.  If this is addressed somewhere, I
> > would be interested in knowing where.
> > 
> > I think the example provided in the 24557 bug, where an HREF is
> > generated that includes a character encoded query, may not be that
> > unusual.  It is something that occurs in our webapps here at SAS.
> > With the latest implementation, a webapp that uses URLEncode(str)
> > and another that uses URLEncode(str,"UTF-8") will need different
> > values for URIEncoding.  Thus, they can no longer be served under
> > the same Host using the same connector/port.  I think this
> > represents a serious loss of functionality from prior versions,
> > without strong justification, from what I can find.  Not that I have
> > lots of experience writing webapps, but I don't currently see a good
> > use case for the current behavior.  What am I missing?
> > 
> > I think it would be a benefit for the next releases of Tomcat 4.1.x
> > and Tomcat 5.0.x to provide the ability to enable the old behavior,
> > i.e. the query parameter character encoding is assumed to be the
> > same as the request character encoding.  Perhaps allowing
> > URIEncoding="request" to enable this would be appropriate.  I would
> > also recommend the old behavior be the default as I think that would
> > lead to fewer complaints on tomcat-user.
> > 
> > Sorry to raise this issue again so late.  Thoughts?
> > 
> > Cheers,
> > Larry
> > 
> > P.S. I haven't studied Bug 22666 yet.  I'll take a look and see if
> > there is something to learn there.
> 
> I don't agree.
> 
> Remy
> 
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]
> 
> 

