Remy, Craig,
Yes, you're right. I read the specs, and the TC way of doing things is
precisely what the standard prescribes. However, that still doesn't fix
my problem, unless I want to carry along my hacked version forever.
Here's what I'm trying to achieve: I currently have Tomcat proxy requests
to underlying applications. When proxying applets, however, I run into
trouble: I need to pass parameters to the proxy in the URI, which in this
case is embedded in an <APPLET> tag, and the browser cuts the URI at the
question mark unless it is escaped. With the question mark escaped, a
properly behaving Tomcat treats it as part of the path and will not be
able to find the right servlet.
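To illustrate the effect, here is a minimal sketch. It uses java.net.URI
(which follows RFC 2396) purely as a stand-in for the container's own
parser, which it is not. The path /proxy/app and the parameter name
target are made up for the example.

```java
import java.net.URI;

// Illustration only: java.net.URI stands in for the container's URI parser.
class EscapedQuestionMarkDemo {

    // Returns { decoded path, raw query (or null if there is none) }.
    static String[] split(String requestUri) {
        URI uri = URI.create(requestUri);
        return new String[] { uri.getPath(), uri.getRawQuery() };
    }

    public static void main(String[] args) {
        // Hypothetical proxy URIs, once with a literal and once with an escaped '?'.
        String[] literal = split("/proxy/app?target=demo");
        String[] escaped = split("/proxy/app%3Ftarget=demo");

        // prints "/proxy/app | target=demo"
        System.out.println(literal[0] + " | " + literal[1]);
        // prints "/proxy/app?target=demo | null"
        System.out.println(escaped[0] + " | " + escaped[1]);
    }
}
```

With the literal question mark the query is split off before the path is
decoded. With %3F the question mark ends up as plain data inside the
decoded path and there is no query string at all, so a lookup on that path
would presumably not match a servlet mapped at /proxy/app.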
So, is there any way to intercept the first call to the URI parser,
determine whether this is one of my previously encoded URIs and replace the
escaped character if it is?
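Roughly, what I have in mind is something like the sketch below. Everything
in it is hypothetical: the class, the restoreQuery helper and the /proxy/
prefix I use to recognize my own URIs are not part of Tomcat, and I have
not settled on where such a hook would be called from.

```java
// Hypothetical sketch; none of these names exist in Tomcat.
class ProxyUriRewriteSketch {

    // Assumption: only URIs under this prefix were escaped by my proxy.
    static final String PROXY_PREFIX = "/proxy/";

    // Restores the first escaped question mark in one of the proxy's URIs.
    // Any other URI, or one that already has a literal '?', is returned as-is.
    static String restoreQuery(String rawUri) {
        if (!rawUri.startsWith(PROXY_PREFIX) || rawUri.indexOf('?') >= 0) {
            return rawUri;
        }
        // Hex digits in an escape may be upper or lower case.
        int upper = rawUri.indexOf("%3F");
        int lower = rawUri.indexOf("%3f");
        int i = (upper < 0) ? lower : (lower < 0) ? upper : Math.min(upper, lower);
        if (i < 0) {
            return rawUri;
        }
        // Only the first escape becomes the delimiter; later ones stay data.
        return rawUri.substring(0, i) + "?" + rawUri.substring(i + 3);
    }

    public static void main(String[] args) {
        // prints "/proxy/app?target=demo"
        System.out.println(restoreQuery("/proxy/app%3Ftarget=demo"));
        // prints "/other/app%3Fx=1" (not one of the proxy's URIs)
        System.out.println(restoreQuery("/other/app%3Fx=1"));
    }
}
```

The original string would have to be kept around unchanged so that
getRequestURI() can still return it, and only the rewritten copy would be
handed to whatever parses the path and query.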
Klaus
At 10:55 AM 4/13/01 -0700, you wrote:
> > On Fri, 13 Apr 2001, Klaus Sonnenleiter wrote:
> >
> > > Craig,
> > >
> > > I looked at HttpRequestImpl. Would it be safe to manipulate the URI in a
> > > call to setRequestURI before it sets the instance variable requestURI?
> > > It seems like this gets called the moment a request is made - this way, the
> > > encoded characters could be transformed to their unencoded equivalents
> > > before the parameter list is parsed and the classloader gets called.
> > >
> > > Klaus
> > >
> >
> > The key thing to remember is a spec requirement that
> > request.getRequestURI() must return the original request URI *without*
> > decoding. The values returned by request.getServletPath() and
> > request.getPathInfo(), on the other hand, are decoded first. Therefore,
> > if you manipulate the request URI value in setRequestURI(), we'd need to
> > make sure that we save an unmanipulated version somewhere as well.
> >
> > The deeper issue, though, is the portability of what you are
> > proposing (across servlet containers). As I understand it, you
> > would like the %3f character to be interpreted as a "?" character so that
> > the stuff after it is understood as part of the query string. That seems
> > (to me) a questionable practice -- the reason you would use a %3f encoding
> > in the first place is so that you could treat a question mark as a regular
> > data character, instead of being a significant delimiter. If you decode
> > first and then find that the "?" is significant, how would you ever
> > include a question mark as part of the data value for a query string
> > parameter (for example)?
> >
> > NOTE: There also needs to be a little more work in this area with respect
> > to path parameters (;xxx stuff, which is how the session id is
> > transmitted). This is being discussed in the expert group, and will
> > probably require some minor changes in this area of Tomcat 4.
>
>'?' shouldn't be encoded in the first place as it's a reserved character
>(just like you should never encode '/' in the path). If it's encoded, I
>don't think it should be interpreted as the delimiter for the query section
>of the URL.
>So IMO the current TC behavior is the right one.
>
>The RFC for URIs is http://www.ietf.org/rfc/rfc2396.txt
>
>Remy