On Tue, 14 Aug 2001, Justin Erenkrantz wrote:

> mod_jk chops off the r->unparsed_uri itself without copying.  Negative
> points for style.  =-)  -- justin

That's true. However, I'm not sure what else we could do - copy it once
again into another buffer and chop it there? Not much else is done
with the unparsed URI.
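
For reference, the copy-then-chop alternative would look roughly like the
sketch below. This is not the mod_jk code: uri_without_query is a
hypothetical helper, cutting at '?' is an assumption about where the chop
happens, and plain malloc stands in for the pool allocation (for example
apr_pstrdup in Apache 2.0) a real module would more likely use.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not taken from mod_jk: return a heap-allocated copy
 * of unparsed_uri cut at the first '?', leaving the caller's string intact.
 * Returns NULL on allocation failure; the caller frees the result. */
char *uri_without_query(const char *unparsed_uri)
{
    size_t len;
    char *copy;

    assert(unparsed_uri != NULL);

    /* Length of the part before '?', or the whole string if there is none. */
    len = strcspn(unparsed_uri, "?");

    copy = malloc(len + 1);
    if (copy == NULL)
        return NULL;

    memcpy(copy, unparsed_uri, len);
    copy[len] = '\0';
    return copy;
}
```

The cost is one extra allocation and copy per request, which is the
trade-off against modifying the request's own string in place.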

If you strictly follow the spec, mod_rewrite is out of the question - and
the same goes for most other Apache modules that alter the request.
Since all of them work on the URI, the result is something that has
no "unmodified" original.

However, if you read the URI spec, two URIs are equivalent if the octets
are identical - it doesn't matter how you encode them. Re-escaping the URI
has the extra benefit of producing a "canonical" escaping, which is also a
bit safer (hey, we also get the first-class security checks Apache does
on the parsed URIs).
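
To make the "canonical escaping" point concrete, here is a minimal sketch
of decode-then-re-escape. It is not what Apache or mod_jk actually does:
canonical_escape and its helpers are hypothetical names, and the set of
characters left unescaped is an assumption chosen for illustration.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Value of a hex digit, or -1 if c is not one. */
static int hex_val(int c)
{
    if (c >= '0' && c <= '9') return c - '0';
    c = tolower(c);
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    return -1;
}

/* Characters this sketch leaves unescaped (an assumption, not a spec list).
 * Note: keeping '/' means "%2F" decodes into a real path separator, which
 * a production implementation would have to treat with care. */
static int is_safe(unsigned char c)
{
    return isalnum(c) || (c != '\0' && strchr("-_.~/", c) != NULL);
}

/* Decode every %XX in `in`, then re-escape each octet the same way, so that
 * differently escaped spellings of the same octets give the same output.
 * Returns 0 on success, -1 on a malformed escape or a too-small buffer. */
int canonical_escape(const char *in, char *out, size_t out_size)
{
    size_t o = 0;

    assert(in != NULL && out != NULL);
    if (out_size == 0)
        return -1;

    while (*in != '\0') {
        unsigned char c = (unsigned char)*in;

        if (c == '%') {
            int hi = hex_val((unsigned char)in[1]);
            int lo = (hi >= 0) ? hex_val((unsigned char)in[2]) : -1;
            if (hi < 0 || lo < 0)
                return -1;              /* truncated or non-hex escape */
            c = (unsigned char)(hi * 16 + lo);
            in += 3;
        } else {
            in += 1;
        }

        if (is_safe(c)) {
            if (o + 1 >= out_size)      /* need room for char + NUL */
                return -1;
            out[o++] = (char)c;
        } else {
            if (o + 3 >= out_size)      /* need room for %XX + NUL */
                return -1;
            snprintf(out + o, 4, "%%%02X", (unsigned int)c);
            o += 3;
        }
    }
    out[o] = '\0';
    return 0;
}
```

With this sketch, "/a%20b/%7euser" and "/a b/~user" both come out as the
same string, which is the sense in which the two requests are equivalent.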

Another note - my understanding of the HTTP specification is that proxies
_are_ allowed to escape/unescape the URI - as long as the result is
"equivalent". So if a proxy is used, the "original URI the user typed"
will be lost. The same goes for browsers - what the user types is very
different from what is sent (at least in Opera).

Of course, we can define "unparsed URI" to be whatever the servlet
container receives. This may be different from the original request (if
it goes through proxies).

Now the question is - where does the container start :=). I think there
are plenty of reasons to treat Apache as not being part of the
container - after all, it follows completely different rules on mappings
(extension mappings can have path info), and on almost everything else.

In fact, I'm not sure all web servers even allow access to the original
unescaped URI. Some IIS or NES expert should let us know.

So my take is that the container should indeed return the original URI -
the one the container received. What Apache does (like rewriting or
"canonicalising" the URI) is separate.

Otherwise - the rewriting itself would violate the servlet spec, since it
would alter the URI.

Again - I would bet that at least one of IIS and NES doesn't allow access
to the "original" URI anyway.

Costin

P.S. Quite a long mail for something as simple as 1-2-3, but I spent quite
a lot of time on this issue - Larry may remember how long the bug was
open with my name on it.




