@All, thanks everybody for responding so far. I apologize for not originally including the version of Tomcat we're using (6.0.18). It was an oversight on my part in my hurry to write the email -- I completely blanked on including it (this is my first post to the Tomcat list).

I realize that I accidentally posted an incorrect example. The URL-encoded value %3B used in Christopher S.'s question is the correct encoding. I shouldn't have pasted in a plain text ";".

This ";" is an edge case for us. This URL encoding is part of a SES / friendly URL implementation in a framework I contribute to. A framework user logged this as a defect report because originally we were not URL encoding special characters (they were including a short "error" message in the URL proper). We have no clue what people are generating or how they are using it in URLs -- we're just trying to get an idea of why this wasn't working.

@Christopher S., sadly Tomcat still truncates the path info even when encoding the ";" as %3B. We've actually resorted to using a "unicode"-like representation (changing "U+03B" to U_03B so as not to use a +) and transforming it back when processing the incoming requests.
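
For illustration, a minimal sketch of that kind of substitution. Only the token text "U_03B" comes from the description above; the class and method names are hypothetical, and the real framework code may look quite different.

```java
// Hypothetical helper: swaps ";" for a token that Tomcat will not treat as
// the start of path parameters, and restores it on the way back in.
public final class SemicolonToken {
    // Token text taken from the message; everything else here is assumed.
    private static final String TOKEN = "U_03B";

    private SemicolonToken() {}

    /** Replaces every ";" before the value is placed in the URL path. */
    public static String encode(String pathSegment) {
        return pathSegment.replace(";", TOKEN);
    }

    /** Restores ";" when processing the incoming request. */
    public static String decode(String pathSegment) {
        return pathSegment.replace(TOKEN, ";");
    }
}
```

One caveat with any scheme like this: a value that already contains the literal text "U_03B" would be changed by decode(), so the round trip is only safe if that text never occurs in real data.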

@Chuck, thanks for your snarky response. I'll leave it at that.

@Bill, thanks for mentioning the specific RFC referencing the encoding of ";". That's what the team here figured out as well, and it's nice to have an independent person verify our assumptions after reading the RFC sections. Is it safe to assume that the Tomcat code base assumes that anything after the ";" has to be the jsessionid? That's our assumption at the moment. Yes, there is a workaround: use request.getRequestURI() instead, as that has the whole URI, and manually remove the "absolute" path to get the complete path info.

Since our framework is deployed on several different CFML servlets (three different vendors), their implementations for getting at the original HTTP request wrapper differ a bit. We'll probably stick with the poor man's encoding, using a modified unicode representation of ";", in the end. Another solution is to write a filter that uses getRequestURI() and replaces the bad path info in the request with the full-length version.
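
A rough sketch of the getRequestURI() workaround, written as a plain string helper so it does not depend on any one vendor's request wrapper. The class and method names are hypothetical; the three arguments are assumed to be the values returned by request.getRequestURI(), request.getContextPath() and request.getServletPath().

```java
// Hypothetical helper: recovers the untruncated path info by removing the
// context path and servlet path from the front of the raw request URI.
public final class PathInfoUtil {
    private PathInfoUtil() {}

    /**
     * Returns the part of the request URI after contextPath + servletPath,
     * or null when there is no extra path (mirroring getPathInfo()).
     * The result is NOT percent-decoded; "%3B" stays "%3B".
     */
    public static String fullPathInfo(String requestUri,
                                      String contextPath,
                                      String servletPath) {
        String prefix = contextPath + servletPath;
        if (!requestUri.startsWith(prefix)) {
            // Assumption: the prefix contains no encoded characters. If it
            // does, the raw URI and the decoded servlet path will not match.
            return null;
        }
        String rest = requestUri.substring(prefix.length());
        return rest.isEmpty() ? null : rest;
    }
}
```

In a filter, the same value could be handed back to the application through an HttpServletRequestWrapper that overrides getPathInfo(). Note that the raw URI may also carry a ";jsessionid=..." suffix, which this sketch does not strip.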

Thanks again,
.Peter

Bill Barker said the following:
Shouldn't that be /index.cfm/somePathInfo&amp%3BwithMoreInfo/

?

If you try the above URL, does it work?

java.net.URLEncoder will encode ";" as "%3B".

See the URL Specification (RFC 1738,
http://www.ietf.org/rfc/rfc1738.txt), section 2.2 "URL Character
Encoding Issues":

"
Many URL schemes reserve certain characters for a special meaning:
their appearance in the scheme-specific part of the URL has a
designated semantics. If the character corresponding to an octet is
reserved in a scheme, the octet must be encoded.  The characters ";",
"/", "?", ":", "@", "=" and "&" are the characters which may be
reserved for special meaning within a scheme. No other characters may
be reserved within a scheme.
"

The HTTP specification does not specifically say that semi-colons are
reserved, but perhaps the common interpretation of the URL spec is such
that semi-colons should always be encoded.


Actually it does, just by reference. Section 3.2.1 of RFC 2616 defers to RFC 2396. And section 3.3 of that RFC gives a special meaning to ';'. Tomcat doesn't handle this correctly according to the RFC, but no developer/contributor has had enough of an itch to fix it. But I doubt that fixing it would help the OP much.

A fully compliant Tomcat would have to remove anything after a ';' (including the ';') up until the next '/' (if any) for the purpose of mapping the request. It should then re-include them in the various parts of the request URI (except for ";jsessionid"). So it's a lot of work to implement an arcane feature that has plenty of workarounds.
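
The mapping step described above could be sketched roughly as follows. This is only an illustration of the rule "drop everything from ';' up to the next '/'", not Tomcat's actual code; the class and method names are hypothetical.

```java
// Hypothetical illustration of stripping path parameters from each segment
// before a request path is matched against servlet mappings.
public final class PathParams {
    private PathParams() {}

    /** Removes every ";..." run, up to but not including the next "/". */
    public static String stripForMapping(String path) {
        StringBuilder out = new StringBuilder(path.length());
        boolean inParams = false;
        for (int i = 0; i < path.length(); i++) {
            char c = path.charAt(i);
            if (c == '/') {
                inParams = false;   // a new segment ends any parameter run
                out.append(c);
            } else if (c == ';') {
                inParams = true;    // skip the ';' and what follows it
            } else if (!inParams) {
                out.append(c);
            }
        }
        return out.toString();
    }
}
```

For example, "/a;x=1/b;jsessionid=ABC" would map as "/a/b". Putting the parameters back into the various parts of the request URI afterwards is the harder part and is not shown here.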

- -chris




