Hello again!

I have run into an issue with Tomcat 3.2.1 and 4.0 and URLs/filenames that include reserved characters (the most common and troublesome so far being #). To fix this I read RFC 1738 (ftp://ftp.isi.edu/in-notes/rfc1738.txt) on the 'proper' way to handle these characters and found the answer in section 2.2, "URL Character Encoding Issues".

After fixing my webapp to encode these characters properly, I found that Apache handled the URLs and served the resources correctly, but Tomcat 3.2.1 and 4.0 didn't. Actually, 4.0 handled the most common case (a space, encoded as %20) but no other encoded characters, and 3.2.1 didn't handle any encoded characters (if my memory serves me correctly).

After digging through the source I was able to make minor changes to a few files in 4.0 and one file in 3.2.1 that allow both servers to handle these URLs correctly for all of my previously failing test cases. I'll attach the patches for 4.0 now; I still want to go over the 3.2 patches one more time to make sure I didn't miss anything.

I attempted to handle the difficult situations that decoding the URL might pose (inserting control characters and/or trying to reach a file outside the appropriate area), but there might be security/implementation issues that I missed. Also, it could very well be that these are not the right places to fix this particular problem; my apologies if I missed the mark ;)

Thanks again, and I should have the 3.2 patch worked over by tomorrow afternoon or so.

David Weinrich
--- ResourcesBase.java Tue Dec 26 23:36:29 2000
+++ ResourcesBaseEd.java Tue Dec 26 23:37:27 2000
@@ -961,9 +961,32 @@
      * @param path Path to be normalized
      */
     protected String normalize(String path) {

+        String normalized = path;
+
+        // Resolve encoded characters in the normalized path,
+        // which also handles encoded spaces so we can skip that later.
+        // Placed at the beginning of the chain so that encoded
+        // bad stuff(tm) can be caught by the later checks
+        while (true) {
+            int index = normalized.indexOf("%");
+            if (index < 0)
+                break;
+            char replaceChar =
+                (char) ( Integer.parseInt(
+                    normalized.substring( index + 1, index + 3 ), 16 ) );
+            // check for control characters ( values 00-1f and 7f-9f),
+            // return null if present. See:
+            // http://www.unicode.org/charts/PDF/U0000.pdf
+            // http://www.unicode.org/charts/PDF/U0080.pdf
+            if ( Character.isISOControl( replaceChar ) ) {
+                return null;
+            }
+            normalized = normalized.substring(0, index) +
+                replaceChar +
+                normalized.substring(index + 3);
+        }
         // Normalize the slashes and add leading slash if necessary
-        String normalized = path;
         if (normalized.indexOf('\\') >= 0)
             normalized = normalized.replace('\\', '/');
         if (!normalized.startsWith("/"))
@@ -977,15 +1000,6 @@
             normalized = normalized.substring(0, index) +
                 normalized.substring(index + 1);
         }
-
-        // Resolve occurrences of "%20" in the normalized path
-        while (true) {
-            int index = normalized.indexOf("%20");
-            if (index < 0)
-                break;
-            normalized = normalized.substring(0, index) + " " +
-                normalized.substring(index + 3);
-        }

         // Resolve occurrences of "/./" in the normalized path
         while (true) {
--- DefaultServlet.java Tue Dec 26 23:42:09 2000
+++ DefaultServletEd.java Tue Dec 26 23:41:52 2000
@@ -729,9 +729,33 @@
      * @param path Path to be normalized
      */
     protected String normalize(String path) {

+        String normalized = path;
+
+        // Resolve encoded characters in the normalized path,
+        // which also handles encoded spaces so we can skip that later.
+        // Placed at the beginning of the chain so that encoded
+        // bad stuff(tm) can be caught by the later checks
+        while (true) {
+            int index = normalized.indexOf("%");
+            if (index < 0)
+                break;
+            char replaceChar =
+                (char) ( Integer.parseInt(
+                    normalized.substring( index + 1, index + 3 ), 16 ) );
+            // check for control characters ( values 00-1f and 7f-9f),
+            // return null if present. See:
+            // http://www.unicode.org/charts/PDF/U0000.pdf
+            // http://www.unicode.org/charts/PDF/U0080.pdf
+            if ( Character.isISOControl( replaceChar ) ) {
+                return null;
+            }
+            normalized = normalized.substring(0, index) +
+                replaceChar +
+                normalized.substring(index + 3);
+        }
+
         // Normalize the slashes and add leading slash if necessary
-        String normalized = path;
         if (normalized.indexOf('\\') >= 0)
             normalized = normalized.replace('\\', '/');
         if (!normalized.startsWith("/"))
@@ -745,15 +769,6 @@
             normalized = normalized.substring(0, index) +
                 normalized.substring(index + 1);
         }
-
-        // Resolve occurrences of "%20" in the normalized path
-        while (true) {
-            int index = normalized.indexOf("%20");
-            if (index < 0)
-                break;
-            normalized = normalized.substring(0, index) + " " +
-                normalized.substring(index + 3);
-        }

         // Resolve occurrences of "/./" in the normalized path
         while (true) {
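The decoding loop in both patches restarts its search from the beginning of the string after every replacement, so an input such as "%2541" would be decoded twice (first to "%41", then to "A"). A "%" that is not followed by two hex digits would also make substring() or Integer.parseInt() throw instead of returning null. As a rough sketch only, and not part of the submitted patches, the same decoding step could be written as a single left-to-right pass. The class name PathDecoder and the helper hexValue are hypothetical, StringBuilder assumes a newer JDK than Tomcat 3.2/4.0 targeted, and each escape is treated as one ISO-8859-1 character, as the patches do.

```java
// Hypothetical helper, not part of Tomcat or of the patches above.
final class PathDecoder {

    private PathDecoder() {
    }

    /**
     * Decodes %XX escapes in a single pass, so that decoded output is
     * never decoded a second time. Returns null for a malformed escape
     * or for an escape that decodes to an ISO control character
     * (00-1f and 7f-9f), mirroring the null return in the patches.
     */
    static String decode(String path) {
        StringBuilder out = new StringBuilder(path.length());
        int i = 0;
        while (i < path.length()) {
            char c = path.charAt(i);
            if (c != '%') {
                out.append(c);
                i++;
                continue;
            }
            // A '%' must be followed by exactly two hex digits.
            if (i + 2 >= path.length()) {
                return null;
            }
            int hi = hexValue(path.charAt(i + 1));
            int lo = hexValue(path.charAt(i + 2));
            if (hi < 0 || lo < 0) {
                return null;
            }
            // Assumption: one escape maps to one ISO-8859-1 character.
            char decoded = (char) (hi * 16 + lo);
            if (Character.isISOControl(decoded)) {
                return null;
            }
            out.append(decoded);
            i += 3;
        }
        return out.toString();
    }

    /** Returns the value of an ASCII hex digit, or -1 if c is not one. */
    private static int hexValue(char c) {
        if (c >= '0' && c <= '9') return c - '0';
        if (c >= 'a' && c <= 'f') return c - 'a' + 10;
        if (c >= 'A' && c <= 'F') return c - 'A' + 10;
        return -1;
    }

    public static void main(String[] args) {
        System.out.println(decode("/file%23name.html"));
        System.out.println(decode("/x%2541"));
        System.out.println(decode("/x%0a"));
    }
}
```

With this sketch, "/%2e%2e/secret" decodes to "/../secret", so the decoded path would still need to go through the existing "/../" handling in normalize() afterwards, which is why the patches place decoding first in the chain.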