UNSUBSCRIBE On Thu, Jan 23, 2025 at 9:00 AM James Matlik <james.mat...@gmail.com> wrote:
> It works! > > I've been able to test with a customer name of: ÀËÌÑàëíñøü / \ Ѐӿ 中さ 😀 > customer > This covers > - Latin-1 Supplement characters are 1 byte: ÀËÌÑàëíñøü > - The / and \ slash characters are ASCII, but are encoded due to special > meaning in a URL > - Cyrillic characters are 2 bytes: Ѐӿ > - Chinese and Japanese characters are 3 bytes: 中さ > - emoji characters are 4 bytes: 😀 > > A client can request a URL path with the following: > > "/customers/customer/%C3%80%C3%8B%C3%8C%C3%91%C3%A0%C3%AB%C3%AD%C3%B1%C3%B8%C3%BC%20%2F%20%5C%20%D0%80%D3%BF%20%E4%B8%AD%E3%81%95%20%F0%9F%98%80%20customer" > > Then Tomcat processes the URL to the following and passes it into the > servlet. > "/customers/customer/ÀËÌÑàëíñøü %2F %5C Ѐӿ 中さ 😀 customer" > > So we are now able to save this test customer to the database and read it > back out again. I tried this with a few other paths that are a little > structurally different (with/without query string, escaped string in a > middle path segment instead of just the end) and it all worked as desired. > > Thank you so much for working on this! > -James > > On Thu, Jan 23, 2025 at 8:20 AM Mark Thomas <ma...@apache.org> wrote: > > > James, > > > > I've added attributes (encodedReverseSolidusHandling and > > encodedSolidusHandling) to the Context to provide control of how the > > RequestDispatcher paths are processed. > > > > Snapshots built after 12.00 UTC today should include the new attributes. > > > > As I type, Tomcat 12 is available, Tomcat 11 is building and Tomcat 10 > > and Tomcat 9 are in the queue. All should be complete in a couple of > hours. > > > > Mark > > > > > > On 22/01/2025 09:30, Mark Thomas wrote: > > > This is going to be fun. > > > > > > The RequestDispatcher processing currently does not take account of > > > encodedSolidusHandling or encodedReverseSolidusHandling. > > > RequestDsipatcher URLs are processed using the defaults. > > > > > > I have a test case that demonstrates various forms of this problem. I > am > > > currently working on a fix. > > > > > > Mark > > > > > > > > > On 21/01/2025 23:10, James Matlik wrote: > > >> It got further, but ran into another security check within the > > >> org.apache.catalina.core.ApplicationContext. I'm using the 10.1.35- > > >> SNAPSHOT > > >> version as that is closest to my current version, and easiest to test > > >> as a > > >> result. This is the block of code having trouble now (linking into > > >> 10.1.34): > > >> https://github.com/apache/tomcat/blob/10.1.34/java/org/apache/ > > >> catalina/core/ApplicationContext.java#L386-L396 > > >> > > >> This code: > > >> 1. takes the URL (e.g. /customers/customer/A%2FB%5CC) > > >> 2. decodes it (e.g. /customers/customer/A/B\C) > > >> 3. normalizes the decoded url with the replaceBackSlash parameter > > >> hardcoded > > >> to true, thus forcing the decoded '\' character to become a '/' > > character > > >> (e.g. /customers/customer/A/B/C) > > >> 4. then it compares the decoded URL to the normalized URL, and if they > > >> are > > >> not equal, an IllegalArgumentException is created and a 500 response > is > > >> returned. > > >> > > >> It seems that this area of the code would need to be aware of the > > >> encodedReverseSolidusHandling configuration as well. > > >> > > >> Thank you, > > >> James > > >> > > >> On Tue, Jan 21, 2025 at 1:20 PM Mark Thomas <ma...@apache.org> wrote: > > >> > > >>> On 21/01/2025 14:15, James Matlik wrote: > > >>>> Hello Mark, > > >>>> > > >>>> Yes, I would be available to test a snapshot if this gets > implemented. > > >>> > > >>> Latest (as I type this) Tomcat 12 build with the new feature: > > >>> > > >>> https://repository.apache.org/content/groups/snapshots/org/apache/ > > >>> tomcat/tomcat/12.0.0-M1-SNAPSHOT/tomcat-12.0.0- > > >>> M1-20250121.171115-315.tar.gz > > >>> > > >>> The snapshots for the 11.0.x, 10.1.x and 9.0.x are still building. > > >>> Generally, you'll find them under: > > >>> > > >>> > > >>> https://repository.apache.org/content/groups/snapshots/org/apache/ > > >>> tomcat/tomcat/ > > >>> > > >>> Anything built after 1700 UTC today should have the fix. Make sure > you > > >>> look for the snapshot for the current dev version for each release > > >>> branch. > > >>> > > >>> The Connector attribute is called encodedReverseSolidusHandling > > >>> > > >>> Let the list know how you get on. > > >>> > > >>> Mark > > >>> > > >>> > > >>>> > > >>>> -James > > >>>> > > >>>> On Tue, Jan 21, 2025 at 8:17 AM Mark Thomas <ma...@apache.org> > wrote: > > >>>> > > >>>>> On 18/01/2025 16:18, James Matlik wrote: > > >>>>>> I agree with everything you have said. As the config options stand > > >>> today, > > >>>>>> the allowBackslash seems to implement part of > encodeSolidusHandling. > > >>>>>> > > >>>>>> While encodeSolidusHandling supports: > > >>>>>> * REJECT - Return 400 on encoded / > > >>>>>> * DECODE - Decodes the / > > >>>>>> * PASS_THROUGH - Passes the encoded / as is > > >>>>>> > > >>>>>> The allowBackslash supports two mostly analogous settings: > > >>>>>> * false - REJECT > > >>>>>> * true - DECODE, but it Decodes as / instead of \ > > >>>>> > > >>>>> Not quite. Decoding is separate. > > >>>>> > > >>>>> Currently, %5C always gets decoded to \. allowBackslash then > > >>>>> determines > > >>>>> whether any \ (originally encoded or not) characters in the URL > > >>>>> will be > > >>>>> permitted and converted to / or the URL rejected. > > >>>>> > > >>>>> The decoding and conversion are partially independent. For example, > > >>>>> while it may be required that an encoded sequence %5C passed > > >>>>> through to > > >>>>> the application, whether an unencoded \ is rejected or converted > > >>>>> to / is > > >>>>> an independent choice. > > >>>>> > > >>>>> From a security point of view, it is necessary to always convert > > >>>>> \ to > > >>> / > > >>>>> - at least on Windows anyway - else all sorts of nasty security > > things > > >>>>> could happen such as bypassing security constraints. > > >>>>> > > >>>>> We should be able to fix this for the next round of releases (early > > >>>>> Feb). I am currently looking at whether a new option to effectively > > >>>>> replace allowBackslash is sufficient or whether an new option is > > >>>>> needed > > >>>>> in addition to allowBackslash. > > >>>>> > > >>>>> My current thinking is that we should add > > encodeReverseSolidusHandling > > >>>>> with the same options as encodeSolidusHandling and leave the > > >>>>> convert/reject decision for unencoded \ to allowBackslash. > > >>>>> > > >>>>>> IMO, an ideal implementation would introduce > > >>> encodeReverseSolidusHandling > > >>>>>> to behave the same as encodeSolidusHandling, but maybe add a > > >>>>>> DECODE_AS_SOLIDUS. Then the allowBackslash setter would set the > > >>>>>> encodeReverseSolidusHandling variable to DECODE_AS_SOLIDUS or > REJECT > > >>>>> (true > > >>>>>> vs false) for backward compatibility, but should probably be > > >>> deprecated. > > >>>>>> The code that normalizes the URL can remove any special handling > of > > >>> the \ > > >>>>>> character because all enforcement can be applied within UDecoder > > >>>>>> as is > > >>>>> done > > >>>>>> with the / character today. > > >>>>> > > >>>>> I don't think DECODE_AS_SOLIDUS is necessary although it could > > >>>>> always be > > >>>>> added as an additional option later if necessary. > > >>>>> > > >>>>>> This would make the handling of \ more consistent with /, and I > > >>>>>> imagine > > >>>>> the > > >>>>>> security concerns between the two remain about equal. Though, like > > >>>>>> you > > >>>>>> said, others on the team have already thought deeply about this, > so > > I > > >>> may > > >>>>>> be missing something important. > > >>>>>> > > >>>>>> If I were to pursue getting a change like this made to Tomcat, > where > > >>>>> should > > >>>>>> I start? > > >>>>> > > >>>>> This is the way to start the process. A discussion on the users > list > > >>>>> that clearly explains the problem and what a solution might look > > like. > > >>>>> > > >>>>>> I know this would be a slow process. > > >>>>> > > >>>>> There is a good reason for the feature, the implementation looks to > > be > > >>>>> backwards compatible and not too invasive, there isn't simple > > >>>>> extension > > >>>>> point you can use to implement and one or more committers are > > >>>>> interested > > >>>>> in solving the problem then there is a good chance it will be in > the > > >>>>> next round of releases (early Feb). > > >>>>> > > >>>>>> Or, if I'm better off working around the core functionality, would > > >>>>>> you > > >>>>> have > > >>>>>> any suggestions on how? I see the UDecoder recently changed to > > >>>>>> support > > >>>>>> encoded % characters. I considered using a double encoded \ hack > to > > >>>>>> effectively pass through, but that won't work now. Plus, I wasn't > > >>>>>> able > > >>> to > > >>>>>> figure out how to add custom logic before the URL decode and > > >>>>>> normalize > > >>> to > > >>>>>> add my work around encoding on the server side, as doing so in the > > >>> client > > >>>>>> side isn't feasible. > > >>>>>> > > >>>>>> Ideally, I wouldn't need to maintain a custom build of Tomcat > > >>>>> indefinitely. > > >>>>> > > >>>>> There isn't an easy (or any) extension point to implement this. It > > >>>>> would > > >>>>> have to be a custom Tomcat build. > > >>>>> > > >>>>> Are you able to test some snapshot builds if this gets implemented? > > >>>>> > > >>>>> Mark > > >>>>> > > >>>>>> > > >>>>>> Thanks, James > > >>>>>> > > >>>>>> On Fri, Jan 17, 2025, 10:00 AM Christopher Schultz < > > >>>>>> ch...@christopherschultz.net> wrote: > > >>>>>> > > >>>>>>> James, > > >>>>>>> > > >>>>>>> On 1/17/25 8:04 AM, James Matlik wrote: > > >>>>>>>> When I'm talking about path parameters, it is in the context of > > how > > >>>>> Open > > >>>>>>>> API/Swagger defined them: > > >>>>>>>> > https://swagger.io/docs/specification/v3_0/describing-parameters/ > > >>>>>>> > > >>>>>>> Okay, that helps clear things up. In the URL specification > > >>>>>>> (inherited > > >>> by > > >>>>>>> HTTP) defines them differently: > > >>>>>>> > > >>>>>>> https://www.rfc-editor.org/rfc/rfc3986.html#section-3.3 > > >>>>>>> > > >>>>>>> The term "path parameter" you are using is actually a "path > > segment" > > >>> in > > >>>>>>> URL/HTTP terms. > > >>>>>>> > > >>>>>>> The problem you are running into is that encoded slashes are > often > > >>> used > > >>>>>>> to attack servers by performing path-traversals or avoiding > > security > > >>>>>>> constraints. Let's say you are on Windows and there is a security > > >>>>>>> constraint that says "only admin users can access files in > > >>>>>>> D:\docroot\(ON\QC)". You have mapped URL-/ to > physical-D:\docroot. > > >>>>>>> (Admittedly this is very contrived). > > >>>>>>> > > >>>>>>> The attacker says "well, %29 is a backslash, so lemmie just check > > to > > >>> see > > >>>>>>> if I can slip one past the security system" and requests > > >>>>>>> http://localhost:8080/group/%28ON%5CQC%29%20LOCAL. If the server > > is > > >>>>>>> fooled, the attacker gets to request that resource in that > > >>>>>>> directory. > > >>>>>>> > > >>>>>>> My guess is that Tomcat's "URL cleaner" is seeing that \, > > attempting > > >>> to > > >>>>>>> "sanitize" it, and converting it to a / because that's the local > > >>>>>>> filesystem path separator. This makes security constraints much > > more > > >>>>>>> difficult to bypass. > > >>>>>>> > > >>>>>>> My naïve reading of your desire to have > > >>>>>>> encodeSolidusHandling=PASS_THROUGH behave the same for / and \ > > seems > > >>>>>>> reasonable if a misnomer. I say "naïve" because others on this > team > > >>> have > > >>>>>>> spent countless hours of their lives mulling these things over > and > > >>>>>>> getting the logic "just right". > > >>>>>>> > > >>>>>>> I say it's a misnomer because "solidus" means literally / and not > > \. > > >>>>>>> encodeReverseSolidusHandling? encodeAntiSolidusHandling? > > >>>>>>> > > >>>>>>> The encodeSolidusHandling configuration setting is there so we > can > > >>> have > > >>>>>>> a very secure default-configuration that prevents attacks on > naive > > >>>>>>> applications. In your case, you have an application that > > >>>>>>> (presumably) > > >>>>>>> knows exactly what it is doing, and therefore you have the option > > to > > >>>>>>> pass-through those solidus...es. Solidii? Whatever. > > >>>>>>> > > >>>>>>> I think that actually having a separate configuration setting for > > >>>>>>> antisolidii handling makes more sense than having that existing > > >>> setting > > >>>>>>> perform double-duty, since each one of those you allow to pass- > > >>>>>>> through > > >>>>>>> opens the door slightly wider to an attacker. > > >>>>>>> > > >>>>>>> -chris > > >>>>>>> > > >>>>>>> > > >>>>>>> > > --------------------------------------------------------------------- > > >>>>>>> To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org > > >>>>>>> For additional commands, e-mail: users-h...@tomcat.apache.org > > >>>>>>> > > >>>>>>> > > >>>>>> > > >>>>> > > >>>>> > > >>>>> > --------------------------------------------------------------------- > > >>>>> To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org > > >>>>> For additional commands, e-mail: users-h...@tomcat.apache.org > > >>>>> > > >>>>> > > >>>> > > >>> > > >>> > > >>> --------------------------------------------------------------------- > > >>> To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org > > >>> For additional commands, e-mail: users-h...@tomcat.apache.org > > >>> > > >>> > > >> > > > > > > > > > --------------------------------------------------------------------- > > > To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org > > > For additional commands, e-mail: users-h...@tomcat.apache.org > > > > > > > > > --------------------------------------------------------------------- > > To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org > > For additional commands, e-mail: users-h...@tomcat.apache.org > > > > >