Chris, from mobile (sorry for typos ;)
On Sat, Jan 25, 2025, 05:35 Christopher Schultz < ch...@christopherschultz.net> wrote: > Maxim, > > On 1/23/25 9:31 PM, Maxim Solodovnik wrote: > > from mobile (sorry for typos ;) > > > > > > On Thu, Jan 23, 2025, 23:00 James Matlik <james.mat...@gmail.com> wrote: > > > >> It works! > >> > >> I've been able to test with a customer name of: ÀËÌÑàëíñøü / \ Ѐӿ 中さ 😀 > >> customer > >> This covers > >> - Latin-1 Supplement characters are 1 byte: ÀËÌÑàëíñøü > >> - The / and \ slash characters are ASCII, but are encoded due to > special > >> meaning in a URL > >> - Cyrillic characters are 2 bytes: Ѐӿ > >> > > > > These two are not Cyrillic :)) > > Unicode says the lowercase version of Ѐ[1] is Cyrillic[2], and we are > ignorant westerners :) > > Also kha seems definitely to be Cyrillic[3]. > My bad :) These two are not in russian, ukrainian or bulgarian > > These are: Вау :)) > > LOL > More extraordinary are: ЩЖЦЪ I guess :) > -chris > > [1] https://www.compart.com/en/unicode/U+00C8 > [2] https://www.compart.com/en/unicode/U+0450 > [3] https://www.compart.com/en/unicode/U+04FF > > > > > - Chinese and Japanese characters are 3 bytes: 中さ > >> - emoji characters are 4 bytes: 😀 > >> > >> A client can request a URL path with the following: > >> > >> > "/customers/customer/%C3%80%C3%8B%C3%8C%C3%91%C3%A0%C3%AB%C3%AD%C3%B1%C3%B8%C3%BC%20%2F%20%5C%20%D0%80%D3%BF%20%E4%B8%AD%E3%81%95%20%F0%9F%98%80%20customer" > >> > >> Then Tomcat processes the URL to the following and passes it into the > >> servlet. > >> "/customers/customer/ÀËÌÑàëíñøü %2F %5C Ѐӿ 中さ 😀 customer" > >> > >> So we are now able to save this test customer to the database and read > it > >> back out again. I tried this with a few other paths that are a little > >> structurally different (with/without query string, escaped string in a > >> middle path segment instead of just the end) and it all worked as > desired. > >> > >> Thank you so much for working on this! > >> -James > >> > >> On Thu, Jan 23, 2025 at 8:20 AM Mark Thomas <ma...@apache.org> wrote: > >> > >>> James, > >>> > >>> I've added attributes (encodedReverseSolidusHandling and > >>> encodedSolidusHandling) to the Context to provide control of how the > >>> RequestDispatcher paths are processed. > >>> > >>> Snapshots built after 12.00 UTC today should include the new > attributes. > >>> > >>> As I type, Tomcat 12 is available, Tomcat 11 is building and Tomcat 10 > >>> and Tomcat 9 are in the queue. All should be complete in a couple of > >> hours. > >>> > >>> Mark > >>> > >>> > >>> On 22/01/2025 09:30, Mark Thomas wrote: > >>>> This is going to be fun. > >>>> > >>>> The RequestDispatcher processing currently does not take account of > >>>> encodedSolidusHandling or encodedReverseSolidusHandling. > >>>> RequestDsipatcher URLs are processed using the defaults. > >>>> > >>>> I have a test case that demonstrates various forms of this problem. I > >> am > >>>> currently working on a fix. > >>>> > >>>> Mark > >>>> > >>>> > >>>> On 21/01/2025 23:10, James Matlik wrote: > >>>>> It got further, but ran into another security check within the > >>>>> org.apache.catalina.core.ApplicationContext. I'm using the 10.1.35- > >>>>> SNAPSHOT > >>>>> version as that is closest to my current version, and easiest to test > >>>>> as a > >>>>> result. This is the block of code having trouble now (linking into > >>>>> 10.1.34): > >>>>> https://github.com/apache/tomcat/blob/10.1.34/java/org/apache/ > >>>>> catalina/core/ApplicationContext.java#L386-L396 > >>>>> > >>>>> This code: > >>>>> 1. takes the URL (e.g. /customers/customer/A%2FB%5CC) > >>>>> 2. decodes it (e.g. /customers/customer/A/B\C) > >>>>> 3. normalizes the decoded url with the replaceBackSlash parameter > >>>>> hardcoded > >>>>> to true, thus forcing the decoded '\' character to become a '/' > >>> character > >>>>> (e.g. /customers/customer/A/B/C) > >>>>> 4. then it compares the decoded URL to the normalized URL, and if > they > >>>>> are > >>>>> not equal, an IllegalArgumentException is created and a 500 response > >> is > >>>>> returned. > >>>>> > >>>>> It seems that this area of the code would need to be aware of the > >>>>> encodedReverseSolidusHandling configuration as well. > >>>>> > >>>>> Thank you, > >>>>> James > >>>>> > >>>>> On Tue, Jan 21, 2025 at 1:20 PM Mark Thomas <ma...@apache.org> > wrote: > >>>>> > >>>>>> On 21/01/2025 14:15, James Matlik wrote: > >>>>>>> Hello Mark, > >>>>>>> > >>>>>>> Yes, I would be available to test a snapshot if this gets > >> implemented. > >>>>>> > >>>>>> Latest (as I type this) Tomcat 12 build with the new feature: > >>>>>> > >>>>>> https://repository.apache.org/content/groups/snapshots/org/apache/ > >>>>>> tomcat/tomcat/12.0.0-M1-SNAPSHOT/tomcat-12.0.0- > >>>>>> M1-20250121.171115-315.tar.gz > >>>>>> > >>>>>> The snapshots for the 11.0.x, 10.1.x and 9.0.x are still building. > >>>>>> Generally, you'll find them under: > >>>>>> > >>>>>> > >>>>>> https://repository.apache.org/content/groups/snapshots/org/apache/ > >>>>>> tomcat/tomcat/ > >>>>>> > >>>>>> Anything built after 1700 UTC today should have the fix. Make sure > >> you > >>>>>> look for the snapshot for the current dev version for each release > >>>>>> branch. > >>>>>> > >>>>>> The Connector attribute is called encodedReverseSolidusHandling > >>>>>> > >>>>>> Let the list know how you get on. > >>>>>> > >>>>>> Mark > >>>>>> > >>>>>> > >>>>>>> > >>>>>>> -James > >>>>>>> > >>>>>>> On Tue, Jan 21, 2025 at 8:17 AM Mark Thomas <ma...@apache.org> > >> wrote: > >>>>>>> > >>>>>>>> On 18/01/2025 16:18, James Matlik wrote: > >>>>>>>>> I agree with everything you have said. As the config options > stand > >>>>>> today, > >>>>>>>>> the allowBackslash seems to implement part of > >> encodeSolidusHandling. > >>>>>>>>> > >>>>>>>>> While encodeSolidusHandling supports: > >>>>>>>>> * REJECT - Return 400 on encoded / > >>>>>>>>> * DECODE - Decodes the / > >>>>>>>>> * PASS_THROUGH - Passes the encoded / as is > >>>>>>>>> > >>>>>>>>> The allowBackslash supports two mostly analogous settings: > >>>>>>>>> * false - REJECT > >>>>>>>>> * true - DECODE, but it Decodes as / instead of \ > >>>>>>>> > >>>>>>>> Not quite. Decoding is separate. > >>>>>>>> > >>>>>>>> Currently, %5C always gets decoded to \. allowBackslash then > >>>>>>>> determines > >>>>>>>> whether any \ (originally encoded or not) characters in the URL > >>>>>>>> will be > >>>>>>>> permitted and converted to / or the URL rejected. > >>>>>>>> > >>>>>>>> The decoding and conversion are partially independent. For > example, > >>>>>>>> while it may be required that an encoded sequence %5C passed > >>>>>>>> through to > >>>>>>>> the application, whether an unencoded \ is rejected or converted > >>>>>>>> to / is > >>>>>>>> an independent choice. > >>>>>>>> > >>>>>>>> From a security point of view, it is necessary to always > convert > >>>>>>>> \ to > >>>>>> / > >>>>>>>> - at least on Windows anyway - else all sorts of nasty security > >>> things > >>>>>>>> could happen such as bypassing security constraints. > >>>>>>>> > >>>>>>>> We should be able to fix this for the next round of releases > (early > >>>>>>>> Feb). I am currently looking at whether a new option to > effectively > >>>>>>>> replace allowBackslash is sufficient or whether an new option is > >>>>>>>> needed > >>>>>>>> in addition to allowBackslash. > >>>>>>>> > >>>>>>>> My current thinking is that we should add > >>> encodeReverseSolidusHandling > >>>>>>>> with the same options as encodeSolidusHandling and leave the > >>>>>>>> convert/reject decision for unencoded \ to allowBackslash. > >>>>>>>> > >>>>>>>>> IMO, an ideal implementation would introduce > >>>>>> encodeReverseSolidusHandling > >>>>>>>>> to behave the same as encodeSolidusHandling, but maybe add a > >>>>>>>>> DECODE_AS_SOLIDUS. Then the allowBackslash setter would set the > >>>>>>>>> encodeReverseSolidusHandling variable to DECODE_AS_SOLIDUS or > >> REJECT > >>>>>>>> (true > >>>>>>>>> vs false) for backward compatibility, but should probably be > >>>>>> deprecated. > >>>>>>>>> The code that normalizes the URL can remove any special handling > >> of > >>>>>> the \ > >>>>>>>>> character because all enforcement can be applied within UDecoder > >>>>>>>>> as is > >>>>>>>> done > >>>>>>>>> with the / character today. > >>>>>>>> > >>>>>>>> I don't think DECODE_AS_SOLIDUS is necessary although it could > >>>>>>>> always be > >>>>>>>> added as an additional option later if necessary. > >>>>>>>> > >>>>>>>>> This would make the handling of \ more consistent with /, and I > >>>>>>>>> imagine > >>>>>>>> the > >>>>>>>>> security concerns between the two remain about equal. Though, > like > >>>>>>>>> you > >>>>>>>>> said, others on the team have already thought deeply about this, > >> so > >>> I > >>>>>> may > >>>>>>>>> be missing something important. > >>>>>>>>> > >>>>>>>>> If I were to pursue getting a change like this made to Tomcat, > >> where > >>>>>>>> should > >>>>>>>>> I start? > >>>>>>>> > >>>>>>>> This is the way to start the process. A discussion on the users > >> list > >>>>>>>> that clearly explains the problem and what a solution might look > >>> like. > >>>>>>>> > >>>>>>>>> I know this would be a slow process. > >>>>>>>> > >>>>>>>> There is a good reason for the feature, the implementation looks > to > >>> be > >>>>>>>> backwards compatible and not too invasive, there isn't simple > >>>>>>>> extension > >>>>>>>> point you can use to implement and one or more committers are > >>>>>>>> interested > >>>>>>>> in solving the problem then there is a good chance it will be in > >> the > >>>>>>>> next round of releases (early Feb). > >>>>>>>> > >>>>>>>>> Or, if I'm better off working around the core functionality, > would > >>>>>>>>> you > >>>>>>>> have > >>>>>>>>> any suggestions on how? I see the UDecoder recently changed to > >>>>>>>>> support > >>>>>>>>> encoded % characters. I considered using a double encoded \ hack > >> to > >>>>>>>>> effectively pass through, but that won't work now. Plus, I wasn't > >>>>>>>>> able > >>>>>> to > >>>>>>>>> figure out how to add custom logic before the URL decode and > >>>>>>>>> normalize > >>>>>> to > >>>>>>>>> add my work around encoding on the server side, as doing so in > the > >>>>>> client > >>>>>>>>> side isn't feasible. > >>>>>>>>> > >>>>>>>>> Ideally, I wouldn't need to maintain a custom build of Tomcat > >>>>>>>> indefinitely. > >>>>>>>> > >>>>>>>> There isn't an easy (or any) extension point to implement this. It > >>>>>>>> would > >>>>>>>> have to be a custom Tomcat build. > >>>>>>>> > >>>>>>>> Are you able to test some snapshot builds if this gets > implemented? > >>>>>>>> > >>>>>>>> Mark > >>>>>>>> > >>>>>>>>> > >>>>>>>>> Thanks, James > >>>>>>>>> > >>>>>>>>> On Fri, Jan 17, 2025, 10:00 AM Christopher Schultz < > >>>>>>>>> ch...@christopherschultz.net> wrote: > >>>>>>>>> > >>>>>>>>>> James, > >>>>>>>>>> > >>>>>>>>>> On 1/17/25 8:04 AM, James Matlik wrote: > >>>>>>>>>>> When I'm talking about path parameters, it is in the context of > >>> how > >>>>>>>> Open > >>>>>>>>>>> API/Swagger defined them: > >>>>>>>>>>> > >> https://swagger.io/docs/specification/v3_0/describing-parameters/ > >>>>>>>>>> > >>>>>>>>>> Okay, that helps clear things up. In the URL specification > >>>>>>>>>> (inherited > >>>>>> by > >>>>>>>>>> HTTP) defines them differently: > >>>>>>>>>> > >>>>>>>>>> https://www.rfc-editor.org/rfc/rfc3986.html#section-3.3 > >>>>>>>>>> > >>>>>>>>>> The term "path parameter" you are using is actually a "path > >>> segment" > >>>>>> in > >>>>>>>>>> URL/HTTP terms. > >>>>>>>>>> > >>>>>>>>>> The problem you are running into is that encoded slashes are > >> often > >>>>>> used > >>>>>>>>>> to attack servers by performing path-traversals or avoiding > >>> security > >>>>>>>>>> constraints. Let's say you are on Windows and there is a > security > >>>>>>>>>> constraint that says "only admin users can access files in > >>>>>>>>>> D:\docroot\(ON\QC)". You have mapped URL-/ to > >> physical-D:\docroot. > >>>>>>>>>> (Admittedly this is very contrived). > >>>>>>>>>> > >>>>>>>>>> The attacker says "well, %29 is a backslash, so lemmie just > check > >>> to > >>>>>> see > >>>>>>>>>> if I can slip one past the security system" and requests > >>>>>>>>>> http://localhost:8080/group/%28ON%5CQC%29%20LOCAL. If the > server > >>> is > >>>>>>>>>> fooled, the attacker gets to request that resource in that > >>>>>>>>>> directory. > >>>>>>>>>> > >>>>>>>>>> My guess is that Tomcat's "URL cleaner" is seeing that \, > >>> attempting > >>>>>> to > >>>>>>>>>> "sanitize" it, and converting it to a / because that's the local > >>>>>>>>>> filesystem path separator. This makes security constraints much > >>> more > >>>>>>>>>> difficult to bypass. > >>>>>>>>>> > >>>>>>>>>> My naïve reading of your desire to have > >>>>>>>>>> encodeSolidusHandling=PASS_THROUGH behave the same for / and \ > >>> seems > >>>>>>>>>> reasonable if a misnomer. I say "naïve" because others on this > >> team > >>>>>> have > >>>>>>>>>> spent countless hours of their lives mulling these things over > >> and > >>>>>>>>>> getting the logic "just right". > >>>>>>>>>> > >>>>>>>>>> I say it's a misnomer because "solidus" means literally / and > not > >>> \. > >>>>>>>>>> encodeReverseSolidusHandling? encodeAntiSolidusHandling? > >>>>>>>>>> > >>>>>>>>>> The encodeSolidusHandling configuration setting is there so we > >> can > >>>>>> have > >>>>>>>>>> a very secure default-configuration that prevents attacks on > >> naive > >>>>>>>>>> applications. In your case, you have an application that > >>>>>>>>>> (presumably) > >>>>>>>>>> knows exactly what it is doing, and therefore you have the > option > >>> to > >>>>>>>>>> pass-through those solidus...es. Solidii? Whatever. > >>>>>>>>>> > >>>>>>>>>> I think that actually having a separate configuration setting > for > >>>>>>>>>> antisolidii handling makes more sense than having that existing > >>>>>> setting > >>>>>>>>>> perform double-duty, since each one of those you allow to pass- > >>>>>>>>>> through > >>>>>>>>>> opens the door slightly wider to an attacker. > >>>>>>>>>> > >>>>>>>>>> -chris > >>>>>>>>>> > >>>>>>>>>> > >>>>>>>>>> > >>> --------------------------------------------------------------------- > >>>>>>>>>> To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org > >>>>>>>>>> For additional commands, e-mail: users-h...@tomcat.apache.org > >>>>>>>>>> > >>>>>>>>>> > >>>>>>>>> > >>>>>>>> > >>>>>>>> > >>>>>>>> > >> --------------------------------------------------------------------- > >>>>>>>> To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org > >>>>>>>> For additional commands, e-mail: users-h...@tomcat.apache.org > >>>>>>>> > >>>>>>>> > >>>>>>> > >>>>>> > >>>>>> > >>>>>> > --------------------------------------------------------------------- > >>>>>> To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org > >>>>>> For additional commands, e-mail: users-h...@tomcat.apache.org > >>>>>> > >>>>>> > >>>>> > >>>> > >>>> > >>>> --------------------------------------------------------------------- > >>>> To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org > >>>> For additional commands, e-mail: users-h...@tomcat.apache.org > >>>> > >>> > >>> > >>> --------------------------------------------------------------------- > >>> To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org > >>> For additional commands, e-mail: users-h...@tomcat.apache.org > >>> > >>> > >> > > > > > --------------------------------------------------------------------- > To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org > For additional commands, e-mail: users-h...@tomcat.apache.org > >