Chris,

from mobile (sorry for typos ;)


On Sat, Jan 25, 2025, 05:35 Christopher Schultz <
ch...@christopherschultz.net> wrote:

> Maxim,
>
> On 1/23/25 9:31 PM, Maxim Solodovnik wrote:
> > from mobile (sorry for typos ;)
> >
> >
> > On Thu, Jan 23, 2025, 23:00 James Matlik <james.mat...@gmail.com> wrote:
> >
> >> It works!
> >>
> >> I've been able to test with a customer name of: ÀËÌÑàëíñøü / \ Ѐӿ 中さ 😀
> >> customer
> >> This covers
> >>   - Latin-1 Supplement characters are 1 byte: ÀËÌÑàëíñøü
> >>   - The / and \ slash characters are ASCII, but are encoded due to
> special
> >> meaning in a URL
> >>   - Cyrillic characters are 2 bytes: Ѐӿ
> >>
> >
> > These two are not Cyrillic :))
>
> Unicode says the lowercase version of Ѐ[1] is Cyrillic[2], and we are
> ignorant westerners :)
>
> Also kha seems definitely to be Cyrillic[3].
>

My bad :)
These two are not in russian, ukrainian or bulgarian


>  > These are:  Вау :))
>
> LOL
>

More extraordinary are: ЩЖЦЪ
I guess :)


> -chris
>
> [1] https://www.compart.com/en/unicode/U+00C8
> [2] https://www.compart.com/en/unicode/U+0450
> [3] https://www.compart.com/en/unicode/U+04FF
>
> >
> >   - Chinese and Japanese characters are 3 bytes: 中さ
> >>   - emoji characters are 4 bytes: 😀
> >>
> >> A client can request a URL path with the following:
> >>
> >>
> "/customers/customer/%C3%80%C3%8B%C3%8C%C3%91%C3%A0%C3%AB%C3%AD%C3%B1%C3%B8%C3%BC%20%2F%20%5C%20%D0%80%D3%BF%20%E4%B8%AD%E3%81%95%20%F0%9F%98%80%20customer"
> >>
> >> Then Tomcat processes the URL to the following and passes it into the
> >> servlet.
> >> "/customers/customer/ÀËÌÑàëíñøü %2F %5C Ѐӿ 中さ 😀 customer"
> >>
> >> So we are now able to save this test customer to the database and read
> it
> >> back out again. I tried this with a few other paths that are a little
> >> structurally different (with/without query string, escaped string in a
> >> middle path segment instead of just the end) and it all worked as
> desired.
> >>
> >> Thank you so much for working on this!
> >> -James
> >>
> >> On Thu, Jan 23, 2025 at 8:20 AM Mark Thomas <ma...@apache.org> wrote:
> >>
> >>> James,
> >>>
> >>> I've added attributes (encodedReverseSolidusHandling and
> >>> encodedSolidusHandling) to the Context to provide control of how the
> >>> RequestDispatcher paths are processed.
> >>>
> >>> Snapshots built after 12.00 UTC today should include the new
> attributes.
> >>>
> >>> As I type, Tomcat 12 is available, Tomcat 11 is building and Tomcat 10
> >>> and Tomcat 9 are in the queue. All should be complete in a couple of
> >> hours.
> >>>
> >>> Mark
> >>>
> >>>
> >>> On 22/01/2025 09:30, Mark Thomas wrote:
> >>>> This is going to be fun.
> >>>>
> >>>> The RequestDispatcher processing currently does not take account of
> >>>> encodedSolidusHandling or encodedReverseSolidusHandling.
> >>>> RequestDsipatcher URLs are processed using the defaults.
> >>>>
> >>>> I have a test case that demonstrates various forms of this problem. I
> >> am
> >>>> currently working on a fix.
> >>>>
> >>>> Mark
> >>>>
> >>>>
> >>>> On 21/01/2025 23:10, James Matlik wrote:
> >>>>> It got further, but ran into another security check within the
> >>>>> org.apache.catalina.core.ApplicationContext. I'm using the 10.1.35-
> >>>>> SNAPSHOT
> >>>>> version as that is closest to my current version, and easiest to test
> >>>>> as a
> >>>>> result. This is the block of code having trouble now (linking into
> >>>>> 10.1.34):
> >>>>> https://github.com/apache/tomcat/blob/10.1.34/java/org/apache/
> >>>>> catalina/core/ApplicationContext.java#L386-L396
> >>>>>
> >>>>> This code:
> >>>>> 1. takes the URL (e.g. /customers/customer/A%2FB%5CC)
> >>>>> 2. decodes it (e.g. /customers/customer/A/B\C)
> >>>>> 3. normalizes the decoded url with the replaceBackSlash parameter
> >>>>> hardcoded
> >>>>> to true, thus forcing the decoded '\' character to become a '/'
> >>> character
> >>>>> (e.g. /customers/customer/A/B/C)
> >>>>> 4. then it compares the decoded URL to the normalized URL, and if
> they
> >>>>> are
> >>>>> not equal, an IllegalArgumentException is created and a 500 response
> >> is
> >>>>> returned.
> >>>>>
> >>>>> It seems that this area of the code would need to be aware of the
> >>>>> encodedReverseSolidusHandling configuration as well.
> >>>>>
> >>>>> Thank you,
> >>>>> James
> >>>>>
> >>>>> On Tue, Jan 21, 2025 at 1:20 PM Mark Thomas <ma...@apache.org>
> wrote:
> >>>>>
> >>>>>> On 21/01/2025 14:15, James Matlik wrote:
> >>>>>>> Hello Mark,
> >>>>>>>
> >>>>>>> Yes, I would be available to test a snapshot if this gets
> >> implemented.
> >>>>>>
> >>>>>> Latest (as I type this) Tomcat 12 build with the new feature:
> >>>>>>
> >>>>>> https://repository.apache.org/content/groups/snapshots/org/apache/
> >>>>>> tomcat/tomcat/12.0.0-M1-SNAPSHOT/tomcat-12.0.0-
> >>>>>> M1-20250121.171115-315.tar.gz
> >>>>>>
> >>>>>> The snapshots for the 11.0.x, 10.1.x and 9.0.x are still building.
> >>>>>> Generally, you'll find them under:
> >>>>>>
> >>>>>>
> >>>>>> https://repository.apache.org/content/groups/snapshots/org/apache/
> >>>>>> tomcat/tomcat/
> >>>>>>
> >>>>>> Anything built after 1700 UTC today should have the fix. Make sure
> >> you
> >>>>>> look for the snapshot for the current dev version for each release
> >>>>>> branch.
> >>>>>>
> >>>>>> The Connector attribute is called encodedReverseSolidusHandling
> >>>>>>
> >>>>>> Let the list know how you get on.
> >>>>>>
> >>>>>> Mark
> >>>>>>
> >>>>>>
> >>>>>>>
> >>>>>>> -James
> >>>>>>>
> >>>>>>> On Tue, Jan 21, 2025 at 8:17 AM Mark Thomas <ma...@apache.org>
> >> wrote:
> >>>>>>>
> >>>>>>>> On 18/01/2025 16:18, James Matlik wrote:
> >>>>>>>>> I agree with everything you have said. As the config options
> stand
> >>>>>> today,
> >>>>>>>>> the allowBackslash seems to implement part of
> >> encodeSolidusHandling.
> >>>>>>>>>
> >>>>>>>>> While encodeSolidusHandling supports:
> >>>>>>>>> * REJECT - Return 400 on encoded /
> >>>>>>>>> * DECODE - Decodes the /
> >>>>>>>>> * PASS_THROUGH - Passes the encoded / as is
> >>>>>>>>>
> >>>>>>>>> The allowBackslash supports two mostly analogous settings:
> >>>>>>>>> * false - REJECT
> >>>>>>>>> * true - DECODE, but it Decodes as / instead of \
> >>>>>>>>
> >>>>>>>> Not quite. Decoding is separate.
> >>>>>>>>
> >>>>>>>> Currently, %5C always gets decoded to \. allowBackslash then
> >>>>>>>> determines
> >>>>>>>> whether any \ (originally encoded or not) characters in the URL
> >>>>>>>> will be
> >>>>>>>> permitted and converted to / or the URL rejected.
> >>>>>>>>
> >>>>>>>> The decoding and conversion are partially independent. For
> example,
> >>>>>>>> while it may be required that an encoded sequence %5C passed
> >>>>>>>> through to
> >>>>>>>> the application, whether an unencoded \ is rejected or converted
> >>>>>>>> to / is
> >>>>>>>> an independent choice.
> >>>>>>>>
> >>>>>>>>     From a security point of view, it is necessary to always
> convert
> >>>>>>>> \ to
> >>>>>> /
> >>>>>>>> - at least on Windows anyway - else all sorts of nasty security
> >>> things
> >>>>>>>> could happen such as bypassing security constraints.
> >>>>>>>>
> >>>>>>>> We should be able to fix this for the next round of releases
> (early
> >>>>>>>> Feb). I am currently looking at whether a new option to
> effectively
> >>>>>>>> replace allowBackslash is sufficient or whether an new option is
> >>>>>>>> needed
> >>>>>>>> in addition to allowBackslash.
> >>>>>>>>
> >>>>>>>> My current thinking is that we should add
> >>> encodeReverseSolidusHandling
> >>>>>>>> with the same options as encodeSolidusHandling and leave the
> >>>>>>>> convert/reject decision for unencoded \ to allowBackslash.
> >>>>>>>>
> >>>>>>>>> IMO, an ideal implementation would introduce
> >>>>>> encodeReverseSolidusHandling
> >>>>>>>>> to behave the same as encodeSolidusHandling, but maybe add a
> >>>>>>>>> DECODE_AS_SOLIDUS. Then the allowBackslash setter would set the
> >>>>>>>>> encodeReverseSolidusHandling variable to DECODE_AS_SOLIDUS or
> >> REJECT
> >>>>>>>> (true
> >>>>>>>>> vs false) for backward compatibility, but should probably be
> >>>>>> deprecated.
> >>>>>>>>> The code that normalizes the URL can remove any special handling
> >> of
> >>>>>> the \
> >>>>>>>>> character because all enforcement can be applied within UDecoder
> >>>>>>>>> as is
> >>>>>>>> done
> >>>>>>>>> with the / character today.
> >>>>>>>>
> >>>>>>>> I don't think DECODE_AS_SOLIDUS is necessary although it could
> >>>>>>>> always be
> >>>>>>>> added as an additional option later if necessary.
> >>>>>>>>
> >>>>>>>>> This would make the handling of \ more consistent with /, and I
> >>>>>>>>> imagine
> >>>>>>>> the
> >>>>>>>>> security concerns between the two remain about equal. Though,
> like
> >>>>>>>>> you
> >>>>>>>>> said, others on the team have already thought deeply about this,
> >> so
> >>> I
> >>>>>> may
> >>>>>>>>> be missing something important.
> >>>>>>>>>
> >>>>>>>>> If I were to pursue getting a change like this made to Tomcat,
> >> where
> >>>>>>>> should
> >>>>>>>>> I start?
> >>>>>>>>
> >>>>>>>> This is the way to start the process. A discussion on the users
> >> list
> >>>>>>>> that clearly explains the problem and what a solution might look
> >>> like.
> >>>>>>>>
> >>>>>>>>> I know this would be a slow process.
> >>>>>>>>
> >>>>>>>> There is a good reason for the feature, the implementation looks
> to
> >>> be
> >>>>>>>> backwards compatible and not too invasive, there isn't simple
> >>>>>>>> extension
> >>>>>>>> point you can use to implement and one or more committers are
> >>>>>>>> interested
> >>>>>>>> in solving the problem then there is a good chance it will be in
> >> the
> >>>>>>>> next round of releases (early Feb).
> >>>>>>>>
> >>>>>>>>> Or, if I'm better off working around the core functionality,
> would
> >>>>>>>>> you
> >>>>>>>> have
> >>>>>>>>> any suggestions on how? I see the UDecoder recently changed to
> >>>>>>>>> support
> >>>>>>>>> encoded % characters. I considered using a double encoded \ hack
> >> to
> >>>>>>>>> effectively pass through, but that won't work now. Plus, I wasn't
> >>>>>>>>> able
> >>>>>> to
> >>>>>>>>> figure out how to add custom logic before the URL decode and
> >>>>>>>>> normalize
> >>>>>> to
> >>>>>>>>> add my work around encoding on the server side, as doing so in
> the
> >>>>>> client
> >>>>>>>>> side isn't feasible.
> >>>>>>>>>
> >>>>>>>>> Ideally, I wouldn't need to maintain a custom build of Tomcat
> >>>>>>>> indefinitely.
> >>>>>>>>
> >>>>>>>> There isn't an easy (or any) extension point to implement this. It
> >>>>>>>> would
> >>>>>>>> have to be a custom Tomcat build.
> >>>>>>>>
> >>>>>>>> Are you able to test some snapshot builds if this gets
> implemented?
> >>>>>>>>
> >>>>>>>> Mark
> >>>>>>>>
> >>>>>>>>>
> >>>>>>>>> Thanks, James
> >>>>>>>>>
> >>>>>>>>> On Fri, Jan 17, 2025, 10:00 AM Christopher Schultz <
> >>>>>>>>> ch...@christopherschultz.net> wrote:
> >>>>>>>>>
> >>>>>>>>>> James,
> >>>>>>>>>>
> >>>>>>>>>> On 1/17/25 8:04 AM, James Matlik wrote:
> >>>>>>>>>>> When I'm talking about path parameters, it is in the context of
> >>> how
> >>>>>>>> Open
> >>>>>>>>>>> API/Swagger defined them:
> >>>>>>>>>>>
> >> https://swagger.io/docs/specification/v3_0/describing-parameters/
> >>>>>>>>>>
> >>>>>>>>>> Okay, that helps clear things up. In the URL specification
> >>>>>>>>>> (inherited
> >>>>>> by
> >>>>>>>>>> HTTP) defines them differently:
> >>>>>>>>>>
> >>>>>>>>>> https://www.rfc-editor.org/rfc/rfc3986.html#section-3.3
> >>>>>>>>>>
> >>>>>>>>>> The term "path parameter" you are using is actually a "path
> >>> segment"
> >>>>>> in
> >>>>>>>>>> URL/HTTP terms.
> >>>>>>>>>>
> >>>>>>>>>> The problem you are running into is that encoded slashes are
> >> often
> >>>>>> used
> >>>>>>>>>> to attack servers by performing path-traversals or avoiding
> >>> security
> >>>>>>>>>> constraints. Let's say you are on Windows and there is a
> security
> >>>>>>>>>> constraint that says "only admin users can access files in
> >>>>>>>>>> D:\docroot\(ON\QC)". You have mapped URL-/ to
> >> physical-D:\docroot.
> >>>>>>>>>> (Admittedly this is very contrived).
> >>>>>>>>>>
> >>>>>>>>>> The attacker says "well, %29 is a backslash, so lemmie just
> check
> >>> to
> >>>>>> see
> >>>>>>>>>> if I can slip one past the security system" and requests
> >>>>>>>>>> http://localhost:8080/group/%28ON%5CQC%29%20LOCAL. If the
> server
> >>> is
> >>>>>>>>>> fooled, the attacker gets to request that resource in that
> >>>>>>>>>> directory.
> >>>>>>>>>>
> >>>>>>>>>> My guess is that Tomcat's "URL cleaner" is seeing that \,
> >>> attempting
> >>>>>> to
> >>>>>>>>>> "sanitize" it, and converting it to a / because that's the local
> >>>>>>>>>> filesystem path separator. This makes security constraints much
> >>> more
> >>>>>>>>>> difficult to bypass.
> >>>>>>>>>>
> >>>>>>>>>> My naïve reading of your desire to have
> >>>>>>>>>> encodeSolidusHandling=PASS_THROUGH behave the same for / and \
> >>> seems
> >>>>>>>>>> reasonable if a misnomer. I say "naïve" because others on this
> >> team
> >>>>>> have
> >>>>>>>>>> spent countless hours of their lives mulling these things over
> >> and
> >>>>>>>>>> getting the logic "just right".
> >>>>>>>>>>
> >>>>>>>>>> I say it's a misnomer because "solidus" means literally / and
> not
> >>> \.
> >>>>>>>>>> encodeReverseSolidusHandling? encodeAntiSolidusHandling?
> >>>>>>>>>>
> >>>>>>>>>> The encodeSolidusHandling configuration setting is there so we
> >> can
> >>>>>> have
> >>>>>>>>>> a very secure default-configuration that prevents attacks on
> >> naive
> >>>>>>>>>> applications. In your case, you have an application that
> >>>>>>>>>> (presumably)
> >>>>>>>>>> knows exactly what it is doing, and therefore you have the
> option
> >>> to
> >>>>>>>>>> pass-through those solidus...es. Solidii? Whatever.
> >>>>>>>>>>
> >>>>>>>>>> I think that actually having a separate configuration setting
> for
> >>>>>>>>>> antisolidii handling makes more sense than having that existing
> >>>>>> setting
> >>>>>>>>>> perform double-duty, since each one of those you allow to pass-
> >>>>>>>>>> through
> >>>>>>>>>> opens the door slightly wider to an attacker.
> >>>>>>>>>>
> >>>>>>>>>> -chris
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>
> >>> ---------------------------------------------------------------------
> >>>>>>>>>> To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org
> >>>>>>>>>> For additional commands, e-mail: users-h...@tomcat.apache.org
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >> ---------------------------------------------------------------------
> >>>>>>>> To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org
> >>>>>>>> For additional commands, e-mail: users-h...@tomcat.apache.org
> >>>>>>>>
> >>>>>>>>
> >>>>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> ---------------------------------------------------------------------
> >>>>>> To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org
> >>>>>> For additional commands, e-mail: users-h...@tomcat.apache.org
> >>>>>>
> >>>>>>
> >>>>>
> >>>>
> >>>>
> >>>> ---------------------------------------------------------------------
> >>>> To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org
> >>>> For additional commands, e-mail: users-h...@tomcat.apache.org
> >>>>
> >>>
> >>>
> >>> ---------------------------------------------------------------------
> >>> To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org
> >>> For additional commands, e-mail: users-h...@tomcat.apache.org
> >>>
> >>>
> >>
> >
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org
> For additional commands, e-mail: users-h...@tomcat.apache.org
>
>

Reply via email to