Hi everyone, thank you in advance for reading.

I was trying to use EscapedPath() to do path-based routing in an HTTP 
server, where some requests need to be forwarded to another service. I 
found that an invalid character in some parts of the path of the request 
URI passed in the HTTP request line does not cause an error. Instead, it 
is silently accepted, so when I forward the request I end up forwarding 
something I wasn't expecting.

When the path contains only valid path characters, the EscapedPath() and 
String() methods work as I would have expected, and I can do things like 
path.Clean to remove "..", ".", and "/////" from the path. But if I put 
something like "Ñ" in the path, then EscapedPath() (and String(), as it 
calls EscapedPath()) ignores the encoding I sent and re-encodes the decoded 
path, so other %-encoded sequences in the original path end up being 
interpreted. So if I send an HTTP request like:

GET /x%2Fx HTTP/1.1
Host: example.com

Then EscapedPath() will return "/x%2Fx", which is to say a path with a single 
path element: "x/x". But if I send a request like:

GET /Ñ%2FÑ HTTP/1.1
Host: example.com

Then EscapedPath() will return "/%C3%91/%C3%91", which is to say a path 
with two path elements: "Ñ" and "Ñ", whereas I would expect a parsing error 
and no handler to be called. Note also that my "%2F" was decoded into a 
literal "/". When forwarding the request to another service, this could be 
a problem because user input is being interpreted, and the user could 
potentially traverse paths on the proxied service.

Here is a small Playground to illustrate: https://go.dev/play/p/ySJwVtvHHQF
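
For readers who cannot open the link, here is a minimal sketch of the same 
behavior. It may differ from the Playground above. It assumes that 
url.ParseRequestURI is a fair stand-in for how the server parses the request 
target, and "describe" is a hypothetical helper name, not part of net/url.

```go
package main

import (
	"fmt"
	"net/url"
)

// describe is a hypothetical helper: it parses target as a request URI and
// returns the three views of the path discussed above.
func describe(target string) (path, rawPath, escaped string, err error) {
	u, err := url.ParseRequestURI(target)
	if err != nil {
		return "", "", "", err
	}
	return u.Path, u.RawPath, u.EscapedPath(), nil
}

func main() {
	// The first target uses only valid path characters; the second contains
	// raw non-ASCII bytes ("Ñ") next to an encoded slash.
	for _, target := range []string{"/x%2Fx", "/Ñ%2FÑ"} {
		path, rawPath, escaped, err := describe(target)
		if err != nil {
			fmt.Printf("%q: parse error: %v\n", target, err)
			continue
		}
		fmt.Printf("target=%q Path=%q RawPath=%q EscapedPath=%q\n",
			target, path, rawPath, escaped)
	}
}
```

In both cases Path holds the decoded form and RawPath holds what was sent. 
Only EscapedPath() differs: it keeps "%2F" for the first target and turns it 
into a plain "/" for the second.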

I thought of a few workarounds on the language-user side:

   1. Manually write a validator that checks whether the path is valid and 
   returns an error otherwise. This is error-prone.
   2. Use "go:linkname ...validEncoded" instead of writing my own. 
   Definitely not in my plans.
   3. Set RawPath to the empty string, and accept that a "%2F" sent by the 
   user will be interpreted as a literal "/" separating path elements. As 
   long as every part of the program, and every system that acts on the 
   path, sees the same thing, this is probably the better option.
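
A rough sketch of what workarounds 1 and 3 could look like. All names here 
(validPath, rejectInvalidPath, defaultEscaped, dropRawPath) are hypothetical 
and not part of net/url or net/http. The character set is my reading of the 
"pchar" rule in RFC 3986 §3.3, so it is not necessarily identical to what 
net/url accepts internally.

```go
package main

import (
	"fmt"
	"net/http"
	"net/url"
	"strings"
)

func isAlnum(c byte) bool {
	return 'a' <= c && c <= 'z' || 'A' <= c && c <= 'Z' || '0' <= c && c <= '9'
}

func isHex(c byte) bool {
	return '0' <= c && c <= '9' || 'a' <= c && c <= 'f' || 'A' <= c && c <= 'F'
}

// validPath (workaround 1) reports whether raw contains only "/", the
// unreserved and sub-delims characters, ":", "@", and well-formed %XX triplets.
func validPath(raw string) bool {
	for i := 0; i < len(raw); i++ {
		c := raw[i]
		switch {
		case c == '%':
			// A percent sign must be followed by exactly two hex digits.
			if i+2 >= len(raw) || !isHex(raw[i+1]) || !isHex(raw[i+2]) {
				return false
			}
			i += 2
		case isAlnum(c), strings.IndexByte("/:@-._~!$&'()*+,;=", c) >= 0:
			// Allowed literally.
		default:
			return false // includes every non-ASCII byte, such as those of "Ñ"
		}
	}
	return true
}

// rejectInvalidPath answers 400 instead of calling the handler when the path
// as sent is not valid. RawPath is only set when the path as sent differs
// from the default encoding of Path, so an empty RawPath is let through.
func rejectInvalidPath(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.URL.RawPath != "" && !validPath(r.URL.RawPath) {
			http.Error(w, "invalid path encoding", http.StatusBadRequest)
			return
		}
		next.ServeHTTP(w, r)
	})
}

// defaultEscaped (workaround 3) parses target, clears RawPath, and returns
// EscapedPath(), which is then always the default encoding of Path.
func defaultEscaped(target string) (string, error) {
	u, err := url.ParseRequestURI(target)
	if err != nil {
		return "", err
	}
	u.RawPath = ""
	return u.EscapedPath(), nil
}

// dropRawPath applies the same idea to a copy of each incoming request, so
// every later handler sees only the decoded view of the path.
func dropRawPath(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		r2 := new(http.Request)
		*r2 = *r
		u := *r.URL
		u.RawPath = ""
		r2.URL = &u
		next.ServeHTTP(w, r2)
	})
}

func main() {
	for _, raw := range []string{"/x%2Fx", "/Ñ%2FÑ", "/bad%2"} {
		fmt.Printf("validPath(%q) = %v\n", raw, validPath(raw))
	}
	esc, err := defaultEscaped("/x%2Fx")
	fmt.Println(esc, err)
	_ = dropRawPath(rejectInvalidPath(http.NotFoundHandler()))
}
```

The two middlewares are alternatives rather than a pair: the first refuses 
the request outright, while the second accepts it and makes "%2F" and "/" 
indistinguishable for everything downstream.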


I read the code in Chi and Gin router libraries to see if others had the 
same problem, and both of them appear to have related issues reported, but 
so far it appears to be a contentious or otherwise unresolved subject. The 
following are probably related:

   - https://github.com/go-chi/chi/issues/641
   - https://github.com/go-chi/chi/issues/642
   - https://github.com/go-chi/chi/issues/832
   - https://github.com/gin-gonic/gin/issues/4033

Here is also a related reddit thread: 
https://www.reddit.com/r/golang/comments/1gbcgpx/confused_about_urlpath_urlrawpath_and/?rdt=64426

Note that I based my assumption of what is "valid" on my interpretation of 
RFC 3986 "Uniform Resource Identifier (URI): Generic Syntax" §3.3 
(https://www.rfc-editor.org/rfc/rfc3986#section-3.3), which is also 
mentioned in the source code of the package.

Please let me know your thoughts, whether I'm missing anything, or whether 
there are better alternatives.

Thank you.
